Closed jcoyne closed 9 years ago
It's certainly slower than N-Triples, as the LL(1) nature of the parser introduces much more overhead. A hand-built parser would certainly be faster, but the consensus was that sticking to LL(1) was worth it.
If you have some thoughts on specific speed-ups, let me know. A custom parser isn't out of the question, but would be a fair amount of work. Note that there is a Freebase-style reader which is quite fast, but uses a constrained syntax.
FYI, I did a PEG parser (using Treetop) for N3 some time ago. It performs reasonably well, although the details of working with that parser aren't ideal IIRC. The main problem is that it works by constructing a complete parse tree and only handing it over once parsing has finished; the tree is then navigated to generate triples. This only works for input files of relatively bounded size. The LL(1) parser will handle input of arbitrary size, albeit at a slower parse rate.
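To illustrate the trade-off between building everything first and emitting triples as you go, here is a minimal sketch in plain Ruby. It is not how the Treetop or LL(1) parsers are implemented; it assumes a hypothetical constrained syntax of one `<s> <p> <o> .` triple per line, and the names `TRIPLE`, `parse_all`, and `each_triple` are made up for this example.

```ruby
require 'stringio'

# Hypothetical constrained syntax: exactly one "<s> <p> <o> ." triple per line.
TRIPLE = /\A\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*\z/

# Batch style: consume the whole input and build the complete result
# before returning anything. Memory use grows with the input size.
def parse_all(io)
  io.each_line.map { |line| TRIPLE.match(line)&.captures }.compact
end

# Streaming style: yield each triple as soon as its line has been read.
# Only the current line is held in memory, so input size is unbounded.
def each_triple(io)
  return enum_for(:each_triple, io) unless block_given?

  io.each_line do |line|
    m = TRIPLE.match(line)
    yield m.captures if m # lines that do not match are skipped
  end
end

input = "<a> <b> <c> .\n<d> <e> <f> .\n"

all_triples  = parse_all(StringIO.new(input))          # whole result at once
first_triple = each_triple(StringIO.new(input)).first  # stops after one line
```

The batch version has to finish before the caller sees any triple, while the streaming version can hand triples to the caller one at a time and stop early.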
If someone were interested in creating a more optimal parser, it would be reasonable to use it as the default, and provide an option for running the LL(1) parser if necessary.
So, this is on my radar now.
The updated parser in the new-parser branch is about 3x faster when running examples/sp2b.ttl (~50K triples). Let me know what you think. I'm not sure how to make it much faster without using a limited syntax, similar to the Freebase parser.
Before it can be merged to develop, the TriG parser will need to be updated too.
Released in Release 1.1.7.
Huge improvement! :clap: :rocket:
where out.ttl is: https://gist.github.com/jcoyne/5d153463e1379ac2324b