bergos closed this issue 8 years ago.
I know that @mcollina prefers streams and callbacks while @nathan7 has tons of experience with wrapping streams and callbacks with promises. Could you guys imagine helping us with reviewing this proposal? :bow:
@elf-pavlik It's not exactly clear what the API of AbstractParser is. Can you please fill in the READMEs/docs of those, so I can understand better? Moreover, what is the question?
Also, this is wrong: https://github.com/rdf-ext/rdf-parser-abstract/blob/master/index.js#L55-L81. The `data` event should not be used, and definitely not accumulated in that way. Use concat-stream, or bl.
@mcollina Thanks for your feedback. I already changed the code to use concat-stream, as you proposed.
The described API will be used in more parser implementations, so we want to be sure it's designed the right way. I tried to offer all the common ways to handle async calls and streaming data. The README now also contains some more details. What do you think about it?
The method implementations in AbstractParser are just fallbacks for parsers that don't support streaming. So just ignore that it's not nice to concat all the data of the stream...
I would move the `base`, `filter` and `graph` properties to the constructor. These are not likely to change in between parsing calls, and they might be changed via properties anyway.
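For illustration, the suggestion would look roughly like this (a sketch; the option names mirror the discussion, the class body is a placeholder):

```javascript
// Sketch: base, filter and graph passed once to the constructor instead
// of to every parse call. Defaults are illustrative, not the rdf-ext API.
class Parser {
  constructor (options = {}) {
    this.base = options.base || '';          // base IRI for relative IRIs
    this.filter = options.filter || (() => true); // triple filter predicate
    this.graph = options.graph || null;      // target graph to merge into
  }
}

const parser = new Parser({ base: 'http://example.org/' });
```

They can still be reassigned as plain properties between calls when needed.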
Regarding streaming, the big issue I am seeing is that this is not a "streaming parser", meaning you need to wait for the whole document to be read before importing it. This is an important use case you should support for large datasets.
See @RubenVerborgh https://github.com/RubenVerborgh/N3.js for an example of a streaming parser.
I think `base` will change nearly every time the parser is called. The same `graph` object will only be used to merge multiple sources into one graph, so I expect that property will also change most of the time. I expect `filter` will be used only in cases with a lot of triples, but also in that case the filter could change between the calls to the parse methods.
I know that it's not a real streaming parser at the moment, but in the future parser implementations should replace the methods of AbstractParser with code that does real streaming.
Is the combination of callbacks and Promises OK from your point of view?
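For clarity, the pattern in question looks roughly like this (a sketch; the parse body is a trivial placeholder, not the actual implementation):

```javascript
// Sketch of the callback-plus-Promise combination: the method always
// returns a Promise, and additionally invokes the optional Node-style
// callback when one is given.
function parse (toparse, callback) {
  const promise = new Promise((resolve, reject) => {
    // hypothetical async parse, replaced here by a trivial echo
    setImmediate(() => {
      if (typeof toparse === 'string') resolve({ input: toparse });
      else reject(new Error('expected a string'));
    });
  });
  if (typeof callback === 'function') {
    promise.then(result => callback(null, result), err => callback(err));
  }
  return promise;
}
```

Callers can then pick either style: `parse(data, done)` or `parse(data).then(...)`.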
Must be defined in the RDFJS Representation Task Force spec.
The JSON-LD parser module, based on AbstractParser, implements a new API. If we can agree on the new API, I will create modules for all the other parsers currently bundled with RDF-Ext.
```
Promise process (any toparse, optional ProcessorCallback callback, optional DOMString base, optional TripleFilter filter, optional SuccessCallback done)
```

`.process` now returns a `Promise` object. For asynchronous parsers the boolean value was useless anyway. Both `done` and `Promise.resolve` are called. `done` can be `null`. `toparse` can be a `Stream`; the parser detects if the `toparse` value is a `Stream` object. The rest works as defined in the RDF-Interfaces/RDF-Ext spec.

```
Promise parse (any toparse, optional ParserCallback? callback, optional DOMString base, optional TripleFilter filter, optional Graph graph)
```
`.parse` also now returns a `Promise` object, like `.process`. Both `callback` and `Promise.resolve` are called. `callback` can be `null`. `toparse` can be a `Stream`; the parser detects if the `toparse` value is a `Stream` object. The rest works as defined in the RDF-Interfaces/RDF-Ext spec.

```
ReadableStream stream (Stream inputStream, optional DOMString base, optional TripleFilter filter)
```
The `.stream` method is new. The `data`, `end` and `error` events are used. N3.js also uses a `prefix` event; this is currently not implemented. Do we need this? Everything else works as usual.