rdfjs / stream-spec

RDF/JS: Stream interfaces – A specification of a low level interface definition representing RDF data independent of a serialized format in a JavaScript environment.
https://rdf.js.org/stream-spec/
5 stars 2 forks source link

Inconsistency in Stream definition and how other interfaces use it #2

Closed rubensworks closed 5 years ago

rubensworks commented 5 years ago

Currently, the Sink interface has a single method: EventEmitter import(Stream stream); In the description, parsers are mention as a typical use case of this, as they can accept a Node.js stream.

However, I think the WebIDL definition is wrong, as it explicitly refers to the Stream interface as accepted argument. Stream is however defined to only work with Quad data elements. As such, I think this argument's type should be EventEmitter instead. So the signature would become: EventEmitter import(EventEmitter stream); (FYI, this problem becomes concrete when working with the TypeScript definitions.)

Alternatively, we could create a separate interface for parsers, and keep the current (strict) interface for serializers and stores.

bergos commented 5 years ago

Can we define the Stream interface more open? In Typical use cases we have Stream<Quad> for quad streams and Stream for any kind of stream. Or should we define QuadStream as a typed stream? The Stream interface also defines read() and some events. I don't think it makes sense to pass a plain EventEmitter to Sink.import, it will be always a Stream of any type.

rubensworks commented 5 years ago

Can we define the Stream interface more open?

Sure, I think that makes sense!

elf-pavlik commented 5 years ago

This came up again in https://github.com/rdfjs/stream-spec/pull/10#discussion_r263373003

@rubensworks @RubenVerborgh do you have some specific proposal which could manifest in PR?

elf-pavlik commented 5 years ago

from: https://github.com/rdfjs/stream-spec/issues/3#issue-408891796 by @RubenVerborgh

We currently introduce Stream with Quad read();, but further down we're talking about Stream<Quad> and Stream, the latter being a stream of strings.

rubensworks commented 5 years ago

I don't have a concrete suggestion for a PR at the moment. If no one else does, I can look at coming up with a proposal.

elf-pavlik commented 5 years ago

I think http://rdf.js.org/stream-spec/#typical-use-cases comes helpful to show that possibly only parsers and serializes need Stream.

Maybe we can just adjust definition of Stream#read to (string or Quad) read(); and adjust description of data event?

I think we should also for Source#match and Store#remove use Stream<Quad>. Then just for Source#import use (Stream or Stream).

Other direction seems like defining StringStream and QuadStream, but I don't see if it has any advantages over just using Stream<string> and Stream<Quad>.

BTW how would you define it for TypeScript?

bergos commented 5 years ago

👍 for adjusting it, but string should be any. Binary formats may use Buffer and it can be also used to handle JSON like plain JS objects (e.g. flat JSON-LD without the array around, just the objects for each quad).

elf-pavlik commented 5 years ago

@bergos it seems that you have pretty clear idea for how to adjust it, would you like to make PR?

bergos commented 5 years ago

I can do it, beginning of next week.