Union Types are Hard to Use

thatcort commented 4 years ago

Apologies if this issue is due to not understanding how the library works, but I'm having trouble understanding how the type system works:

The method PlanBuilder.build has a return type of PipelineStage<QueryOutput> | Consumable. This means that calling code can't know in advance whether the returned object is subscribable or wraps a promise. Similarly, QueryOutput is defined as the union Bindings | Algebra.TripleObject | boolean, so again calling code can't know what to expect.

This results in awkward code that needs to test the return types before using it and prevents the IDE from suggesting completions:

import { Graph, Dataset, HashMapDataset, PlanBuilder, PipelineStage, Bindings } from 'sparql-engine';
import { Consumable } from 'sparql-engine/dist/operators/update/consumer';
import { QueryOutput } from 'sparql-engine/dist/engine/plan-builder';
import { Algebra } from 'sparqljs';

// builder is assumed to be a PlanBuilder instance created elsewhere
const query = `
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  INSERT DATA { <http://example/book1>  dc:title  "Fundamentals of Compiler Design" }`;
const output = builder.build(query);
if ('subscribe' in output) {
  (output as PipelineStage<QueryOutput>).subscribe((value: QueryOutput) => {
    if (value instanceof Bindings) {
      value.forEach((variable, val) => console.log(`${variable}: ${val}`));
    } else if (typeof value !== 'boolean') {
      const t = value as Algebra.TripleObject;
      console.log(`Triple: ${t.subject} ${t.predicate} ${t.object}`);
    } else { // boolean result
      console.log(`Boolean result: ${value}`);
    }
  }, console.error, () => {});
} else {
  (output as Consumable).execute().then(() => {
    console.log('Query completed. No result returned');
  });
}

Also, many of those imports aren't exported by sparql-engine, so they have to be imported from specific files in the dist directory. Is this intentional?

Finally, I'm wondering why the RxJS interfaces aren't used directly? E.g. PipelineOutput could implement Subscribable or just be an RxJS Observable. Then it would be pipeable to other RxJS operators.

Callidon commented 4 years ago

Hi

I agree that the union type PipelineStage<QueryOutput> | Consumable is not very user-friendly at first glance. However, it makes sense when you consider the expected evaluation results for all the different types of SPARQL queries:

SELECT queries only produce solution bindings (Bindings)
ASK queries only produce a boolean result.
DESCRIBE and CONSTRUCT queries only produce RDF triples.
SPARQL UPDATE queries (INSERT, DELETE, DELETE/INSERT) produce no results; they only need to be executed atomically, hence the use of a Consumable.

So, since you know which query you want to execute, you know in advance what the pipeline is going to produce, and you can pass this information to the TypeScript type system. For example, with a SELECT query, you can simply cast the output of builder.build(query) to PipelineStage<Bindings> as follows:

const query = `
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  SELECT * WHERE { ?book  dc:title  ?title }
`
const output = builder.build(query) as PipelineStage<Bindings>
output.subscribe((value: Bindings) => {
  console.log(value.toString())
}, err => {
  console.error(err)
}, () => {
  console.log('Query execution completed!')
})

Such code is type-safe since you know that, according to the SPARQL specification, the evaluation of a SELECT query only produces solution bindings. Hence, the PlanBuilder class will always return a PipelineStage<Bindings> to evaluate it. The same principle can be generalized to all the other types of SPARQL queries.
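To illustrate how the same cast carries over to other query forms, here is a minimal, self-contained sketch for ASK and CONSTRUCT. Every type and function in it (Triple, Bindings, PipelineStage, Consumable, stageOf, build) is a simplified stand-in written for this example and not sparql-engine's real definition; only the shape of the union return type is taken from the discussion above.

```typescript
// Stand-in types for illustration only; NOT sparql-engine's real definitions.
type Triple = { subject: string; predicate: string; object: string };
type Bindings = Map<string, string>;
type QueryOutput = Bindings | Triple | boolean;

interface PipelineStage<T> {
  subscribe(onData: (value: T) => void, onError: (err: unknown) => void, onEnd: () => void): void;
}
interface Consumable {
  execute(): Promise<void>;
}

// Hypothetical stage that emits a fixed list of values synchronously.
function stageOf<T>(values: T[]): PipelineStage<T> {
  return {
    subscribe(onData, _onError, onEnd) {
      values.forEach(v => onData(v));
      onEnd();
    },
  };
}

// Hypothetical builder with the same union return type as PlanBuilder.build.
function build(query: string): PipelineStage<QueryOutput> | Consumable {
  if (/\bASK\b/i.test(query)) {
    return stageOf<QueryOutput>([true]);
  }
  if (/\bCONSTRUCT\b/i.test(query)) {
    return stageOf<QueryOutput>([{ subject: 's', predicate: 'p', object: 'o' }]);
  }
  return { execute: () => Promise.resolve() };
}

// ASK: the caller knows the stage yields a single boolean.
const answers: boolean[] = [];
const ask = build('ASK { ?s ?p ?o }') as PipelineStage<boolean>;
ask.subscribe(v => answers.push(v), console.error, () => {});

// CONSTRUCT: the caller knows the stage yields RDF triples.
const triples: Triple[] = [];
const construct = build('CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }') as PipelineStage<Triple>;
construct.subscribe(t => triples.push(t), console.error, () => {});
```

Note that the compiler does not verify such a cast: it is only as reliable as the caller's knowledge of which query form the string contains.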

I hope it's more intelligible but if you have more questions, feel free to ask them! I'm always looking for feedback on this work!

PS: Sorry for the slow reply, I've just come back from the holidays.

Callidon commented 4 years ago

I will close this issue due to the lack of feedback.

thatcort commented 4 years ago

Hi, saw the notification about closing the issue. Thanks for the explanation. One thought I had was that you could parameterize the pipeline builder by query type, and that would specify the return type. Something like (pseudo code):

builder.build(...): Bindings
builder.build(...): boolean
etc.

Then the IDE would be able to disambiguate the return type.
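One way this suggestion could look in user code today is a thin wrapper with one method per query form, so the cast lives in a single place. The sketch below is only an assumption about how such a wrapper might be written: typed, UntypedBuild, fakeBuild and all the types are hypothetical stand-ins, not part of sparql-engine.

```typescript
// Stand-in types for illustration only; NOT sparql-engine's real definitions.
type Triple = { subject: string; predicate: string; object: string };
type Bindings = Map<string, string>;
type QueryOutput = Bindings | Triple | boolean;

interface PipelineStage<T> {
  subscribe(onData: (value: T) => void, onError: (err: unknown) => void, onEnd: () => void): void;
}
interface Consumable {
  execute(): Promise<void>;
}

// Shape of an untyped build function, mirroring the union discussed in this thread.
type UntypedBuild = (query: string) => PipelineStage<QueryOutput> | Consumable;

// Hypothetical wrapper: the method name states the query form, and the
// return type follows from it, so the IDE can offer completions.
function typed(build: UntypedBuild) {
  return {
    select: (query: string) => build(query) as PipelineStage<Bindings>,
    ask: (query: string) => build(query) as PipelineStage<boolean>,
    construct: (query: string) => build(query) as PipelineStage<Triple>,
    update: (query: string) => build(query) as Consumable,
  };
}

// Hypothetical untyped builder, used only to exercise the wrapper.
const askStage: PipelineStage<QueryOutput> = {
  subscribe(onData, _onError, onEnd) {
    onData(true);
    onEnd();
  },
};
const noop: Consumable = { execute: () => Promise.resolve() };
const fakeBuild: UntypedBuild = query => (query.startsWith('ASK') ? askStage : noop);

const builder = typed(fakeBuild);
const results: boolean[] = [];
// `v` is inferred as boolean here, with no cast at the call site.
builder.ask('ASK { ?s ?p ?o }').subscribe(v => results.push(v), console.error, () => {});
const update = builder.update('INSERT DATA { <a> <b> <c> }');
```

The wrapper does not check that the query string matches the method called, so calling select with an ASK query would still compile; it only moves the cast out of calling code.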

Callidon / sparql-engine

Union Types are Hard to Use #35