rdfjs / N3.js

Lightning fast, spec-compatible, streaming RDF for JavaScript
http://rdf.js.org/N3.js/

Feat: Extract `EntityIndex` class #289

Open jeswr opened 2 years ago

jeswr commented 2 years ago

It would look something like this:

import { default as N3DataFactory, termToId, termFromId } from './N3DataFactory';

type Term = any;

interface EntityIndex<ID, T extends Term = Term> {
  termToId(term: Term): ID | undefined;
  termToIdSafe(term: Term): ID;
  idToTerm(id: ID): T | undefined;
}

class N3EntityIndex implements EntityIndex<number> {
  // Numeric ID -> string form of the term
  private ids: Record<number, string> = {};
  // String form of the term -> numeric ID
  private entities: Record<string, number> = {};
  private id = 0;
  private factory = N3DataFactory;

  // Returns the ID of an already indexed term, or undefined if it is unknown
  termToId(term: Term): number | undefined {
    return this.entities[termToId(term)];
  }

  // Returns the ID of the term, allocating a new ID if it is not indexed yet
  termToIdSafe(term: Term): number {
    const key = termToId(term);
    return this.entities[key] || (this.entities[this.ids[++this.id] = key] = this.id);
  }

  // Returns the term for an ID, or undefined if the ID was never allocated
  idToTerm(id: number): Term | undefined {
    const key = this.ids[id];
    return key === undefined ? undefined : termFromId(key, this.factory);
  }
}

The point of this is to enable multiple sources to share the same EntityIndex for memory/performance reasons.
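
To make the sharing concrete, here is a minimal, self-contained sketch. `SharedIndex` and `TinyStore` are hypothetical names that are not part of N3.js, and terms are reduced to plain strings rather than going through `termToId`/`termFromId`.

```typescript
// Hypothetical, simplified stand-in for the proposed EntityIndex;
// terms are plain strings here rather than RDF/JS terms.
class SharedIndex {
  private ids = new Map<number, string>();
  private entities = new Map<string, number>();
  private nextId = 0;

  // Returns the existing ID for `term`, allocating one if needed.
  termToIdSafe(term: string): number {
    let id = this.entities.get(term);
    if (id === undefined) {
      id = ++this.nextId;
      this.entities.set(term, id);
      this.ids.set(id, term);
    }
    return id;
  }

  idToTerm(id: number): string | undefined {
    return this.ids.get(id);
  }

  // Number of distinct terms indexed so far.
  get size(): number {
    return this.entities.size;
  }
}

// Hypothetical store that keeps triples as ID tuples in a caller-supplied index.
class TinyStore {
  private index: SharedIndex;
  private triples: Array<[number, number, number]> = [];

  constructor(index: SharedIndex) {
    this.index = index;
  }

  add(s: string, p: string, o: string): void {
    this.triples.push([
      this.index.termToIdSafe(s),
      this.index.termToIdSafe(p),
      this.index.termToIdSafe(o),
    ]);
  }

  // Raw ID tuples; comparable across stores that share the same index.
  get idTriples(): ReadonlyArray<readonly [number, number, number]> {
    return this.triples;
  }
}

const index = new SharedIndex();
const storeA = new TinyStore(index);
const storeB = new TinyStore(index);

storeA.add('ex:alice', 'ex:knows', 'ex:bob');
storeB.add('ex:bob', 'ex:knows', 'ex:alice');
```

Both stores hold six term references between them, but the index only contains three entries, and an ID obtained from one store can be compared directly with an ID from the other.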

Pinging @jacoscaz who may have already done something similar with scoping in quadstore.

rubensworks commented 2 years ago

A big 👍 from me on this. Would be crucial for things like https://github.com/comunica/comunica/issues/873.

An alternative approach could be to store this id value within the RDF/JS Term, but not sure which approach would be more convenient.

jeswr commented 2 years ago

> An alternative approach could be to store this id value within the RDF/JS Term

My main concern with this approach would be having to handle conflicting IDs if there are multiple sources.
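
A rough sketch of the kind of conflict I mean, assuming each source numbers terms on its own and writes the number onto the term. `TermWithId` and `CountingSource` are made-up names for illustration, not RDF/JS or N3.js interfaces.

```typescript
// Hypothetical term shape carrying an embedded ID; not an RDF/JS interface.
interface TermWithId {
  value: string;
  id?: number;
}

// Hypothetical source that numbers the terms it sees independently, starting at 1.
class CountingSource {
  private nextId = 0;
  private known = new Map<string, number>();

  // Returns a term tagged with this source's local ID for `value`.
  tag(value: string): TermWithId {
    let id = this.known.get(value);
    if (id === undefined) {
      id = ++this.nextId;
      this.known.set(value, id);
    }
    return { value, id };
  }
}

const sourceA = new CountingSource();
const sourceB = new CountingSource();

const alice = sourceA.tag('ex:alice'); // first term seen by source A
const bob = sourceB.tag('ex:bob');     // first term seen by source B

// Different terms end up with the same embedded ID,
// so the ID alone cannot be trusted across sources.
const idsCollide = alice.id === bob.id && alice.value !== bob.value;
```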

rubensworks commented 2 years ago

> My main concern with this approach would be having to handle conflicting IDs if there are multiple sources.

Indeed, we'd need a way of detecting distinct sources if we go this route (not saying we should). I did something similar here: https://rdfostrich.github.io/article-jws2018-ostrich/#dictionary
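
One possible way to detect distinct sources, purely as an illustration: pair every embedded ID with a token identifying the index that issued it, and ignore IDs that carry a foreign token. All names below (`ScopedId`, `ScopedTerm`, `ScopedIndex`) are hypothetical, and this is not how N3.js works today.

```typescript
// Hypothetical: an ID is only meaningful together with the index that issued it.
interface ScopedId {
  scope: symbol;
  id: number;
}

interface ScopedTerm {
  value: string;
  scoped?: ScopedId;
}

class ScopedIndex {
  private scope: symbol;
  private entities = new Map<string, number>();
  private nextId = 0;

  constructor(name: string) {
    this.scope = Symbol(name);
  }

  // Reuses the embedded ID only if this index issued it;
  // otherwise looks the term up by value and re-tags it.
  resolve(term: ScopedTerm): number {
    if (term.scoped !== undefined && term.scoped.scope === this.scope) {
      return term.scoped.id;
    }
    let id = this.entities.get(term.value);
    if (id === undefined) {
      id = ++this.nextId;
      this.entities.set(term.value, id);
    }
    // Design choice of this sketch: the most recent index overwrites the tag.
    term.scoped = { scope: this.scope, id };
    return id;
  }
}

const indexA = new ScopedIndex('a');
const indexB = new ScopedIndex('b');
indexB.resolve({ value: 'ex:other' }); // index B has already seen another term

const term: ScopedTerm = { value: 'ex:bob' };
const idInA = indexA.resolve(term); // tagged by index A
const idInB = indexB.resolve(term); // foreign tag is ignored, so B assigns its own ID
```

The cost is an extra comparison per lookup, plus a fallback lookup by value whenever a term crosses from one source to another.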