rdfjs / N3.js

Lightning fast, spec-compatible, streaming RDF for JavaScript
http://rdf.js.org/N3.js/

Perf/better quad ids #318

Closed jeswr closed 1 year ago

jeswr commented 1 year ago

This is a draft PR that branches off #311 and improves the indexing of quads in the store: the id of a quad is generated from the existing numeric ids of its terms, as mentioned in https://github.com/rdfjs/N3.js/pull/311#discussion_r1061131493.

Note that there have been some changes to #311 since this was first opened, so merge with care.
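As a rough illustration of the idea (not the code in this PR), the sketch below keys a quad by the numeric ids of its components instead of by its full serialization. `TermIds`, `idForKey`, and `idForTerm` are hypothetical names, terms are plain objects shaped like RDF/JS terms, and literal language tags and datatypes are ignored for brevity.

```javascript
// Hypothetical sketch; TermIds and its methods are illustrative names,
// not N3.js's actual internals.
class TermIds {
  constructor() {
    this.nextId = 0;
    this.ids = new Map(); // string key -> numeric id
  }

  // Return the numeric id for a key, allocating one on first use.
  idForKey(key) {
    let id = this.ids.get(key);
    if (id === undefined) {
      id = ++this.nextId;
      this.ids.set(key, id);
    }
    return id;
  }

  // Return the numeric id for a term. A quad is keyed by the numeric ids
  // of its components, so the key stays short however long the IRIs are.
  // Simplification: literals are keyed by value only.
  idForTerm(term) {
    if (term.termType === 'Quad') {
      const parts = [term.subject, term.predicate, term.object, term.graph]
        .map(t => this.idForTerm(t));
      // The leading '.' keeps quad keys distinct from plain term keys.
      return this.idForKey(`.${parts.join('.')}`);
    }
    return this.idForKey(`${term.termType}:${term.value}`);
  }
}

const ids = new TermIds();
const s = { termType: 'NamedNode', value: 'http://example.org/s' };
const p = { termType: 'NamedNode', value: 'http://example.org/p' };
const o = { termType: 'Literal', value: 'o' };
const g = { termType: 'DefaultGraph', value: '' };
const quoted = { termType: 'Quad', value: '', subject: s, predicate: p, object: o, graph: g };

// The same quad always maps to the same id.
console.log(ids.idForTerm(quoted), ids.idForTerm({ ...quoted })); // 5 5
```

The key for the quad here is `.1.2.3.4`, built from ids that were already allocated for its four components.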

jeswr commented 1 year ago

~Doesn't actually seem to have much of an impact~ This has a reasonable impact. With the script I just added, we get:

Main

N3Store performance test
- Adding 5153632 triples to the default graph: 4.302s
* Memory usage for triples: 363MB
- Finding all 5153632 triples in the default graph 484 times (0 variables): 12.121s
- Finding all 10648 triples in the default graph 968 times (1 variable subject): 2.264s
- Finding all 0 triples in the default graph 968 times (1 variable predicate): 0.613ms
- Finding all 22 triples in the default graph 1936 times (1 variable predicate): 1.256s
- Finding all 0 triples in the default graph 968 times (1 variable object): 1.331ms
- Finding all 22 triples in the default graph 1936 times (1 variable objects): 1.268s
- Finding all 484 triples in the default graph 484 times (2 variables): 608.382ms

Here

N3Store performance test
- Adding 5153632 triples to the default graph: 3.358s
* Memory usage for triples: 357MB
- Finding all 5153632 triples in the default graph 484 times (0 variables): 10.789s
- Finding all 10648 triples in the default graph 968 times (1 variable subject): 2.121s
- Finding all 0 triples in the default graph 968 times (1 variable predicate): 0.691ms
- Finding all 22 triples in the default graph 1936 times (1 variable predicate): 1.071s
- Finding all 0 triples in the default graph 968 times (1 variable object): 0.528ms
- Finding all 22 triples in the default graph 1936 times (1 variable objects): 1.066s
- Finding all 484 triples in the default graph 484 times (2 variables): 570.413ms

jeswr commented 1 year ago

With the existing N3Store-perf.js there is a slight performance hit, so it may be worth optimizing the function calls, or adding a parameter that selects how quad ids are generated.
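One possible shape for such a parameter is sketched below. This is only an illustration: `quadIdStrategy`, `createQuadKeyer`, and both strategies are hypothetical names and not an existing N3.js option.

```javascript
// Hypothetical sketch: `quadIdStrategy` is not an existing N3.js option.
const quadKeyStrategies = new Map([
  // Key the quad by a string built from its components' values.
  ['serialized', (quad) =>
    [quad.subject, quad.predicate, quad.object, quad.graph]
      .map(t => `${t.termType}:${t.value}`).join(' ')],
  // Key the quad by the numeric ids already assigned to its components.
  ['numeric', (quad, idOf) =>
    [quad.subject, quad.predicate, quad.object, quad.graph]
      .map(t => idOf(t)).join('.')],
]);

// Pick the key function once, so no strategy check runs per quad.
function createQuadKeyer({ quadIdStrategy = 'numeric' } = {}) {
  const makeKey = quadKeyStrategies.get(quadIdStrategy);
  if (!makeKey) throw new Error(`Unknown quadIdStrategy: ${quadIdStrategy}`);
  return makeKey;
}

// Minimal id allocator standing in for the store's term-to-id lookup.
const termIds = new Map();
const idOf = (term) => {
  const key = `${term.termType}:${term.value}`;
  if (!termIds.has(key)) termIds.set(key, termIds.size + 1);
  return termIds.get(key);
};

const exampleQuad = {
  subject: { termType: 'NamedNode', value: 'http://example.org/s' },
  predicate: { termType: 'NamedNode', value: 'http://example.org/p' },
  object: { termType: 'Literal', value: 'o' },
  graph: { termType: 'DefaultGraph', value: '' },
};

console.log(createQuadKeyer()(exampleQuad, idOf)); // 1.2.3.4
console.log(createQuadKeyer({ quadIdStrategy: 'serialized' })(exampleQuad, idOf));
```

Resolving the strategy when the keyer is created keeps the per-quad path to a single function call, which matters if function-call overhead is what causes the slowdown.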

Main

$ node perf/N3Store-perf.js 128
N3Store performance test
- Adding 2097152 triples to the default graph: 745.913ms
* Memory usage for triples: 150MB
- Finding all 2097152 triples in the default graph 16384 times (0 variables): 3.333s
- Finding all 2097152 triples in the default graph 32768 times (1 variable): 647.21ms
- Finding all 2097152 triples in the default graph 49152 times (2 variables): 582.631ms

- Adding 1048576 quads: 473.093ms
* Memory usage for quads: 124MB
- Finding all 1048576 quads 131072 times: 448.73ms
N3 Store tests for sparsely connected entities
- Adding 1048576 with all different IRIs: 3.513s
* Retrieving all 1048576 quads: 611.111ms
* Retrieving single by subject: 1.535s
* Retrieving single by predicate: 1.523s
* Retrieving single by object: 1.904s
* Retrieving single by subject-predicate: 2.210s
* Retrieving single by subject-object: 2.043s
* Retrieving single by predicate-object: 2.087s
* Retrieving single by subject-predicate-object: 1.721s

Here

$ node perf/N3Store-perf.js 128
N3Store performance test
- Adding 2097152 triples to the default graph: 768.303ms
* Memory usage for triples: 150MB
- Finding all 2097152 triples in the default graph 16384 times (0 variables): 3.478s
- Finding all 2097152 triples in the default graph 32768 times (1 variable): 725.156ms
- Finding all 2097152 triples in the default graph 49152 times (2 variables): 689.219ms

- Adding 1048576 quads: 470.674ms
* Memory usage for quads: 124MB
- Finding all 1048576 quads 131072 times: 506.057ms
N3 Store tests for sparsely connected entities
- Adding 1048576 with all different IRIs: 3.514s
* Retrieving all 1048576 quads: 627.364ms
* Retrieving single by subject: 1.592s
* Retrieving single by predicate: 1.591s
* Retrieving single by object: 1.569s
* Retrieving single by subject-predicate: 1.806s
* Retrieving single by subject-object: 1.782s
* Retrieving single by predicate-object: 1.817s
* Retrieving single by subject-predicate-object: 1.771s

jeswr commented 1 year ago

Results for N3StoreStarViews-perf.js:

Main

N3Store performance test
- Adding 1073741824 triples to the default graph: 2.686s
* Memory usage for triples: 584MB
- Finding all 1073741824 triples in the default graph 4096 times (0 variables): 5.680s
- Finding all 262144 triples in the default graph 8192 times (1 variable subject): 1.876s
- Finding all 0 triples in the default graph 8192 times (1 variable predicate): 1.118ms
- Finding all 3 triples in the default graph 786432 times (1 variable predicate): 2.395s
- Finding all 0 triples in the default graph 8192 times (1 variable object): 1.386ms
- Finding all 3 triples in the default graph 786432 times (1 variable objects): 2.384s
- Finding all 9 triples in the default graph 262144 times (2 variables): 1.088s

Here

N3Store performance test
- Adding 1073741824 triples to the default graph: 2.370s
* Memory usage for triples: 554MB
- Finding all 1073741824 triples in the default graph 4096 times (0 variables): 5.102s
- Finding all 262144 triples in the default graph 8192 times (1 variable subject): 1.803s
- Finding all 0 triples in the default graph 8192 times (1 variable predicate): 8.72ms
- Finding all 3 triples in the default graph 786432 times (1 variable predicate): 2.378s
- Finding all 0 triples in the default graph 8192 times (1 variable object): 5.433ms
- Finding all 3 triples in the default graph 786432 times (1 variable objects): 2.354s
- Finding all 9 triples in the default graph 262144 times (2 variables): 1.032s