google / sling

SLING - A natural language frame semantics parser
Apache License 2.0

Fast snapshot GC #405

Closed by ringgaard 5 years ago

ringgaard commented 5 years ago

The frame store snapshots made it fast to load large frame stores, but freezing was still slow because of the GC of the large global store. I have changed the snapshot format so that a snapshot can now be loaded directly into "frozen" heaps. These heaps are read-only and all their objects have the mark bit set, so GC now takes less than 0.5 seconds.
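As a rough illustration of why pre-marked frozen heaps make collection cheap, here is a minimal mark-and-sweep sketch. It is not the code from this change: `Object`, `Heap`, `LoadFrozen`, `Mark`, and `Sweep` are hypothetical names, and the real frame store layout is more involved. The point it shows is that the mark phase stops at any object whose mark bit is already set, and the sweep phase skips frozen heaps entirely, so the cost depends only on the mutable objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model; not SLING's actual classes.
struct Object {
  bool marked = false;         // mark bit used by the collector
  std::vector<Object *> refs;  // outgoing references
};

struct Heap {
  bool frozen = false;  // frozen heaps are read-only and never swept
  std::vector<Object> objects;
};

// Simulates loading a snapshot into a frozen heap: every object is pre-marked.
Heap LoadFrozen(std::size_t count) {
  Heap heap;
  heap.frozen = true;
  heap.objects.resize(count);
  for (Object &object : heap.objects) object.marked = true;
  return heap;
}

// Mark phase. Traversal stops at objects that are already marked, so objects
// in frozen heaps are never visited. Returns the number of newly marked objects.
std::size_t Mark(const std::vector<Object *> &roots) {
  std::size_t visited = 0;
  std::vector<Object *> stack(roots.begin(), roots.end());
  while (!stack.empty()) {
    Object *object = stack.back();
    stack.pop_back();
    assert(object != nullptr);
    if (object->marked) continue;
    object->marked = true;
    ++visited;
    for (Object *ref : object->refs) stack.push_back(ref);
  }
  return visited;
}

// Sweep phase. Only mutable heaps are scanned; mark bits in frozen heaps stay
// set. Returns the number of unreachable objects found.
std::size_t Sweep(const std::vector<Heap *> &heaps) {
  std::size_t dead = 0;
  for (Heap *heap : heaps) {
    if (heap->frozen) continue;
    for (Object &object : heap->objects) {
      if (object.marked) {
        object.marked = false;  // reset for the next collection
      } else {
        ++dead;  // a real collector would reclaim the object here
      }
    }
  }
  return dead;
}
```

In this sketch, a collection over a frozen heap with any number of objects plus a handful of mutable objects only touches the mutable ones, which matches the behavior described above.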

I had to change the object layout in the frame store. One benefit of this is that a store can now hold twice as many objects (1024M instead of 512M).
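The doubling corresponds to one extra bit of object addressing: 512M is 2^29 and 1024M is 2^30. The sketch below only checks that arithmetic. The 32-bit handle width and the counts of reserved bits are assumptions chosen to reproduce the two figures, not a description of the actual layout, and `MaxObjects` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of distinct objects addressable when
// `reserved_bits` of a `handle_bits`-wide handle are used for other purposes.
constexpr std::uint64_t MaxObjects(unsigned handle_bits,
                                   unsigned reserved_bits) {
  return std::uint64_t{1} << (handle_bits - reserved_bits);
}

constexpr std::uint64_t kM = std::uint64_t{1} << 20;  // "M" as 2^20

// Assumed widths, for illustration only.
static_assert(MaxObjects(32, 3) == 512 * kM, "2^29 objects");
static_assert(MaxObjects(32, 2) == 1024 * kM, "2^30 objects");
```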

I have also implemented support for reading frames in RDF Turtle format, which we will need for Wikidata lexicographic data.

ringgaard commented 5 years ago

Thanks for the review. I have also added a better solution for the token type enum.