gchq / Gaffer

A large-scale entity and relation database supporting aggregation of properties
Apache License 2.0

Improve Kryo serialisation in spark-library #803

Closed (gaffer01 closed 7 years ago)

gaffer01 commented 7 years ago

Currently, the Registrator class in the spark library registers only Entity, Edge and Properties with Kryo. We should also register other classes, such as FreqMap, HyperLogLogPlus and the other sketches, and provide custom serialiser implementations for classes that may be difficult or expensive to serialise (e.g. HyperLogLogPlus).
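To give a feel for what a hand-written serialiser for something like FreqMap might do, here is a rough sketch that deliberately avoids any Kryo dependency. It assumes a frequency map is essentially a map from String keys to Long counts, and writes the entry count followed by each key/value pair. The class name `FreqMapCodecSketch` and its `write`/`read` methods are hypothetical; a real Kryo serialiser would put the same logic behind Kryo's own output and input streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a compact write/read pair for a String -> Long frequency map.
final class FreqMapCodecSketch {

    // Writes the number of entries, then each key (UTF) and count (long).
    // Assumes no null keys or null counts.
    static byte[] write(Map<String, Long> freqMap) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(freqMap.size());
            for (Map.Entry<String, Long> entry : freqMap.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeLong(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the entries back in the same order they were written.
    static Map<String, Long> read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int size = in.readInt();
            Map<String, Long> freqMap = new HashMap<>();
            for (int i = 0; i < size; i++) {
                String key = in.readUTF();
                long count = in.readLong();
                freqMap.put(key, count);
            }
            return freqMap;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> original = new HashMap<>();
        original.put("a", 3L);
        original.put("b", 7L);
        byte[] encoded = write(original);
        System.out.println("bytes=" + encoded.length + " roundTripOk=" + read(encoded).equals(original));
    }
}
```

The point of writing the size up front is that the reader never has to guess where the map ends, which is the usual shape for collection serialisers.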

We should also investigate registering other classes such as Schema, Graph, Store and StoreProperties. To serialise Schema, we need to use the de.javakaffee:kryo-serializers project and call UnmodifiableCollectionsSerializer.registerSerializers in the registerClasses method.
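As background on why a dedicated serialiser for unmodifiable collections can be needed at all, the small probe below uses only the JDK. It is a sketch, and it assumes (the issue does not say so explicitly) that the difficulty with Schema comes from fields holding unmodifiable collection wrappers. The class name `UnmodifiableClassProbe` is hypothetical. It shows that the runtime class of such a wrapper is a non-public nested class of `java.util.Collections` with no no-argument constructor, so it cannot be named directly or instantiated by default.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical probe: inspects the runtime class behind Collections.unmodifiableList.
final class UnmodifiableClassProbe {

    // Returns the concrete wrapper class the JDK uses for an unmodifiable list.
    static Class<?> wrapperClass() {
        List<String> wrapped = Collections.unmodifiableList(new ArrayList<String>());
        return wrapped.getClass();
    }

    // True if the class is declared public.
    static boolean isPublic(Class<?> type) {
        return Modifier.isPublic(type.getModifiers());
    }

    // True if the class declares a constructor that takes no arguments.
    static boolean hasNoArgConstructor(Class<?> type) {
        for (Constructor<?> constructor : type.getDeclaredConstructors()) {
            if (constructor.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Class<?> type = wrapperClass();
        System.out.println("runtime class: " + type.getName());
        System.out.println("public: " + isPublic(type));
        System.out.println("no-arg constructor: " + hasNoArgConstructor(type));
    }
}
```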

gaffer01 commented 7 years ago

Narrowing this issue to cover only FreqMap and HyperLogLogPlus. Kryo serialisation for some of the sketches from the DataSketches library needs more work: classes like DoublesUnion are abstract, and their concrete implementations are not publicly accessible from outside the project. A separate issue will be raised for this.
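The problem with abstract types that have hidden implementations can be illustrated with a toy example. This is only a sketch: `SketchUnion`, `HeapSketchUnion` and the `REGISTRY` map are hypothetical stand-ins and are not the DataSketches or Kryo APIs. It assumes a registry that looks serialisers up by the exact runtime class, which is how explicit class registration generally behaves. Registering the abstract type then does not cover the instances that callers actually hold.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: an abstract type whose only implementation is hidden.
final class HiddenImplementationSketch {

    // Stand-in for an abstract sketch type such as a union.
    abstract static class SketchUnion {
        abstract void update(double value);

        abstract long count();

        // Callers can only obtain instances through this factory.
        static SketchUnion create() {
            return new HeapSketchUnion();
        }
    }

    // Concrete implementation; private, so code outside cannot name it.
    private static final class HeapSketchUnion extends SketchUnion {
        private long n;

        @Override
        void update(double value) {
            n++;
        }

        @Override
        long count() {
            return n;
        }
    }

    // Toy registry keyed by exact class (assumption: mimics exact-class registration).
    static final Map<Class<?>, String> REGISTRY = new HashMap<>();

    // Looks up a serialiser name by the value's runtime class; null if none registered.
    static String lookup(Object value) {
        return REGISTRY.get(value.getClass());
    }

    public static void main(String[] args) {
        REGISTRY.put(SketchUnion.class, "unionSerialiser");
        SketchUnion union = SketchUnion.create();
        System.out.println("runtime class: " + union.getClass().getName());
        System.out.println("serialiser found: " + (lookup(union) != null));
    }
}
```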

A separate issue will also be raised for Kryo serialisation of Schema and other properties.

p013570 commented 7 years ago

Merged into develop.