takawitter / trie4j

PATRICIA, Double Array, LOUDS Trie implementations for Java
Apache License 2.0

implements Serializable #4

Closed fedorn closed 10 years ago

fedorn commented 10 years ago

Why don't your trie classes implement Serializable? This would be very helpful.

takawitter commented 10 years ago

Right, but we have to consider a more optimized way of storing and loading the data. So I will add Serializable to Trie, and Externalizable to the tries that already have load/save methods.
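To illustrate the difference mentioned above: with Externalizable the class writes and reads its own fields, which allows a more compact format than default serialization. The following is a minimal sketch using only the JDK. `IntArrayHolder`, `toBytes`, and `fromBytes` are hypothetical names standing in for an array-backed trie and its helpers; none of them are trie4j classes or methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.Arrays;

class ExternalizableSketch {
    /** Hypothetical stand-in for an array-backed trie; not a trie4j class. */
    public static class IntArrayHolder implements Externalizable {
        private int[] values = new int[0];

        // Externalizable requires a public no-arg constructor for deserialization.
        public IntArrayHolder() {}

        public IntArrayHolder(int[] values) {
            this.values = values.clone();
        }

        public int[] values() {
            return values.clone();
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            // Hand-written format: the length, then each element.
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // Must read fields in exactly the order writeExternal wrote them.
            int n = in.readInt();
            values = new int[n];
            for (int i = 0; i < n; i++) {
                values[i] = in.readInt();
            }
        }
    }

    /** Serializes an object to a byte array via ObjectOutputStream. */
    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    /** Deserializes an object from a byte array via ObjectInputStream. */
    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        IntArrayHolder original = new IntArrayHolder(new int[] {1, 2, 3});
        IntArrayHolder copy = (IntArrayHolder) fromBytes(toBytes(original));
        System.out.println(Arrays.toString(copy.values())); // prints [1, 2, 3]
    }
}
```

Callers use the same ObjectOutputStream/ObjectInputStream calls either way; only the class decides whether the default mechanism or the hand-written format is used.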

takawitter commented 10 years ago

I have started implementing this. https://github.com/takawitter/trie4j/commit/313e549568b4cf0464328486fe52cad81501207b LOUDSTrie should also implement Externalizable.

fedorn commented 10 years ago

It would be great if you also added this to MapDoubleArray.

takawitter commented 10 years ago

I'm working on the branch https://github.com/takawitter/trie4j/tree/add-externalizable . Because many classes must be Serializable or Externalizable to make the tries serializable, I decided to rebuild the save/load functionality on top of serialization. The branch will be merged into master after testing.

takawitter commented 10 years ago

Now you can serialize all tries. Here is a list of binary size, save time, and load time for each. ("verified" means that all words in the Wikipedia titles were successfully verified.)

PatriciaTrie, size: 52429170, write(ms): 1837, read(ms): 2485, verified.
MapPatriciaTrie, size: 68380181, write(ms): 3314, read(ms): 1110, verified.
TailPatriciaTrie(ConcatTailBuilder), size: 51774677, write(ms): 1894, read(ms): 1723, verified.
TailPatriciaTrie(ConcatTailBuilder), size: 50649861, write(ms): 2159, read(ms): 2919, verified.
TailPatriciaTrie(SuffixTrieTailBuilder), size: 59124031, write(ms): 2672, read(ms): 3723, verified.
MapTailPatriciaTrie(ConcatTailBuilder), size: 76396291, write(ms): 3339, read(ms): 5338, verified.
MapTailPatriciaTrie(SuffixTrieTailBuilder), size: 83745645, write(ms): 3892, read(ms): 7550, verified.
DoubleArray, size: 53820980, write(ms): 149, read(ms): 115, verified.
MapDoubleArray, size: 68381799, write(ms): 970, read(ms): 863, verified.
TailDoubleArray(ConcatTailBuilder), size: 37594034, write(ms): 88, read(ms): 104, verified.
TailDoubleArray(SuffixTrieTailBuilder), size: 30881077, write(ms): 66, read(ms): 69, verified.
MapTailDoubleArray(ConcatTailBuilder), size: 52154857, write(ms): 722, read(ms): 380, verified.
MapTailDoubleArray(SuffixTrieTailBuilder), size: 45441900, write(ms): 741, read(ms): 329, verified.
TailLOUDSTrie(ConcatTailArray), size: 22469931, write(ms): 62, read(ms): 68, verified.
TailLOUDSTrie(SBVConcatTailArray), size: 17287532, write(ms): 31, read(ms): 36, verified.
TailLOUDSTrie(SuffixTrieTailArray), size: 15727477, write(ms): 26, read(ms): 29, verified.
MapTailLOUDSTrie(ConcatTailArray), size: 37030746, write(ms): 764, read(ms): 251, verified.
MapTailLOUDSTrie(SBVConcatTailArray), size: 31848347, write(ms): 638, read(ms): 295, verified.
MapTailLOUDSTrie(SuffixTrieTailArray), size: 30288292, write(ms): 776, read(ms): 231, verified.
TailLOUDSPPTrie(ConcatTailArray), size: 22361043, write(ms): 45, read(ms): 60, verified.
TailLOUDSPPTrie(SBVConcatTailArray), size: 17178644, write(ms): 31, read(ms): 34, verified.
TailLOUDSPPTrie(SuffixTrieTailArray), size: 15618589, write(ms): 27, read(ms): 27, verified.
MapTailLOUDSPPTrie(ConcatTailArray), size: 36921858, write(ms): 693, read(ms): 330, verified.
MapTailLOUDSPPTrie(SBVConcatTailArray), size: 31739459, write(ms): 742, read(ms): 292, verified.
MapTailLOUDSPPTrie(SuffixTrieTailArray), size: 30179404, write(ms): 754, read(ms): 225, verified.

Map tries contain an Integer value for each of the 1456061 entries.
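For readers who want to take similar measurements, here is a rough sketch of how size, write time, and read time can be collected for any Serializable object. It is not the benchmark code behind the numbers above. `SerializationBenchmarkSketch`, `Result`, and `measure` are hypothetical names, a `TreeMap<String, Integer>` stands in for a map trie, and "verified" here only means that the deserialized copy equals the original.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.TreeMap;

class SerializationBenchmarkSketch {
    /** Outcome of one in-memory round trip. */
    static final class Result {
        final int sizeBytes;
        final long writeMs;
        final long readMs;
        final boolean verified;

        Result(int sizeBytes, long writeMs, long readMs, boolean verified) {
            this.sizeBytes = sizeBytes;
            this.writeMs = writeMs;
            this.readMs = readMs;
            this.verified = verified;
        }
    }

    /** Serializes obj to memory, reads it back, and reports size and timings. */
    static Result measure(Serializable obj) throws IOException, ClassNotFoundException {
        long writeStart = System.nanoTime();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        long writeMs = (System.nanoTime() - writeStart) / 1_000_000;

        byte[] data = bos.toByteArray();

        long readStart = System.nanoTime();
        Object copy;
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            copy = ois.readObject();
        }
        long readMs = (System.nanoTime() - readStart) / 1_000_000;

        // Simplified check: relies on equals(); a trie benchmark would look up every key.
        return new Result(data.length, writeMs, readMs, obj.equals(copy));
    }

    public static void main(String[] args) throws Exception {
        // Stand-in data; the thread above used Wikipedia titles with Integer values.
        TreeMap<String, Integer> map = new TreeMap<>();
        String[] words = {"trie", "patricia", "double array", "louds"};
        for (int i = 0; i < words.length; i++) {
            map.put(words[i], i);
        }
        Result r = measure(map);
        System.out.println("TreeMap, size: " + r.sizeBytes
                + ", write(ms): " + r.writeMs
                + ", read(ms): " + r.readMs
                + (r.verified ? ", verified." : ", NOT verified."));
    }
}
```

Timings from a single in-memory run like this are noisy. Repeating the measurement after a warm-up, or writing to a real file, would give numbers closer to what an application sees.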