phaistos-networks / Trinity

Trinity IR Infrastructure
Apache License 2.0

Replication of index data using RocksDB as a backend #12

Open yashhema opened 6 years ago

yashhema commented 6 years ago

Hello, I would like to replicate my data (one master, multiple slaves). From what I understand, Trinity, like Lucene, also creates segments (which contain the raw data plus other related index information). Can I store these segments in RocksDB and use something like https://github.com/pinterest/rocksplicator for replication? Please tell me your views on this approach, or whether you have a better suggestion. We will be creating different indexes for different sources (each will be stored as a separate database in RocksDB), and for some indexes we don't require ranking at all. Is there a setting we can use to make processing faster by turning ranking off?

markpapadakis commented 6 years ago

Hi @yashhema,

You can implement any storage and access scheme (including the one you describe here) by creating your own IndexSource. See index_source.h for comments on how that works; essentially, you just subclass that class to build whatever you need. It could, for example, cache segments (or parts of segments) in RAM and, when needed, fetch them from a RocksDB instance or from elsewhere. The comments in that file should help you understand what it will take.
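To illustrate the "cache in RAM, fetch from a store on a miss" idea, here is a minimal C++17 sketch. It is not Trinity's IndexSource interface: `CachingSegmentStore` and `SegmentFetcher` are hypothetical names invented for this example, and the fetcher is a plain callback standing in for whatever persistent store is used. A real implementation would follow the contract documented in index_source.h.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a persistent store such as RocksDB: given a
// segment name, return its bytes, or std::nullopt if the segment is unknown.
using SegmentFetcher =
    std::function<std::optional<std::string>(const std::string &)>;

// Hypothetical helper, NOT part of Trinity: keeps fetched segments in RAM
// and only calls the fetcher when a segment is not cached yet.
class CachingSegmentStore {
public:
    explicit CachingSegmentStore(SegmentFetcher fetcher)
        : fetcher_(std::move(fetcher)) {}

    // Returns the segment bytes, or nullptr if the backing store has no
    // segment with that name.
    std::shared_ptr<const std::string> get(const std::string &name) {
        if (auto it = cache_.find(name); it != cache_.end()) {
            ++hits_;
            return it->second;
        }
        ++misses_;
        std::optional<std::string> bytes = fetcher_(name);
        if (!bytes) {
            return nullptr;  // unknown segment; nothing is cached
        }
        std::shared_ptr<const std::string> segment =
            std::make_shared<std::string>(std::move(*bytes));
        cache_.emplace(name, segment);
        return segment;
    }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    SegmentFetcher fetcher_;
    std::unordered_map<std::string, std::shared_ptr<const std::string>> cache_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};
```

With RocksDB, the fetcher would presumably wrap a `rocksdb::DB::Get` call and return `std::nullopt` when the key is not found. The sketch has no eviction policy or locking; both would likely be needed before using something like this in a custom IndexSource.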

Thank you