Closed zachkinstner closed 11 years ago
This would be used for several purposes. At the broadest level, this database/cache would provide Fabric with a means for performing standard RDBMS operations with "tables", counts, sorting options, etc. For example:
Traversal queries could have a hook where this database/cache would take over for certain lookups. The lookup response would provide a list of node IDs, and the traversal query could use a retain step to filter based on those IDs.
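A minimal sketch of that hook idea, in plain Python rather than Gremlin. Everything here is hypothetical and for illustration only: `EXTERNAL_INDEX` stands in for whichever database/cache is chosen, and `lookup_ids` and `retain` are made-up helper names, not Fabric or Titan APIs.

```python
from typing import Dict, Iterable, Iterator, List, Set

# Hypothetical stand-in for the external database/cache: node type -> node IDs.
EXTERNAL_INDEX: Dict[str, Set[int]] = {
    "Factor": {2, 3, 5},
    "Member": {1, 4},
}


def lookup_ids(node_type: str) -> Set[int]:
    """Return the IDs of all nodes of the given type (empty set if unknown)."""
    return EXTERNAL_INDEX.get(node_type, set())


def retain(nodes: Iterable[dict], allowed_ids: Set[int]) -> Iterator[dict]:
    """Keep only the traversal results whose ID appears in the lookup response."""
    return (node for node in nodes if node["id"] in allowed_ids)


# Nodes reached by some graph traversal (illustrative data, not real Fabric nodes).
traversed: List[dict] = [{"id": 1}, {"id": 2}, {"id": 5}, {"id": 7}]

# The external lookup supplies the ID list; the traversal filters against it.
factor_nodes = list(retain(traversed, lookup_ids("Factor")))
```

The point of the split is that the graph database only walks edges, while the "which nodes are of type X" question is answered by the external store.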
Next steps are to investigate options for this database/cache. Initial ideas are Elasticsearch, Redis, or a separate Cassandra cluster/keyspace.
This idea may be premature. Investigate Titan indexing more closely. It seems (especially with the Elasticsearch integration) that there are better ways to handle Fabric's various indexing needs. Whichever way I go about it, there will be lots of denormalization. I like the idea of using Titan's transaction capabilities to ensure all related pieces (including all related direct/indirect indexes) are created correctly during an insert.
For now, Fabric will proceed without this additional complexity.
Based on this discussion and general experience with graph databases over the past several months, I think Fabric will need to implement an external database/cache. This database will be responsible for indexing data for global and sub-global (e.g. all "Factor" nodes) searches.