Closed breezykermo closed 3 weeks ago
As Deepti noted during our meeting yesterday, if we want to compare the 'standalone' setting against the 'distributed' setting, we should compare the state of the art (SoTA) for each setting, not the unoptimized baseline implementations listed above.
In other words:
SsdReplicated
should be supplemented (or replaced) by:
DramRandomPartitions
should be supplemented (or replaced) by either:
Neither of these architectures determines a particular index. Rather, each specifies how vectors should be distributed, routed, and then aggregated:
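To illustrate the split between architecture and index, here is a minimal Python sketch. It is not taken from our codebase: `RandomPartitions`, `BruteForceIndex`, and `make_index` are hypothetical names, and the exact-scan index is only a placeholder for whichever real index we plug in. The architecture decides where vectors live (distribute), which partitions a query visits (route), and how partial results are merged (aggregate), while the index is passed in as a factory.

```python
import heapq
import random
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Hit = Tuple[float, int]  # (squared L2 distance, vector id)


class BruteForceIndex:
    """Placeholder per-partition index (assumption): exact linear scan."""

    def __init__(self, ids: List[int], vectors: List[Vector]) -> None:
        self.ids = ids
        self.vectors = vectors

    def search(self, query: Vector, k: int) -> List[Hit]:
        scored = (
            (sum((a - b) ** 2 for a, b in zip(vec, query)), vid)
            for vid, vec in zip(self.ids, self.vectors)
        )
        return heapq.nsmallest(k, scored)


class RandomPartitions:
    """Hypothetical architecture: random sharding plus scatter-gather search."""

    def __init__(
        self,
        vectors: List[Vector],
        num_partitions: int,
        make_index: Callable[[List[int], List[Vector]], BruteForceIndex],
        seed: int = 0,
    ) -> None:
        rng = random.Random(seed)
        buckets: List[Tuple[List[int], List[Vector]]] = [
            ([], []) for _ in range(num_partitions)
        ]
        # Distribute: each vector goes to a uniformly random partition.
        for vid, vec in enumerate(vectors):
            ids, vecs = buckets[rng.randrange(num_partitions)]
            ids.append(vid)
            vecs.append(vec)
        # The index type is injected, so the architecture stays index-agnostic.
        self.partitions = [make_index(ids, vecs) for ids, vecs in buckets]

    def search(self, query: Vector, k: int) -> List[Hit]:
        # Route: with random placement, every partition must be queried.
        partial = [hit for part in self.partitions for hit in part.search(query, k)]
        # Aggregate: merge the per-partition top-k lists into a global top-k.
        return heapq.nsmallest(k, partial)
```

Swapping `BruteForceIndex` for another index with the same `search(query, k)` method would leave `RandomPartitions` unchanged, which is what lets us evaluate each architecture against several indexes.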
Thus we can test each architecture above with a range of different indexes, such as:
We should also work out what we want to measure in each of these experiments. In #18, for example, we are considering a throughput/latency graph for each architecture/index pair. It could also be interesting to look at other aspects of the approach, such as:
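For the throughput/latency measurement itself, a rough sketch of a measurement loop is below. The function name `measure` and the reported keys are assumptions for illustration, not the harness discussed in #18. It drives a single closed-loop client, so it yields one point per architecture/index pair; a full throughput/latency curve would need to sweep the offered load or the number of concurrent clients.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def measure(
    search: Callable[[Sequence[float], int], object],
    queries: List[Sequence[float]],
    k: int = 10,
) -> Dict[str, float]:
    """Run queries back to back and report throughput and latency percentiles."""
    latencies: List[float] = []
    start = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        search(query, k)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()
    # Nearest-rank style p99, clamped to the last element for small samples.
    p99_index = min(len(latencies) - 1, int(0.99 * len(latencies)))
    return {
        "qps": len(queries) / elapsed,
        "p50_s": statistics.median(latencies),
        "p99_s": latencies[p99_index],
    }
```

Any object exposing a `search(query, k)` callable can be passed in, so the same loop could be reused across architecture/index pairs.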