Closed: wnojopra closed 5 years ago
@melissachang Thanks for the reviews, I added the updates.
Thanks @melissachang , made the changes.
@melissachang I made the changes we discussed regarding how the samples data is structured.
@melissachang Thanks, I've included the updates.
Fixes #127.
Previously, for each table in the dataset we would push its indexed data up to Elasticsearch. Unfortunately, indexing seems to get slower with each additional table, because the existing data is re-indexed every time. The idea of this fix is to keep all table data in memory and push it up to Elasticsearch only once, after every table has been processed. Here we do it for both samples and participants.
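For reviewers, here is a minimal sketch of the accumulate-then-push idea. It is not the code in this PR: the function names, the `participant_id` key, and the `table.column` field naming are all assumptions made for illustration, and the tables are plain lists of dicts rather than real query results.

```python
from collections import defaultdict


def merge_tables(tables, id_field="participant_id"):
    """Merge rows from every table into one in-memory document per ID.

    `tables` maps a table name to a list of row dicts (hypothetical shape).
    """
    docs = defaultdict(dict)
    for table_name, rows in tables.items():
        for row in rows:
            doc_id = row[id_field]
            for column, value in row.items():
                if column != id_field:
                    # Prefix with the table name so columns from
                    # different tables cannot collide.
                    docs[doc_id][f"{table_name}.{column}"] = value
    return dict(docs)


def to_bulk_actions(docs, index_name):
    """Yield one bulk-style action per merged document."""
    for doc_id, source in docs.items():
        yield {"_index": index_name, "_id": doc_id, "_source": source}
```

With this shape, each document is written once instead of being updated once per table. The generated actions are in the dict form accepted by `elasticsearch.helpers.bulk(client, actions)`, assuming the official Python client is what does the push.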
Testing with the baseline CDR data finished fairly quickly (~30-40 mins) and used roughly 1.1 GB of additional memory.
Testing with the 1000 genomes data (to test samples) was also successful.