serratus-bio / open-virome

monorepo for data explorer UI and APIs
http://openvirome.com/
GNU Affero General Public License v3.0

Decide whether to include only runs with palmprints in DB queries #78

Open lukepereira opened 1 month ago

lukepereira commented 1 month ago

Currently the app fetches all data even if there are no viruses in a given run. If we want the app to be focused on characterizing viruses, we can exclude a good chunk of runs, which would make the app faster overall.

The downside is that it may be useful to display non-viral run metadata in the target vs. background set plots for certain queries.

If we decide to do this, the most performant approach would be to add a boolean column and index for palm_virome to sra. This was already done for the geo table.
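A minimal sketch of the flag-column approach, using Python's built-in sqlite3 as a stand-in for the real database. Only the table names sra and palm_virome come from this thread; the column name has_palmprint, the run key, and the table layouts are assumptions for illustration.

```python
import sqlite3

# In-memory stand-in for the real database (assumed schema, toy data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sra (run TEXT PRIMARY KEY, organism TEXT);
    CREATE TABLE palm_virome (run TEXT, palm_id TEXT);
    INSERT INTO sra VALUES ('SRR1', 'soil'), ('SRR2', 'gut'), ('SRR3', 'marine');
    INSERT INTO palm_virome VALUES ('SRR1', 'u1'), ('SRR1', 'u2'), ('SRR3', 'u3');
""")

# Add the flag column. SQLite has no native boolean, so 0/1 integers stand in.
# "has_palmprint" is a hypothetical column name.
conn.execute("ALTER TABLE sra ADD COLUMN has_palmprint INTEGER NOT NULL DEFAULT 0")

# Backfill: mark every run that appears at least once in palm_virome.
conn.execute("""
    UPDATE sra SET has_palmprint = 1
    WHERE run IN (SELECT run FROM palm_virome)
""")

# Index the flag so filtered queries do not need to scan the whole table.
conn.execute("CREATE INDEX idx_sra_has_palmprint ON sra (has_palmprint)")

viral_runs = [row[0] for row in conn.execute(
    "SELECT run FROM sra WHERE has_palmprint = 1 ORDER BY run")]
print(viral_runs)
```

The flag has to be refreshed whenever palm_virome changes, which is the main maintenance cost of this design compared with joining at query time.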

ababaian commented 2 weeks ago

I would say this is worth implementing; it will certainly boost performance. Is this stored as an index, or is it a separate column added to the table?

lukepereira commented 2 days ago

Previously, Alex implemented this by adding an extra column to the geo table.

We can test whether joining on the palm_virome index is fast enough; otherwise, we can add an extra column to the sra table.
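One rough way to set up that comparison, sketched with Python's sqlite3 and synthetic data. The schema, the has_palmprint column, the index names, and the row counts are all assumptions; timings from an in-memory SQLite database say little about the production database, so the same two queries would need to be run there to decide.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sra (run TEXT PRIMARY KEY, has_palmprint INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE palm_virome (run TEXT, palm_id TEXT);
    CREATE INDEX idx_palm_virome_run ON palm_virome (run);
""")

# Synthetic data: 50,000 runs, every 5th run has one palmprint.
conn.executemany("INSERT INTO sra (run) VALUES (?)",
                 ((f"SRR{i}",) for i in range(50_000)))
conn.executemany("INSERT INTO palm_virome VALUES (?, ?)",
                 ((f"SRR{i}", f"u{i}") for i in range(0, 50_000, 5)))

# Option B setup: backfill and index the hypothetical flag column.
conn.execute("UPDATE sra SET has_palmprint = 1 "
             "WHERE run IN (SELECT run FROM palm_virome)")
conn.execute("CREATE INDEX idx_sra_has_palmprint ON sra (has_palmprint)")

# Option A: filter at query time through the palm_virome index.
JOIN_QUERY = """
    SELECT COUNT(*) FROM sra
    WHERE EXISTS (SELECT 1 FROM palm_virome p WHERE p.run = sra.run)
"""
# Option B: filter on the precomputed flag.
FLAG_QUERY = "SELECT COUNT(*) FROM sra WHERE has_palmprint = 1"

def timed(query):
    """Run a query once and return (count, elapsed seconds)."""
    start = time.perf_counter()
    (count,) = conn.execute(query).fetchone()
    return count, time.perf_counter() - start

join_count, join_secs = timed(JOIN_QUERY)
flag_count, flag_secs = timed(FLAG_QUERY)
print(f"join: {join_count} runs in {join_secs:.4f}s")
print(f"flag: {flag_count} runs in {flag_secs:.4f}s")
```

Both queries should return the same set of runs, so the decision comes down to whether the query-time join is fast enough to avoid maintaining an extra column.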