databio / bbconf

Configuration package for bedbase project
https://pypi.org/project/bbconf/
BSD 2-Clause "Simplified" License

Making sure we have correct indexes on pipestat tables #19

Closed nsheff closed 8 months ago

nsheff commented 10 months ago

Right now, the bedhost-ui is issuing a bunch of GET requests to /bed/{md5sum}/file/{id}.

These were very slow (like, several seconds per query). I realized they were each doing a database query on the md5sum attribute. So, I just added an index on the bedfiles table for the md5sum column, and now it's like lightning.
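For anyone following along, here is a minimal sketch of what adding that kind of index does to the query plan. It uses an in-memory SQLite database purely so it runs anywhere, which is an assumption (the real backend may differ). The table layout, the index name `idx_bedfiles_md5sum`, and the helper `query_plan` are all hypothetical; only the table name `bedfiles` and the column `md5sum` come from the comment above.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bedfiles (id INTEGER PRIMARY KEY, md5sum TEXT, name TEXT)")

# Fill the table with fake rows; the md5sum values are just zero-padded hex numbers.
conn.executemany(
    "INSERT INTO bedfiles (md5sum, name) VALUES (?, ?)",
    [(f"{i:032x}", f"file_{i}") for i in range(1000)],
)


def query_plan(connection, md5sum):
    """Return SQLite's plan description for a lookup by md5sum (hypothetical helper)."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM bedfiles WHERE md5sum = ?", (md5sum,)
    ).fetchall()
    # The fourth column of each row holds the human-readable plan detail.
    return " ".join(row[3] for row in rows)


target = f"{42:032x}"

# Without an index, the lookup has to scan the whole table.
plan_before = query_plan(conn, target)

# Add the index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_bedfiles_md5sum ON bedfiles (md5sum)")

# With the index, the planner can search it instead of scanning.
plan_after = query_plan(conn, target)

print("before:", plan_before)
print("after: ", plan_after)
```

The "before" plan should describe a full table scan, and the "after" plan should mention the new index. On a large table, that difference is what turns a multi-second query into a fast one.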

I guess this is for the old schema. In the new bbconf, we will need to make sure we can put the correct indexes on the pipestat tables.

nsheff commented 8 months ago

@donaldcampbelljr does pipestat have a mechanism whereby you can use the schema to declare which columns (attributes) should have indexes? I think what makes most sense is:

No manual work needs to be done.
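To make the idea concrete, here is a rough sketch of schema-declared indexes. It is not pipestat's actual API: the schema layout, the `index` key, the table and column names, and the function `create_table_with_indexes` are all hypothetical, and SQLite is used only to keep the example self-contained.

```python
import sqlite3

# Hypothetical schema: the "index" flag is an assumed keyword for illustration,
# not something pipestat is known to support.
schema = {
    "table": "results",
    "columns": {
        "record_identifier": {"type": "TEXT", "index": True},
        "name": {"type": "TEXT"},
        "genome": {"type": "TEXT", "index": True},
    },
}


def create_table_with_indexes(connection, schema):
    """Create the table, then one index per column flagged with "index"."""
    table = schema["table"]
    column_defs = ", ".join(
        f"{name} {spec['type']}" for name, spec in schema["columns"].items()
    )
    connection.execute(f"CREATE TABLE IF NOT EXISTS {table} ({column_defs})")

    created = []
    for name, spec in schema["columns"].items():
        if spec.get("index"):
            index_name = f"idx_{table}_{name}"
            # IF NOT EXISTS keeps the call safe to repeat on every startup.
            connection.execute(
                f"CREATE INDEX IF NOT EXISTS {index_name} ON {table} ({name})"
            )
            created.append(index_name)
    return created


conn = sqlite3.connect(":memory:")
created_indexes = create_table_with_indexes(conn, schema)

# Confirm the indexes exist by asking SQLite's catalog.
existing = {
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
}
print(created_indexes)
```

Because the index statements are derived from the schema and are idempotent, they can run whenever the table is set up, so nobody has to add indexes by hand.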

I think in the case of bedbase, we no longer really need this, since the md5sum column was removed (#23) and all the selects are correctly happening on the record identifier...

But still, this could be a good idea for pipestat. So I'll close this issue here and move the idea to pipestat.