ebi-pf-team / genome-properties

GNU General Public License v3.0

Bulk upload to the Viewer #62

Open j-kominek opened 3 years ago

j-kominek commented 3 years ago

Hi, I am working on a large-scale genomics study of 1000+ taxa, and we are looking to use genome-properties as an overview tool for metabolic and other insights. We have run InterProScan on our dataset and also got genome-properties to run locally on the output (since it's easier to upload a 60 KB SUMMARY file than a 400 MB TSV), but we are still limited to uploading our annotations one by one into the Viewer, which is cumbersome when you're dealing with 1000+ species. Is it possible to do bulk uploads into the Viewer, or is it something that could be added to the website? Or could we somehow clone and run the Viewer on our end, to benefit from the great visualization and step-zoom-in features? Thank you in advance!

Best, -Jacek Kominek

LornaMGnify commented 3 years ago

Hi Jacek, apologies for not responding sooner. I'm afraid we don't offer bulk uploads to the Viewer at present. This is something we would like to implement, but unfortunately we do not currently have the funding for active development of the interface. We have, however, taken note of your suggestion for when we are able to pursue further development. Thanks, Lorna.

LeeBergstrand commented 3 years ago

@j-kominek Check out my library, Pygenprop. It parses InterProScan 5 (IPR5) TSVs, assigns properties and steps, and adds these assignments to DataFrames that can be used to compare the presence and absence of properties across 1000+ organisms. Afterwards, you can plug the DataFrame into something like Seaborn or Bokeh for visual analysis. Note that loading 1000+ genomes can take a while: the last time I analyzed a dataset of that scale with Pygenprop, it took 30-40 min to parse the files and generate a DataFrame.

A tutorial can be found here: https://github.com/Micromeda/pygenprop/blob/master/docs/source/_static/tutorial/tutorial.ipynb
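
To give a rough idea of the comparison step itself, here is a minimal sketch in plain Python. It does not use Pygenprop's API (see the tutorial for that); the genome names, property IDs, and the `build_matrix`, `differing_properties`, and `to_csv` helpers are all made up for illustration, and it assumes you already have per-genome YES/PARTIAL/NO assignments in memory.

```python
import csv
import io

# Hypothetical per-genome assignments: genome name -> {property ID: result}.
# These IDs and values are invented for illustration, not real results.
ASSIGNMENTS = {
    "genome_a": {"GenProp0001": "YES", "GenProp0002": "NO"},
    "genome_b": {"GenProp0001": "YES", "GenProp0002": "PARTIAL"},
    "genome_c": {"GenProp0001": "YES"},  # GenProp0002 not reported
}


def build_matrix(assignments, missing="NO"):
    """Return (property_ids, rows): one row per genome, one column per property.

    Properties a genome does not report are filled with `missing`
    (treating "not reported" as "NO" is an assumption of this sketch).
    """
    property_ids = sorted({p for props in assignments.values() for p in props})
    rows = {
        genome: [props.get(p, missing) for p in property_ids]
        for genome, props in assignments.items()
    }
    return property_ids, rows


def differing_properties(property_ids, rows):
    """Keep only the properties whose assignment varies between genomes."""
    return [
        prop
        for i, prop in enumerate(property_ids)
        if len({values[i] for values in rows.values()}) > 1
    ]


def to_csv(property_ids, rows):
    """Serialize the matrix as CSV text with a leading 'genome' column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["genome", *property_ids])
    for genome, values in rows.items():
        writer.writerow([genome, *values])
    return buffer.getvalue()


property_ids, rows = build_matrix(ASSIGNMENTS)
print(differing_properties(property_ids, rows))
print(to_csv(property_ids, rows))
```

The CSV text can be written to disk and loaded with `pandas.read_csv` if you want a DataFrame for plotting. Filtering down to the properties that differ keeps a heatmap readable when there are 1000+ rows.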