AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

GEO meta DB availability #3247

Open arkid15r opened 1 year ago

arkid15r commented 1 year ago

Context

To process all platform datasets properly, we need a GEO metadata DB (GEOmetadb) source to be available. We recently found that one is hosted at https://gbnci.cancer.gov/geo/GEOmetadb.sqlite.gz

Problem or idea

Add the DB file to the Docker image so that GEO metadata is immediately accessible.

Future improvements: come up with a long-term solution that obtains the DB file at run time instead of storing it in the Docker image.
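
A rough sketch of what the run-time option could look like, in Python: download the gzipped file on first use, decompress it to a local cache path, and reuse that copy afterwards. Only the URL comes from this issue. The function name `ensure_geometadb`, the `.part` temp-file convention, and the cache path in the usage note are assumptions for illustration, not existing refinebio code.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

# URL taken from this issue; everything else here is a hypothetical sketch.
GEOMETADB_URL = "https://gbnci.cancer.gov/geo/GEOmetadb.sqlite.gz"


def ensure_geometadb(dest, url=GEOMETADB_URL):
    """Return the path to a decompressed DB file, downloading it if missing."""
    dest = Path(dest)
    if dest.exists():
        # A copy is already cached; skip the download.
        return dest

    dest.parent.mkdir(parents=True, exist_ok=True)
    # Stream into a sibling ".part" file and rename at the end, so an
    # interrupted download never leaves a truncated DB at `dest`.
    tmp_path = dest.with_name(dest.name + ".part")
    try:
        with urllib.request.urlopen(url) as response, \
                gzip.GzipFile(fileobj=response) as unzipped, \
                open(tmp_path, "wb") as out:
            shutil.copyfileobj(unzipped, out)
    except BaseException:
        tmp_path.unlink(missing_ok=True)
        raise
    tmp_path.replace(dest)
    return dest
```

A worker could call something like `ensure_geometadb("/home/user/data_store/GEOmetadb.sqlite")` at startup (the path is made up) and then open the result with `sqlite3.connect`. This keeps the image small, at the cost of a network dependency on the first run and no guarantee that every worker sees the same DB version.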

Solution or next step