mc2-center / pubmed-crawler

PubMed Crawler for CCKP publication manifest
Creative Commons Zero v1.0 Universal

Check out `ffq` tool for metadata retrieval from GEO/SRA #21

Open jaeddy opened 1 year ago

jaeddy commented 1 year ago

Not sure if we're currently doing any semi-automated dataset metadata retrieval from NCBI, but this tool from the Pachter lab might be useful: https://academic.oup.com/bioinformatics/article/39/1/btac667/6971839

If nothing else, having a clear relational model between NCBI repositories and entities (which is not well documented) could help with our approach to crawling.
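As a rough illustration of what such a relational model could start from, here is a minimal Python sketch that maps an accession to its NCBI repository and entity type by prefix. It is not how `ffq` works internally; the function name `classify_accession` and the table `ACCESSION_PATTERNS` are hypothetical, and the prefix list is intentionally not exhaustive.

```python
import re

# Hypothetical lookup table: accession pattern -> (repository, entity type).
# Covers common GEO, SRA (NCBI/EBI/DDBJ mirrors), BioProject and BioSample
# prefixes only; other accession types are deliberately left out.
ACCESSION_PATTERNS = [
    (r"GSE\d+", ("GEO", "series")),
    (r"GSM\d+", ("GEO", "sample")),
    (r"[SED]RP\d+", ("SRA", "study")),
    (r"[SED]RS\d+", ("SRA", "sample")),
    (r"[SED]RX\d+", ("SRA", "experiment")),
    (r"[SED]RR\d+", ("SRA", "run")),
    (r"PRJ(NA|EB|DB)\d+", ("BioProject", "project")),
    (r"SAM(N|EA|D)\d+", ("BioSample", "sample")),
]


def classify_accession(accession):
    """Return (repository, entity) for a known accession, else None."""
    acc = accession.strip().upper()
    for pattern, label in ACCESSION_PATTERNS:
        if re.fullmatch(pattern, acc):
            return label
    return None
```

A crawler could use a lookup like this to decide which repository to query for each accession found in a publication, before resolving the links between entities (for example, from a GEO series down to its SRA runs).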


cc @milen-sage, @vpchung, @lakikowolfe, @miekohash

milen-sage commented 1 year ago

Who owns this repo? Could we add the FAIR Data team to it (I can't currently tag people's usernames here)?

There are a couple of people on FAIR Data who are reviewing different strategies for extracting and inferring data, and this issue would be relevant for them.

vpchung commented 1 year ago

@milen-sage I can add them - let me know which usernames to add.

vpchung commented 1 year ago

@jaeddy would you say this is related to #4? Maybe it even replaces the originally suggested approach?

milen-sage commented 1 year ago

@vpchung could you add @GiaJordan and @mialy-defelice? Thanks!