Open pauleanderson opened 4 years ago
I am currently working on this. You can start by pulling the two wilmarth repos and following the steps with test_files, OR go ahead and start working on how to make it fit our framework better. I would suggest making a list of things the user can supply and what we can have pre-downloaded or download later on the fly. As an example, this API, which is still being developed, can grab files for a species: https://www.ncbi.nlm.nih.gov/datasets . Also check the organism box at BLASTp: https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins . Our default at the start will be "some_species" vs. human, but in the future, adding non-human targets should work and would be awesome.
You will need the .dat file, which is linked here because it is 100 MB. It is something that we would just leave on the server and maybe update occasionally. https://we.tl/t-I8v7DrEWBo
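To make the "user supplies vs. pre-downloaded vs. download on the fly" split concrete, here is a rough Python sketch. Everything in it is hypothetical: the class name `RunInputs`, its fields, the `server_cache` directory, and the `plan_file` helper are illustrative only and do not come from the wilmarth repos or our framework.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple


@dataclass
class RunInputs:
    """Hypothetical container for one run; field names are illustrative."""
    query_species: str                      # supplied by the user
    target_species: str = "human"           # default comparison at the start
    query_fasta: Optional[Path] = None      # user may supply their own file
    cache_dir: Path = Path("server_cache")  # assumed location of server-side files


def plan_file(inputs: RunInputs, filename: str) -> Tuple[str, Path]:
    """Decide where a needed file comes from.

    Returns (source, path), where source is 'user', 'cache', or 'download'.
    """
    # A file the user handed us takes priority.
    if inputs.query_fasta is not None and inputs.query_fasta.name == filename:
        return "user", inputs.query_fasta
    # Otherwise use a copy already sitting on the server (e.g. the large .dat file).
    cached = inputs.cache_dir / filename
    if cached.exists():
        return "cache", cached
    # Nothing local: the caller would fetch it on the fly and save it here.
    return "download", cached
```

The idea is only that each required file gets classified once, so large files like the .dat stay on the server and everything else is either user-supplied or fetched when needed.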
Other useful APIs: EBI https://www.ebi.ac.uk/proteins/api/doc/#/taxonomy and UniProt https://www.uniprot.org/help/programmatic_access
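As a possible starting point for the UniProt side, the sketch below only builds a search URL for all proteins of one organism in FASTA format. It assumes the UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/search` with `query`, `format`, and `size` parameters; please check this against the programmatic access page linked above before relying on it. The function name `uniprot_fasta_url` and its arguments are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the UniProt programmatic access docs.
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def uniprot_fasta_url(taxon_id: int, reviewed_only: bool = True, size: int = 500) -> str:
    """Build a UniProt search URL for one NCBI taxonomy ID (hypothetical helper)."""
    query = f"organism_id:{taxon_id}"
    if reviewed_only:
        # Restrict to reviewed (Swiss-Prot) entries.
        query += " AND reviewed:true"
    params = {"query": query, "format": "fasta", "size": size}
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


# 9606 is the NCBI taxonomy ID for human, our default target.
print(uniprot_fasta_url(9606))
```

The returned URL can be passed to whatever HTTP client we end up using. Results are paged, so a full proteome would need either paging or a bulk download instead.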
Eventually you will see that the goal is to get to this UniProt-type format.