cultivarium / GenomeSPOT

Predict oxygen, temperature, salinity, and pH preferences of bacteria and archaea from a genome
https://cultivarium.org/
MIT License

Predict Physical and Chemical Conditions from a Genome #1

Closed. tylerbarnum closed 6 months ago

tylerbarnum commented 7 months ago

This PR enables the use of the models to predict growth conditions for bacteria and archaea. It does not yet allow the user to reproduce the development or evaluation of the models.

To test quickly, run the following; it should print a result and produce the file test.predictions.tsv:

git clone -b tyler-genomic-features https://github.com/cultivarium/predict-media-physicochemistry.git
cd predict-media-physicochemistry
python -m venv venv
source venv/bin/activate
pip install -r requirements
python src/predict_physicochemistry.py -c tests/test_data/GCA_000172155.1_ASM17215v1_genomic.fna.gz -p tests/test_data/GCA_000172155.1_ASM17215v1_protein.faa.gz -o test.predictions.tsv
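
To confirm that the output file was produced and to look at its contents, a minimal Python sketch along these lines could be used. It assumes only that test.predictions.tsv is tab-separated with a header row; the helper name load_predictions is hypothetical and not part of the repository, and no column names are assumed.

```python
import csv
from pathlib import Path


def load_predictions(path):
    """Read a tab-separated file with a header row into a list of dicts.

    Hypothetical helper, not part of the repository. It assumes only that
    the file is a TSV with a header line; no column names are assumed.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"expected output file not found: {path}")
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


if __name__ == "__main__":
    out = Path("test.predictions.tsv")
    if out.is_file():
        # Print each row as a mapping of header name to cell value.
        for row in load_predictions(out):
            print(row)
    else:
        print(f"{out} not found; run the prediction command first")
```

Each row comes back as a mapping from header name to string value, so the script works without knowing the actual columns in advance.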

As noted in the README, you can also check out the notebook notebooks/tutorial.ipynb to see how individual functions work. This could be a helpful way for you and users to understand the code before diving into the codebase. In particular, the code that computes traits from the BacDive download will make a lot more sense once you see that the data is a wacky nested dictionary.
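
To illustrate why a nested record needs dedicated parsing code, here is a minimal sketch that flattens a nested dictionary into dotted key paths. The record below is invented for illustration and does not reproduce the actual BacDive schema; flatten is a hypothetical helper, not a function from this repository.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into a {dotted.path: value} mapping."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)  # list positions become path segments
    else:
        return {prefix: obj}  # leaf value: stop recursing

    flat = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


# Invented example record; NOT the real BacDive layout.
record = {
    "growth": {
        "temperature": [
            {"type": "optimum", "value": 37},
            {"type": "growth", "value": 30},
        ],
        "ph": {"type": "optimum", "value": 7.0},
    }
}

for path, value in flatten(record).items():
    print(path, "=", value)
```

Listing the flattened paths makes it easier to see which fields appear as single values and which appear as lists of entries, which is the kind of irregularity the trait-parsing code has to handle.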

Yet to come:

Other considerations:

knightjdr commented 7 months ago

For @alexcritschristoph, change pip install -r requirements to pip install -r requirements.txt in the instructions above. Technically also for @tylerbarnum, but want to make sure Alex sees.

knightjdr commented 7 months ago

Needs a .gitignore. A suitable one is here: https://gist.github.com/knightjdr/dc89bd33a4b21546c15822034871580b.

knightjdr commented 7 months ago

Issues:

tylerbarnum commented 7 months ago

@knightjdr Any strong feelings on bumping the Python version closer to current? I'm on 3.8 because a package that I no longer use required it.

knightjdr commented 7 months ago

> @knightjdr Any strong feelings on bumping the Python version closer to current? I'm on 3.8 because a package that I no longer use required it.

I would specify it at >= 3.8. Not everyone will be using a newer version, so it's good to provide some flexibility here. Compatibility with newer versions shouldn't be an issue.
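
As a sketch of how a minimum-version floor like this could be enforced at runtime, the check below compares the running interpreter against (3, 8). The function name and error message are illustrative assumptions, not code from this PR. In packaging metadata, the equivalent declaration would be python_requires=">=3.8" in a setuptools setup() call.

```python
import sys

MIN_PYTHON = (3, 8)  # floor proposed in this thread


def check_python_version(version_info=None, minimum=MIN_PYTHON):
    """Raise RuntimeError if the interpreter is older than `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); newer versions pass unchanged.
    if tuple(version_info[:2]) < tuple(minimum):
        found = ".".join(str(part) for part in version_info[:2])
        needed = ".".join(str(part) for part in minimum)
        raise RuntimeError(f"Python >= {needed} is required, found {found}")


if __name__ == "__main__":
    check_python_version()
    print("Python version OK")
```

Using a lower bound rather than pinning an exact version keeps the tool usable for people on 3.8 while allowing newer interpreters.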

tylerbarnum commented 6 months ago

Adding these from Alex to "yet to come" (next PR):