DeepRank / DeepRank-Mut

Deep learning framework to predict functional effects of missense variants in human
Apache License 2.0

Links the variants to epoch data #17

Closed cbaakman closed 2 years ago

cbaakman commented 2 years ago

The DeepRank learning algorithm generates a file named epoch_data.hdf5 that contains predictions and target values but little information about the variants themselves. This change increases the amount of information stored with the output: an additional HDF5 group called "variants" is added for every epoch. It holds the variant information, such as structure, residue number, amino acid change, and protein accession.
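
For anyone who wants to check where the new groups end up, here is a minimal sketch that walks an HDF5-like tree and lists every group named "variants". It does not assume any particular nesting of epochs inside epoch_data.hdf5, since the exact layout is not spelled out here. The helper names (`list_paths`, `find_variant_groups`, `variant_groups_in_file`) are made up for illustration and are not part of DeepRank-Mut.

```python
def list_paths(node, prefix=""):
    """Yield (path, is_group) for every entry below `node`.

    `node` can be an open h5py.File / h5py.Group or a plain nested dict;
    anything that exposes .items() is treated as a group, everything
    else as a dataset.
    """
    for name, child in node.items():
        path = f"{prefix}/{name}"
        is_group = hasattr(child, "items")
        yield path, is_group
        if is_group:
            yield from list_paths(child, path)


def find_variant_groups(root):
    """Return the paths of all groups whose own name is 'variants'."""
    return [
        path
        for path, is_group in list_paths(root)
        if is_group and path.rsplit("/", 1)[-1] == "variants"
    ]


def variant_groups_in_file(path="epoch_data.hdf5"):
    """Hypothetical convenience wrapper; needs h5py installed."""
    import h5py  # imported lazily so the helpers above work without it

    with h5py.File(path, "r") as f:
        return find_variant_groups(f)
```

Calling `variant_groups_in_file()` on an output file should list one "variants" path per epoch if the file was written with this change; what is stored inside each group is best inspected directly rather than guessed.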

cbaakman commented 2 years ago

Thanks Coos. I have one query: is preprocess_bioprodict.py a unifying script for both PSSMs and bioprodict conservation scores? It looks to me like they are combined.

No, it is not. In this change I only added the protein accession to the PdbVariantSelection object. The work that turns preprocess_bioprodict.py into a unifying script is still on a separate branch: https://github.com/DeepRank/DeepRank-Mut/tree/conservation-features