cbaakman closed 2 years ago
Thanks Coos. I have one question: is preprocess_bioprodict.py a unifying script for both PSSMs and bioprodict conservation scores? From what I can see, the two look combined?
No, it is not. I only added the protein accession to the PdbVariantSelection object. The part where preprocess_bioprodict.py is a unifying script is still in this branch: https://github.com/DeepRank/DeepRank-Mut/tree/conservation-features
The deeprank learning algorithm generates a file named epoch_data.hdf5, which contains predictions and target values but little information about the variant itself. In this change, I increased the amount of information stored with the output. An additional hdf5 group called "variants" is added for every epoch. It holds the variant information: structure, residue number, amino acid change, and protein accession.
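To make the layout concrete, here is a minimal sketch of how a per-epoch "variants" group could be written and read back with h5py. It is not the code from this change: the epoch group name (`epoch_0000`), the dataset names, the helper functions, and the example values are all assumptions made for illustration, and every field is stored as a string for simplicity.

```python
import os
import tempfile

import h5py

# Hypothetical dataset names; the actual layout in epoch_data.hdf5 may differ.
VARIANT_FIELDS = ("structure", "residue_number", "amino_acid_change", "protein_accession")


def write_epoch_variants(hdf5_path, epoch_number, variants):
    """Write one string dataset per variant field under <epoch>/variants."""
    string_type = h5py.string_dtype(encoding="utf-8")
    with h5py.File(hdf5_path, "a") as f5:
        # The "epoch_%04d" naming is an assumption, not taken from the PR.
        epoch_group = f5.require_group("epoch_%04d" % epoch_number)
        variants_group = epoch_group.require_group("variants")
        for field in VARIANT_FIELDS:
            values = [str(variant[field]) for variant in variants]
            variants_group.create_dataset(field, data=values, dtype=string_type)


def read_epoch_variants(hdf5_path, epoch_number):
    """Read the variants of one epoch back as a list of dictionaries."""
    with h5py.File(hdf5_path, "r") as f5:
        variants_group = f5["epoch_%04d/variants" % epoch_number]
        columns = {
            field: [str(value) for value in variants_group[field].asstr()[()]]
            for field in VARIANT_FIELDS
        }
    count = len(columns[VARIANT_FIELDS[0]])
    return [{field: columns[field][i] for field in VARIANT_FIELDS} for i in range(count)]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "epoch_data.hdf5")
    # Invented example values, only to show the round trip.
    example = {
        "structure": "1abc",
        "residue_number": 25,
        "amino_acid_change": "ALA->VAL",
        "protein_accession": "P00000",
    }
    write_epoch_variants(path, 0, [example])
    print(read_epoch_variants(path, 0))
```

Storing each field as its own dataset keeps the rows aligned by index, so the i-th variant can be matched to the i-th prediction and target value of the same epoch, assuming those are stored in the same order.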