jamesgleave / DD_protocol

Official repository for the Deep Docking protocol
MIT License

File doesn't exist #5

Closed FebsN0 closed 1 year ago

FebsN0 commented 1 year ago

https://github.com/jamesgleave/DD_protocol/blob/9c842f6d946a97a5018c899993c70712cbc095fe/scripts_2/simple_job_models.py#L80

Hello, I don't understand where "Mol_ct_file_%s.csv" is generated. I have checked all your scripts and didn't find any command that creates such a file. Should the variable t_mol just be the total number of compounds in the general database (e.g. ZINC20)? Supposing the database has 1 billion compounds, is t_mol supposed to be 1000? If not, why divide by 1 million?

FebsN0 commented 1 year ago

OK, sorry, I have solved it. I hadn't figured out that the file is actually this one: https://files.docking.org/zinc20-ML/fingerprints_count.txt. Only the file name was different.

fgentile89 commented 1 year ago

Mol_ct_file_x.csv, where x is the name of your project, is generated in molecular_file_count_updated.py, line 70. The "updated" file, which holds the number of molecules to sample per SMILES file, is then generated from it (see the lines after 70 in the same script). In the first iteration this is done in your fingerprint directory; from iteration 2 onward it is done in the morgan_1024_predictions folder of the previous iteration.
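
For readers following along, the two steps described above (count the molecules in each file, then derive how many to sample from each) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in molecular_file_count_updated.py: the function names, the `*.txt` pattern, the one-molecule-per-line layout, and the proportional split are all hypothetical choices made for the example.

```python
import csv
from pathlib import Path


def count_molecules(directory, pattern="*.txt"):
    """Return {file name: number of non-empty lines} for each matching file.

    Assumption: one molecule per line, which may not match the real files.
    """
    counts = {}
    for path in sorted(Path(directory).glob(pattern)):
        with path.open() as handle:
            counts[path.name] = sum(1 for line in handle if line.strip())
    return counts


def write_counts(counts, out_path):
    """Write one 'file name, molecule count' row per file (hypothetical layout)."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for name, n_molecules in counts.items():
            writer.writerow([name, n_molecules])


def sample_sizes(counts, n_to_sample):
    """Split n_to_sample across files in proportion to their molecule counts.

    Rounding means the per-file sizes may not sum exactly to n_to_sample.
    """
    total = sum(counts.values())
    if total == 0:
        return {name: 0 for name in counts}
    return {
        name: min(n, round(n_to_sample * n / total))
        for name, n in counts.items()
    }
```

With this sketch, you would call `count_molecules` on the fingerprint directory (or on the previous iteration's morgan_1024_predictions folder), pass the result to `write_counts` to produce the count file, and then use `sample_sizes` to get a per-file sampling budget.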

FebsN0 commented 1 year ago

Oh, perfect! I don't know why Mol_ct_file_x.csv was not generated when I ran that code. Thank you very much!