drugdata / D3R

Drug Design Data Resource is a suite of software to enable filtering, docking, and scoring of new sequences from wwpdb.

2018 week 5 challenge package was made empty #175

Open j-wags opened 6 years ago

j-wags commented 6 years ago

This is because stage.2.dataimport downloaded an empty Components-inchi file. I fixed this by copying in the Components-inchi file from 2018 week 4 and rerunning on both staging and production. The original folders are still in place with the suffix ".orig".

I don't know why this happened, or if it will happen again next week.

coleslaw481 commented 6 years ago

It seems the RCSB has a problem. The file is empty on the server (http://ligand-expo.rcsb.org/dictionaries/Components-inchi.ich) and the code does not detect this as an error. Should blastnfilter just error out if this file is 0 bytes, or should we try to copy it from a previous week?
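The "error out if the file is 0 size" option could look roughly like the sketch below. It is only an illustration: the function name `check_nonempty` and the exception `EmptyDownloadError` are hypothetical and are not taken from the D3R code base, and the sketch assumes the download step has already written the file to a local path.

```python
import os


class EmptyDownloadError(Exception):
    """Raised when a downloaded file is missing or has zero bytes."""


def check_nonempty(path):
    """Return the size of path in bytes; raise if it is missing or empty."""
    if not os.path.isfile(path):
        raise EmptyDownloadError("%s does not exist" % path)
    size = os.path.getsize(path)
    if size == 0:
        # A successful transfer can still deliver an empty file, so the
        # size has to be checked explicitly after the download finishes.
        raise EmptyDownloadError("%s is 0 bytes" % path)
    return size
```

Calling something like this right after the download would turn a silently empty Components-inchi file into a visible failure instead of an empty challenge package.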

coleslaw481 commented 6 years ago

I sent an email to the contact email on RCSB website informing them of the empty file.

coleslaw481 commented 6 years ago

Just got this reply from rcsb:

Dear Chris -- Thank you for your message and sorry for the inconvenience. We are experiencing an issue with the ligand tool, and we are in the stage of fixing it.

mkgilson commented 6 years ago

Hi Chris,

How does BnF use this file?

regards Mike

(In reply to Chris Churas's comment of 1/29/2018, 6:46 AM.)

coleslaw481 commented 6 years ago

It looks like the Ligand class in D3R/src/blast/ligand.py uses InChI strings from the Components-inchi.ich file to create an RDKit molecule (rdmol) object for the ligand. The Ligand class then uses this rdmol object for analysis in methods such as:

set_heavy_size() -- gets the number of heavy atoms
set_size() -- gets the number of atoms
set_rot() -- gets the number of rotatable bonds
mcss(ref) -- finds the MCSS between the ligand and a ref molecule
calc_tanimoto(ref) -- calculates the Tanimoto similarity score
set_symmetry() -- calculates symmetry using the OpenEye OEMCSSearch function

mkgilson commented 6 years ago

For the new structure from the FTP file, we have the inchi string and don't need to access components, right? So this would be only for ligands already in the PDB, which we are using to generate candidate structures. Right?

thanks Mike

(In reply to Chris Churas's comment of 1/29/2018, 9:33 AM.)

j-wags commented 6 years ago

Hi Mike -- Yes. It's for the candidate selection.

Thanks for looking into this, Chris.

Given that this error will occur on weekends and we might not catch it, the proper behavior in this case is probably to just copy the previous week's Components-inchi file. At the same time, it would be good to somehow record when the file has been copied forward for multiple weeks in a row, so we'll have evidence if something is broken long-term.
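A minimal sketch of that fallback, under stated assumptions: the function name `fill_from_previous` and the `.reused` marker file are hypothetical, not part of D3R, and the sketch assumes each week's Components-inchi file lives at its own path. The marker file stores how many weeks in a row the file has been carried forward.

```python
import os
import shutil


def fill_from_previous(current_path, previous_path, marker_suffix=".reused"):
    """Reuse last week's file if this week's is missing or empty.

    Returns 0 if this week's file is usable as downloaded, otherwise the
    number of consecutive weeks the file has now been carried forward.
    """
    if os.path.isfile(current_path) and os.path.getsize(current_path) > 0:
        return 0
    if not os.path.isfile(previous_path) or os.path.getsize(previous_path) == 0:
        raise IOError("no usable previous copy at %s" % previous_path)
    # If last week's file was itself a carried-forward copy, extend its streak.
    streak = 1
    previous_marker = previous_path + marker_suffix
    if os.path.isfile(previous_marker):
        with open(previous_marker) as handle:
            streak = int(handle.read().strip()) + 1
    shutil.copyfile(previous_path, current_path)
    # Leave a marker next to the copy so long-running breakage stays visible.
    with open(current_path + marker_suffix, "w") as handle:
        handle.write("%d\n" % streak)
    return streak
```

A return value above some threshold (say, 2 or 3) could then trigger a warning email, so a weekend failure is handled automatically while a persistent upstream problem still gets noticed.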

mkgilson commented 6 years ago

agreed.

(In reply to j-wags's comment of 1/29/2018, 9:43 AM.)

j-wags commented 6 years ago

For recordkeeping purposes -- The same error did not happen again this week. The PDB must have fixed their download page.