VuisterLab / cing

Automated Validation of NMR Structures
http://nmr.le.ac.uk

Unannealed (failed) models passed to analyze: CING stops #332

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
In cing/python/Refine/refine.py/def fullAnneal:
Models that failed to anneal (def anneal) and thus did not produce a resulting 
.pdb file are still passed to analyze (def analyze) where the .pdb is of course 
not found. CING stops.

The error is not specific to a single PDB entry. Below is an example for 2lqo, 
but the same error was found for e.g. 2ksr, 2kt7, 2kt9, 2kum, 2kv3, 2kvv, 2kxg, 
2kzh, 2l22, 2l33, 2lez, 2lf7, 2lf8, 2lfc, 2lg1, 2ls8, 2lsb, 2xnf, 4a5v, 4aqz, 
4ar0.

-- anneal --
Doing 200 new process(es)
...
Doing command: 2lqo.cing/Refine/2lqo_redo/Jobs/anneal_72.csh
ERROR: Failed to find resulting file: 
2lqo.cing/Refine/2lqo_redo/Annealed/2lqo_072.pdb
Subprocess will do error exit with exit code 1
Process with pid [1020] exited with status [256]
WARNING: Process with fid: [72] considered NOT done
...
Doing command: 2lqo.cing/Refine/2lqo_redo/Jobs/anneal_199.csh
DEBUG: Subprocess will do normal exit with exit code 0
Process with pid [9277] exited with status [0]
Process with fid: [199] considered done
Finished 195 out of the 200 processes successfully
Finished ids: [<LIST OF 195 SUCCESSFULLY FINISHED IDs>]
...
-- analyze --
DEBUG: inPath:         Annealed
DEBUG: modelList:      [LIST OF ALL 200 IDs]
ERROR: path "2lqo.cing/Refine/2lqo_redo/Annealed/2lqo_072.pdb" does not exist
CING started at : Tue Nov 27 12:08:18 2012
CING stopped at : Tue Nov 27 16:30:34 2012
CING took       : 15735.540 s

CING does not stop in def anneal, because the body of the if statement in:

done_list = f.forkoff_start(job_list, 0) # delay 0 second between jobs.
nTmessage("Finished ids: %s", done_list)
if not done_list:
    notDone = parameters.modelCountAnneal - len(done_list) 
    nTerror("Failed to anneal %s", notDone)
    return True

is never executed: done_list is not empty (it is merely shorter than expected).
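
To illustrate the point: a non-empty list is truthy in Python, so `if not done_list` only fires when no job finished at all. The following is a minimal sketch, not the CING code; `count_failed` is a hypothetical helper, and the IDs are the 2lqo values from the log above.

```python
def count_failed(done_list, model_count):
    """Return how many of model_count jobs are missing from done_list."""
    return model_count - len(done_list)


# 195 of the 200 jobs (IDs 0..199) finished, as in the 2lqo log.
done_list = [i for i in range(200) if i not in (72, 77, 93, 110, 170)]

# The original check: a non-empty list is truthy, so this stays False
# and the error branch is skipped.
original_check_fires = not done_list

# A length-based comparison does notice the partial failure.
failed = count_failed(done_list, 200)
```

Comparing `len(done_list)` with the expected model count is one way to detect a partial failure; whether that should stop the run or only warn is a separate decision.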

How do we want to fix this bug?
It probably makes sense to analyze the successfully finished jobs and pick 
models for refinement from that set if the set is large enough.
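
A possible shape for that logic, as a sketch only: keep the IDs that finished and continue if enough of them remain. The function name `select_annealed_models` and the `min_fraction` threshold are assumptions for illustration, not existing CING names or parameters.

```python
def select_annealed_models(done_list, model_count, min_fraction=0.9):
    """Return the sorted IDs of successfully annealed models.

    Returns None when fewer than min_fraction of model_count finished.
    min_fraction is a hypothetical threshold, not a CING parameter.
    """
    if model_count <= 0:
        return None
    done = sorted(set(done_list))  # drop duplicates, keep a stable order
    if len(done) < min_fraction * model_count:
        return None
    return done
```

The returned list, rather than the full list of requested IDs, would then be what is handed on to the analysis step.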

The log files for models 72, 77, 93, 110 and 170 (the ones that failed, e.g. 
2lqo.cing/Refine/2lqo_redo/Jobs/anneal_72.log) say:
****&&&& rerun job with smaller timestep (i.e., 0.003)
Do we have something that can use these Xplor messages?
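
If nothing like that exists yet, a small parser could pick up the hint. This is a sketch that assumes the message always has the form quoted above; `suggested_timestep` is a hypothetical helper, not part of CING or Xplor.

```python
import re

# Matches the Xplor hint quoted above; the group captures the suggested timestep.
RERUN_HINT = re.compile(
    r"rerun job with smaller timestep \(i\.e\., ([0-9]*\.?[0-9]+)\)"
)


def suggested_timestep(log_text):
    """Return the timestep suggested in log_text as a float, or None if absent."""
    match = RERUN_HINT.search(log_text)
    if match is None:
        return None
    return float(match.group(1))
```

Calling it on the contents of a failed job's log would return the suggested value, which could be reported or used to resubmit that job.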

Original issue reported on code.google.com by WGTouw on 28 Nov 2012 at 9:28

GoogleCodeExporter commented 9 years ago
In general we will want to continue with an incomplete list of models. It is 
quite acceptable to have a few mishaps. Can you fix the logic?

Usually, the input data is not good enough when that last message shows up. 
Conflicting restraints/topology make the MD unstable and the molecule 
explodes. 

Original comment by jurge...@gmail.com on 28 Nov 2012 at 1:27

GoogleCodeExporter commented 9 years ago
I have the unfinished directory of 2lqo in $D/tmp/.../unfinished. Can I use 
CING/ipython or something to load everything again, continue and test if code 
modifications work out? Or do you have other suggestions to make testing easier 
(i.e. without having to do the entire run again and anneal 200 models again)?

In refine.py:def analyze I see:

    inPath     = config.directories.converted
    modelList  = asci2list(parameters.models)
    if getDeepByKeysOrAttributes( parameters, USE_ANNEALED_STR):
        inPath     = config.directories.annealed
        modelList  = asci2list(parameters.modelsAnneal)

USE_ANNEALED_STR is set to True in fullAnneal just before calling analyze. I 
cannot figure out how this string is processed in analyze. We get into the if 
statement above, right? If so, what list does modelsAnneal contain? Can we set 
modelsAnneal to the list of successfully annealed IDs in def anneal? Or 
should this already be the case (in other words, what is the difference at this 
point between parameters.models and parameters.modelsAnneal)?

Original comment by WGTouw on 28 Nov 2012 at 2:20

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r1196.

Original comment by WGTouw on 29 Nov 2012 at 9:09