GhostSteven opened 5 years ago
Sorry for the ambiguity -- training won't stop automatically. After manually stopping training, I looked at the validation performance of each of the checkpoints and it leveled off at model.ckpt-140000
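To make "leveled off" concrete, here is a minimal sketch of how one might pick the checkpoint where validation performance plateaus. The accuracy numbers and the `tol` threshold are made up for illustration; in practice you would collect them by running the validation script on each saved checkpoint.

```python
# Hypothetical validation top-1 accuracies per checkpoint step,
# gathered manually by evaluating each saved checkpoint.
val_acc = {
    100000: 0.852,
    120000: 0.856,
    140000: 0.858,
    160000: 0.858,
    180000: 0.857,
}

def pick_checkpoint(scores, tol=1e-3):
    """Return the earliest checkpoint whose accuracy is within `tol`
    of the best observed accuracy, i.e. where performance levels off."""
    best = max(scores.values())
    for step in sorted(scores):
        if best - scores[step] <= tol:
            return step

print(pick_checkpoint(val_acc))
```

With the made-up numbers above, the earliest near-best checkpoint is step 140000, matching the selection described in the comment.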
Thank you very much. I've already repeated the whole process, and now I have two other questions:
First, when I ran validation of the final model in rank_diff_wln, i.e., ran nntest_direct_useScores.py, it could not validate all 30,000 examples: it got stuck at about 29,900, so I had to close the terminal. I then found that the last line of the generated file valid.cbond_detailed_2400000 (the others are the same) is incomplete, e.g. `| 2.0-4.0- `. I don't know why this happened or how to resolve it.
My other question: after running the final test of the model, you can view the prediction results, whether right or wrong, on the website served by Django, but how can you tell whether a wrong prediction is a "near-miss" or a "complete-miss"?
I haven't encountered that issue where the last set of examples cannot be validated and the file is incomplete. Is there any chance the process was killed externally?
The "near-miss" and "complete-miss" terms describe, respectively, whether the recorded product was proposed as the second highest-ranked candidate or was not found in the top 5. That determination comes from analyzing the results of running eval_by_smiles.py on the predicted bond edits to perform a final comparison of SMILES strings. This script writes a detailed output like the one in rexgen_direct/rank_diff_wln/model-core16-500-3-max150-direct-useScores/test.cbond_detailed_2400000.eval_by_smiles, which lists the top 10 predicted SMILES for each example and the rank at which the recorded product is found (if it is found in the top 5; otherwise, it will be 11).
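Given per-example ranks in that form, classifying each example is a simple bucketing step. The sketch below assumes you have already extracted the rank integers from the eval output (the exact file format is not reproduced here); the rank list is made up for illustration.

```python
from collections import Counter

def classify(rank):
    """Bucket an example by the rank at which the recorded product
    appears among the predictions (11 means not found in the top 5).
    Follows the convention described above: 'near-miss' means the
    recorded product was the second-ranked candidate."""
    if rank == 1:
        return "correct"
    if rank == 2:
        return "near-miss"
    if rank > 5:
        return "complete-miss"
    return "top-5"   # found at rank 3-5

# Hypothetical ranks extracted from an eval_by_smiles output file.
ranks = [1, 1, 2, 3, 11, 1, 2, 11]
print(Counter(classify(r) for r in ranks))
```

Summing the buckets over the whole test set gives the near-miss / complete-miss breakdown without needing the web interface.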
Hi, I'm trying to retrain the core_wln_global model, exactly following your notes:

python nntrain_direct.py --train ../data/train.txt.proc --hidden 300 --depth 3 --save_dir model-300-3-direct | tee model-300-3-direct/log.txt

It has already generated model.ckpt-220000 over the last 25 hours. But in the paper you said there are only 140,000 minibatches and training should take 19 hours. Is something wrong with my process? Or did you just mean that reaching 140,000 minibatches takes 19 hours, and I should stop training manually after model.ckpt-140000 is generated?
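Since training does not stop on its own, one practical approach is to poll the save directory and stop the run manually once the checkpoint step passes 140,000. This sketch assumes the usual TensorFlow checkpoint naming (`model.ckpt-<step>.index` files in the save directory), which matches the filenames mentioned in this thread.

```python
import glob
import os
import re

def latest_step(save_dir):
    """Return the largest checkpoint step found in save_dir,
    or None if no model.ckpt-<step>.index files exist yet."""
    steps = []
    for path in glob.glob(os.path.join(save_dir, "model.ckpt-*.index")):
        m = re.search(r"model\.ckpt-(\d+)\.index$", path)
        if m:
            steps.append(int(m.group(1)))
    return max(steps) if steps else None

# Example: check whether the run has passed the 140k-minibatch mark.
step = latest_step("model-300-3-direct")
if step is not None and step >= 140000:
    print("reached %d minibatches; safe to stop training" % step)
```

After stopping, you would evaluate the saved checkpoints around step 140,000 and keep the one with the best validation performance, as described earlier in the thread.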