Open rajeshdparmar opened 13 years ago
It looks to me that there must have been no structure in the reaction that was being checked by reactionPrunableQ. This occurred when iterating over PathReactions from a PDepNetwork, when iterating over all Networks, in order to find which PathReactions to prune. Why could a PathReaction not have a structure? And why did nine path reactions in earlier PDepNetworks have reverse structures not in any ReactionTemplate dictionaries?
@rajeshdparmar, do you still have the log file and pruning folder? If so, please could you point us to the right folder on the server? (And if not, please run it again and keep the results.)
Please find the log file in the following directory on the server rajesh@pharos:~/Rajesh/New_jobs/MultiT_pdep_Prun_15000_single_concentration
Thanks Rajesh - I forgot to look in your New_jobs folder.
The reaction that gave the warning "Removed path reverse SPC(35105)=SPC(35107) from 0 dictionaries" is an intra_H_migration reaction, and it was created in both directions:
RMG.log:Created new intra_H_migration reaction: C14H23J(35105) --> C14H23J(35107)
RMG.log:Created new intra_H_migration reaction: C14H23J(35107) --> C14H23J(35105)
I guess something strange happened, like we removed one direction and its reverse, then tried to remove the other direction and its reverse. I'm not sure what happens with reverse reactions of families that are their own reverse. Perhaps @jwallen or @mrharper have a clue.
Another job also failed with a similar error. The folder for that job is: rajesh@pharos:~/Rajesh/New_jobs/MultiT_pdep_Prun_15000$
The MultiT_pdep_Prun_15000 job ended:
Pruning...
Removed path reverse SPC(8815)=SPC(9089) from 0 dictionaries
Removed path reverse SPC(17408)=SPC(17818) from 0 dictionaries
Removed path reverse SPC(13368)=SPC(21650) from 0 dictionaries
Removed path reverse SPC(13368)=SPC(21647) from 0 dictionaries
ERROR: java.lang.NullPointerException
at jing.rxn.Reaction.getReactants(Reaction.java:1091)
at jing.rxnSys.ReactionModelGenerator.reactionPrunableQ(ReactionModelGenerator.java:4223)
at jing.rxnSys.ReactionModelGenerator.pruneReactionModel(ReactionModelGenerator.java:4111)
at jing.rxnSys.ReactionModelGenerator.modelGeneration(ReactionModelGenerator.java:1458)
at RMG.main(RMG.java:96)
Looking at the last of these, (13368) has not been selected for pruning but (21647) has.
Again, the "Removed path reverse <reaction> from 0 dictionaries" warnings all come from intra_H_migration, thus both forward and reverse are in the same reaction template dictionary. How many copies of each we have in various lists, in either direction, I'm not sure.
The reactions A->B and B->A are both created as intra_H_migration reactions. B->A is in a pdep network for B and A->B is in a separate pdep network for A.
First up for pruning is A->B.
ReactionModelGenerator.java lines 4156-4161 remove both A->B and its reverse from the intra_H_migration dictionary, then lines 4162 and 4163 call .prune() on the forward and reverse reactions; this sets the structures equal to null.
Then it is B->A's turn for pruning. Somehow in line 4147 the reaction.getStructure() returns a valid forward structure (the reverse of A->B, which got wiped, was a separate instance of the same B->A reaction; we never checked the global list of reactions for this type of duplicate). The reverse structure is found/generated from this, then both are removed from the intra_H_migration dictionary. The forward is found and removed (how? shouldn't it have been removed earlier?), but the reverse structure (A->B) is already missing because it was removed above. This gives a "Removed path reverse %s from 0 dictionaries" warning.
Or, to get the NullPointerException...
When trying to decide if B->A can be pruned, we check reactionPrunableQ and find that it has no structure, and hence die. This would be because the .prune() call was on the same instance of the reaction as the one we've already pruned. (But why is the same reaction in two pdep networks?)
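The two failure modes described above can be reproduced in a small, self-contained sketch. Everything in it (PruneSketch, Rxn, removeFromDictionaries) is a hypothetical stand-in written for illustration, not the real RMG-Java code; it assumes only what the discussion states, namely that prune() sets the structure to null and that removal is counted per dictionary.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; these are NOT the real RMG-Java classes.
public class PruneSketch {

    static class Rxn {
        String structure;                       // becomes null once pruned
        Rxn(String structure) { this.structure = structure; }
        void prune() { structure = null; }
        // Dereferences the structure, so it throws NullPointerException after prune().
        int reactantCount() { return structure.split("->")[0].split("\\+").length; }
    }

    // Remove a structure from every dictionary that holds it; return how many held it.
    static int removeFromDictionaries(List<Set<String>> dictionaries, String structure) {
        int count = 0;
        for (Set<String> d : dictionaries) {
            if (d.remove(structure)) count++;
        }
        return count;
    }

    // Scenario 1: both directions were already removed when A->B was pruned,
    // so a later attempt to remove the reverse of B->A finds nothing.
    static int secondRemovalCount() {
        Set<String> intraH = new HashSet<>();
        intraH.add("A->B");
        intraH.add("B->A");
        List<Set<String>> dicts = new ArrayList<>();
        dicts.add(intraH);
        removeFromDictionaries(dicts, "A->B");          // pruning A->B: forward
        removeFromDictionaries(dicts, "B->A");          // pruning A->B: reverse
        return removeFromDictionaries(dicts, "A->B");   // pruning B->A: its reverse again
    }

    // Scenario 2: the same instance sits in two networks; pruning it through one
    // network leaves the other holding a reaction with a null structure.
    static boolean sharedInstanceThrows() {
        Rxn shared = new Rxn("B->A");
        List<Rxn> networkA = new ArrayList<>();
        List<Rxn> networkB = new ArrayList<>();
        networkA.add(shared);
        networkB.add(shared);
        networkA.get(0).prune();
        try {
            networkB.get(0).reactantCount();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Removed path reverse from " + secondRemovalCount() + " dictionaries");
        System.out.println("Shared instance throws NPE: " + sharedInstanceThrows());
    }
}
```

If the real code behaves like this sketch, the warning and the crash would be two symptoms of the same double handling, differing only in whether the second network holds a separate instance or the same one.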
Going back to @rwest's question on the following:
RMG.log:Created new intra_H_migration reaction: C14H23J(35105) --> C14H23J(35107)
RMG.log:Created new intra_H_migration reaction: C14H23J(35107) --> C14H23J(35105)
These lines appear consecutively in the RMG.log file. The reason this appears in both directions is because the first structure is in the "reverse" direction (based on thermodynamics, since this reaction template is its own reverse). (Looking at the makeTemplateReaction() function in TemplateReaction class) The first reaction is made (and we print "Created new ..."). Since the structure as written is in the reverse direction:
Long story short, I do not think we have duplicate instances of any intra_H_migration reactions in the template library. We only get both directions printed to screen because the reverse structure was made first. If the forward was made first, we would only get one "Created new reaction" printed. MRH confirmed this by starting with only 1-butyl in an RMG simulation (only one print statement) and only 2-butyl in an RMG simulation (two print statements).
A third job also failed with a similar error. The folder for that job is: rajesh@pharos:~/Rajesh/New_jobs/MultiT_pdep_Prun_15000_Reduced$
Pruning...
Removed path reverse SPC(45087)=SPC(77812) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(52449) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(77813) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(77801) from 0 dictionaries
Removed path reverse SPC(48364)=SPC(52432) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(45096) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(52447) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(52445) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(77799) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(77818) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(45088) from 0 dictionaries
Removed path reverse SPC(48364)=SPC(52442) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(45092) from 0 dictionaries
Removed path reverse SPC(48364)=SPC(52447) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(52450) from 0 dictionaries
Removed path reverse SPC(48364)=SPC(52458) from 0 dictionaries
Removed path reverse SPC(48364)=SPC(52441) from 0 dictionaries
Removed path reverse SPC(45087)=SPC(77802) from 0 dictionaries
ERROR: java.lang.NullPointerException
at jing.rxn.Reaction.getReactants(Reaction.java:1091)
at jing.rxnSys.ReactionModelGenerator.reactionPrunableQ(ReactionModelGenerator.java:4223)
at jing.rxnSys.ReactionModelGenerator.pruneReactionModel(ReactionModelGenerator.java:4111)
at jing.rxnSys.ReactionModelGenerator.modelGeneration(ReactionModelGenerator.java:1458)
at RMG.main(RMG.java:96)
Exception in thread "main" java.lang.NullPointerException
at jing.rxnSys.Logger.log(Logger.java:160)
at jing.rxnSys.Logger.critical(Logger.java:204)
at RMG.main(RMG.java:106)
Ok, so according to @mrharper's last comment, my assumption at the top of my previous comment is slightly wrong. Perhaps we make the same path reaction twice in different PDepNetworks? The second or third time we make it, we don't print the "Created new reaction" line.
Another example from @rajeshdparmar (listed as issue #188)
Folder ~/Rajesh/New_jobs/MultiT_PM3_pdep_Prun_15000
Error
Pruning...
ERROR: java.lang.NullPointerException
at jing.rxn.Reaction.getReactants(Reaction.java:1091)
at jing.rxnSys.ReactionModelGenerator.reactionPrunableQ(ReactionModelGenerator.java:4223)
at jing.rxnSys.ReactionModelGenerator.pruneReactionModel(ReactionModelGenerator.java:4111)
at jing.rxnSys.ReactionModelGenerator.modelGeneration(ReactionModelGenerator.java:1458)
at RMG.main(RMG.java:96)
Exception in thread "main" java.lang.NullPointerException
at jing.rxnSys.Logger.log(Logger.java:160)
at jing.rxnSys.Logger.critical(Logger.java:204)
at RMG.main(RMG.java:106)
Another job also failed. Folder: ~/Rajesh/New_jobs/MultiT_PM3_pdep_Prun_15000_single_concentration$
Error
Pruning...
Removed path reverse SPC(6922)=SPC(41457) from 0 dictionaries
Removed path reverse SPC(6922)=SPC(41459) from 0 dictionaries
Removed path reverse SPC(6922)=SPC(41461) from 0 dictionaries
ERROR: java.lang.NullPointerException
at jing.rxn.Reaction.getReactants(Reaction.java:1091)
at jing.rxnSys.ReactionModelGenerator.reactionPrunableQ(ReactionModelGenerator.java:4223)
at jing.rxnSys.ReactionModelGenerator.pruneReactionModel(ReactionModelGenerator.java:4111)
at jing.rxnSys.ReactionModelGenerator.modelGeneration(ReactionModelGenerator.java:1458)
at RMG.main(RMG.java:96)
Exception in thread "main" java.lang.NullPointerException
at jing.rxnSys.Logger.log(Logger.java:160)
at jing.rxnSys.Logger.critical(Logger.java:204)
at RMG.main(RMG.java:106)
We seem to be stuck. Perhaps we could just wrap this in a try/catch block and carry on? @gmagoon, do you have any thoughts?
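For reference, the "wrap it in a try/catch and carry on" idea might look roughly like the sketch below. The names (GuardedPruneSketch, PathRxn, selectPrunable) are hypothetical and do not correspond to the real RMG-Java classes; the prunability check is a placeholder that simply dereferences the structure, which is assumed to be null for an already-pruned reaction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: PathRxn and selectPrunable are illustrative names,
// not the real RMG-Java classes or methods.
public class GuardedPruneSketch {

    static class PathRxn {
        final String label;
        final String structure;   // null models a reaction that was already pruned
        PathRxn(String label, String structure) {
            this.label = label;
            this.structure = structure;
        }
        // Placeholder check; throws NullPointerException if structure is null.
        boolean prunable() { return structure.startsWith("edge"); }
    }

    // Adds prunable reactions to toPrune and returns the labels of reactions
    // that were skipped because the check threw.
    static List<String> selectPrunable(List<PathRxn> pathReactions, List<PathRxn> toPrune) {
        List<String> skipped = new ArrayList<>();
        for (PathRxn r : pathReactions) {
            try {
                if (r.prunable()) toPrune.add(r);
            } catch (NullPointerException e) {
                // Carry on: record the offender instead of aborting the whole run.
                skipped.add(r.label);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<PathRxn> toPrune = new ArrayList<>();
        List<String> skipped = selectPrunable(Arrays.asList(
                new PathRxn("r1", "edge: A->B"),
                new PathRxn("r2", null),
                new PathRxn("r3", "core: C->D")), toPrune);
        System.out.println("to prune: " + toPrune.size() + ", skipped: " + skipped);
    }
}
```

A guard like this would keep the job running, but it only hides the symptom; recording which reactions were skipped is what would help track down why the structure is missing.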
I haven't been following this too closely, but it sounds like maybe some additional debugging lines would be helpful?
I've added a number of debugging lines in ed86222a60bb61eb99420df1c29a3d8056486a4b . If no-one sees any issue with these debugging lines or has anything else to add, I suggest that @rajeshdparmar try to rerun the problematic case(s) with this version.
I've updated the debugging a bit. I suggest @rajeshdparmar now try it again, and let us know how it works out. We should check the logs for debugging messages, but I expect (hope) that the jobs will now at least continue.
After updating on June 28, I am getting the same error when PM3 and pdep are ON. Two of my jobs failed due to the same error.
Folder location
~/Rajesh/New_jobs/New_jobs_after_June_28/MultiT_PM3_pdep_Prun_1000_single_concentration$
~/Rajesh/New_jobs/New_jobs_after_June_28/MultiT_PM3_pdep_Prun_500_single_concentration$
The exception that caused these latest crashes is coming from a different place. It is now in:
ERROR: java.lang.NullPointerException
at jing.rxn.Reaction.getDirection(Reaction.java:924)
at jing.rxnSys.ReactionModelGenerator.writePDepNetworks(ReactionModelGenerator.java:2794)
at jing.rxnSys.ReactionModelGenerator.modelGeneration(ReactionModelGenerator.java:1565)
at RMG.main(RMG.java:96)
However, it does occur soon after @gmagoon's DEBUGGING LINES from ed86222a60bb61eb99420df1c29a3d8056486a4b
tail -n1000 /home/rajesh/Rajesh/New_jobs/New_jobs_after_June_28/MultiT_PM3_pdep_Prun_500_single_concentration/RMG.log
Removed path reverse SPC(5981)=SPC(35759) from 0 dictionaries
Removed path reverse SPC(5981)=SPC(35757) from 0 dictionaries
Removed path reverse SPC(3980)=SPC(3982) from 0 dictionaries
Removed path reverse SPC(3980)=SPC(3984) from 0 dictionaries
CRITICAL: ******DEBUGGING LINES FOLLOW******
ERROR: NullPointerException when inspecting Path Reaction
ERROR: java.lang.NullPointerException
at jing.rxnSys.Logger.log(Logger.java:160)
at jing.rxnSys.Logger.critical(Logger.java:204)
at jing.rxn.Reaction.getReactants(Reaction.java:1097)
at jing.rxnSys.ReactionModelGenerator.reactionPrunableQ(ReactionModelGenerator.java:4236)
at jing.rxnSys.ReactionModelGenerator.pruneReactionModel(ReactionModelGenerator.java:4112)
at jing.rxnSys.ReactionModelGenerator.modelGeneration(ReactionModelGenerator.java:1458)
at RMG.main(RMG.java:96)
ERROR: Path reaction will not be pruned. Here is the network:
ERROR: PDepNetwork #8271:
Isomers:
C14H27J(3977) (included =true)
C14H27J(3989) (included =false)
C14H27J(3987) (included =false)
C14H27J(3983) (included =false)
C14H27J(5985) (included =false)
C14H27J(5984) (included =false)
C14H27J(5987) (included =false)
C14H27J(4767) (included =false)
C14H27J(5986) (included =false)
C14H27J(5981) (included =false)
C14H26(13932) + H(29) (included =true)
Path reactions:
C14H26(13932) + H(29) (included =true) --> C14H27J(3977) (included =true)
C14H27J(3989) (included =false) --> C14H27J(3977) (included =true)
C14H27J(3987) (included =false) --> C14H27J(3977) (included =true)
C14H27J(3977) (included =true) --> C14H27J(3983) (included =false)
C14H27J(5985) (included =false) --> C14H27J(3977) (included =true)
C14H27J(5984) (included =false) --> C14H27J(3977) (included =true)
C14H27J(5987) (included =false) --> C14H27J(3977) (included =true)
C14H27J(4767) (included =false) --> C14H27J(3977) (included =true)
C14H27J(5986) (included =false) --> C14H27J(3977) (included =true)
C14H27J(5981) (included =false) --> C14H27J(3977) (included =true)
Net reactions:
C14H26(13932) + H(29) (included =true) <=> C14H27J(3977) (included =true)
Nonincluded reactions:
C14H27J(3977) (included =true) --> C14H27J(3989) (included =false)
C14H26(13932) + H(29) (included =true) --> C14H27J(3989) (included =false)
C14H27J(3977) (included =true) --> C14H27J(3987) (included =false)
C14H26(13932) + H(29) (included =true) --> C14H27J(3987) (included =false)
C14H27J(3977) (included =true) --> C14H27J(3983) (included =false)
C14H26(13932) + H(29) (included =true) --> C14H27J(3983) (included =false)
C14H27J(3977) (included =true) --> C14H27J(5985) (included =false)
C14H26(13932) + H(29) (included =true) --> C14H27J(5985) (included =false)
C14H27J(3977) (included =true) --> C14H27J(5984) (included =false)
C14H26(13932) + H(29) (included =true) --> C14H27J(5984) (included =false)
C14H27J(3977) (included =true) --> C14H27J(5987) (included =false)
C14H26(13932) + H(29) (included =true) --> C14H27J(5987) (included =false)
C14H27J(3977) (included =true) --> C14H27J(4767) (included =false)
C14H26(13932) + H(29) (included =true) --> C14H27J(4767) (included =false)
C14H27J(3977) (included =true) --> C14H27J(5986) (included =false)
C14H26(13932) + H(29) (included =true) --> C14H27J(5986) (included =false)
C14H27J(3977) (included =true) --> C14H27J(5981) (included =false)
C14H26(13932) + H(29) (included =true) --> C14H27J(5981) (included =false)
Number of species pruned: 1342
Memory used before pruning: 187.90 MB
Memory used after pruning: 165.70 MB
Memory recovered by pruning: 22.21 MB
I'm a little confused here...as far as I can tell this also includes Richard's modifications to the debugging lines from 6ab3c6f, which should carry on rather than crash?
Yes, it does carry on. It crashes a little later with a different NPE
Richard
OK thanks...I'm still trying to figure out why it doesn't seem to print any of the info from my debugging lines.
@rajeshdparmar, could you pull in the latest changes in d9b5466ef761fb13778509b087fac73b1d68476b and retry? I've added some additional debugging lines in addition to flushing the logger...an unflushed logger is the only thing I can think of for why my debugging lines weren't printed
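To illustrate why an unflushed logger could swallow debugging lines, here is a minimal sketch. It assumes the logger writes through a buffered writer, which may or may not match how the RMG-Java Logger is actually implemented; FlushSketch and its method are hypothetical names.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: assumes log output goes through a BufferedWriter.
public class FlushSketch {

    // Returns { bytes on disk before flush(), bytes on disk after flush() }.
    static long[] sizesBeforeAndAfterFlush() {
        try {
            File f = File.createTempFile("flush-sketch", ".log");
            f.deleteOnExit();
            try (BufferedWriter w = new BufferedWriter(new FileWriter(f))) {
                w.write("CRITICAL: debugging line\n");
                long before = f.length();   // text is still held in the writer's buffer
                w.flush();
                long after = f.length();    // text has now reached the file
                return new long[] { before, after };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = sizesBeforeAndAfterFlush();
        System.out.println("before flush: " + sizes[0] + " bytes, after flush: " + sizes[1] + " bytes");
    }
}
```

If the process dies between the write and the flush, short messages written this way never reach the file, which would be consistent with the missing debugging lines.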
What if e.getStackTrace().toString() raised an exception? Wouldn't the catch block terminate (before printing your lines) and then my bit would catch the exception and print my lines? That would lead to this behaviour?
Does the e.getStackTrace().toString() method even exist or work? (see http://stackoverflow.com/questions/1149703/stacktrace-to-string-in-java) Perhaps better to use Logger.logStackTrace(e); ?
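On the question of whether e.getStackTrace().toString() works: in standard Java, getStackTrace() returns a StackTraceElement[] array, so toString() does exist and does not throw, but it yields only the array's type and identity hash rather than the trace. The sketch below shows that, along with one common way to get the full trace as a String using only java.io; StackTraceSketch is a hypothetical class name, and this says nothing about what Logger.logStackTrace does internally.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Illustrative sketch using only standard library calls.
public class StackTraceSketch {

    // Calls toString() on the StackTraceElement[] array itself:
    // the result is the array's type name plus identity hash, not the trace.
    static String arrayToString(Throwable e) {
        return e.getStackTrace().toString();
    }

    // Captures the output of printStackTrace() into a String.
    static String stackTraceToString(Throwable e) {
        StringWriter sw = new StringWriter();
        PrintWriter pw = new PrintWriter(sw);
        e.printStackTrace(pw);
        pw.flush();
        return sw.toString();
    }

    public static void main(String[] args) {
        Throwable e = new NullPointerException("demo");
        System.out.println(arrayToString(e));
        System.out.println(stackTraceToString(e));
    }
}
```

So a log call built on the array's toString() would print an uninformative line rather than fail, which is a reason to prefer a helper that prints the whole trace.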
Thanks @rwest. @rajeshdparmar, try d2fe67c037ea147c97935a9eb8bb265e030bc479 instead.
This (or more specifically, issue #246) just occurred again, after 50 hours. The model core has 12978 reactions and 244 species. The model edge has 290179 reactions and 21337 species.
Writing Restart Core Species
Writing Restart Core Reactions
Writing Restart Edge Species
Writing Restart Edge Reactions
ERROR: java.lang.NullPointerException
at jing.rxn.Reaction.getDirection(Reaction.java:924)
at jing.rxnSys.ReactionModelGenerator.writePDepNetworks(ReactionModelGenerator.java:2838)
at jing.rxnSys.ReactionModelGenerator.modelGeneration(ReactionModelGenerator.java:1588)
at RMG.main(RMG.java:96)
RMG execution terminated at 2012-07-05 18:34:20
One of the jobs (out of 6 submitted: 3 with PM3 ON and 3 with PM3 OFF) has given me the following error. The condition file and error message are as follows. (Note: this is a case with PM3 OFF but pdep ON.)
Error Message
condition file