radical-cybertools / radical.repex.at

This is the github location for RepEx developed by the RADICAL team in conjunction with the York Lab.

TSU stuck in the first T exchange step #71

Closed haoyuanchen closed 8 years ago

haoyuanchen commented 8 years ago

I tried a TSU run with 32 replicas on Stampede, but it got stuck in the first T-exchange step. The STDOUT file in the T-exchange unit folder keeps outputting

Waiting for replica: 0

but I've checked that all replicas successfully finished the first MD step.

Also, all the STDERR files say

Resetting modules to system default
Lmod has detected the following error: These module(s) exist but cannot be loaded as requested: "amber"

Try: "module spider amber" to see how to load the module(s).

Since the MD step finished normally with the same STDERR output, this is probably not what causes the hang.

Version: 146b57b659a737990716e3c40aaca6940a44f371 (devel branch)

Thanks!

haoyuanchen commented 8 years ago

I've tried using another example and it works normally...will check the example.

antonst commented 8 years ago

Can you please specify which examples you have used?

haoyuanchen commented 8 years ago

It's the dna_gold example. I believe that all necessary files are in tuu_remd_inputs directory. The running command is:

nohup repex-amber --input='tsu_remd_dna_gold.json' --rconfig='stampede.json' 2 > foo.log &

For other TSU examples there's no such problem.

antonst commented 8 years ago

Thank you, I will try to replicate this and see what the issue is.

antonst commented 8 years ago

Well, Amber is currently not available as a module on Stampede.

antonst commented 8 years ago

There is some module loading/reloading needed to get this working properly. Will update you soon. Thanks!

antonst commented 8 years ago

The issue is actually not the module load but the format of the .RST file. By default, TSU expects something like:

 umbrella sampling restraints on phi/psi torsions
 &rst iat=17,15,9,7 r1=@val1l@ r2=@val1@ r3=@val1@ r4=@val1h@ rk2=65.656127 rk3=65.656127 /

but dna_gold_us.RST is:

us only on first rst, shift dihedral restraints to positive 
 &rst
  iat=1646,1647
  r1=0 , r2=@val1@ , r3=@val1@ , r4=200 ,
  rk2=0.1, rk3=0.1,
 /
 &rst
  iat=1646,1534
  r1=0 , r2=16.5 , r3=16.5 , r4=50 ,
  rk2=50.0, rk3=50.0,
 /
 &rst
  iat=1646,1534,1533
  r1=0 , r2=164.74, r3=164.74, r4=180 ,
  rk2=50.0, rk3=50.0,
 /
 &rst
  iat=1646,1534,1533,1531
  r1=274.33 , r2=364.33 , r3=364.33, r4=454.33 ,
  rk2=50.0, rk3=50.0,
 /
 &rst
  iat=1646,1534,1533,1538
  r1=92.67 , r2=182.67, r3=182.67, r4=272.67 ,
  rk2=50.0, rk3=50.0,
 /
 &rst
  iat=1647,525
  r1=0 , r2=16.65 , r3=16.65 , r4=50 ,
  rk2=50.0, rk3=50.0,
 /
 &rst
  iat=1647,525, 524
  r1=0 , r2=159.73, r3=159.73, r4=180 ,
  rk2=50.0, rk3=50.0,
 /
 &rst
  iat=1647,525,524,522
  r1=-28.51 , r2=61.49, r3=61.49 , r4=151.49 ,
  rk2=50.0, rk3=50.0,
 /
 &rst
  iat=1647,525,524,529
  r1=145.43 , r2=235.43, r3=235.43, r4=325.43 ,
  rk2=50.0, rk3=50.0,
 /

Previously, the restraint value we used during exchange was the one given by 'r2=', e.g. @val1@. I don't know what should be used now.
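To illustrate the difference between the two layouts, here is a minimal sketch of reading the 'r2=' value from the first &rst entry of a restraint file whose placeholders have already been filled in, whether that entry is written on one line or spread over several. This is not the RepEx code: the function name first_restraint_r2 is hypothetical, and the sketch assumes each entry starts with "&rst" and ends with "/".

```python
import re


def first_restraint_r2(rst_text):
    """Return the numeric r2 value of the first &rst entry (hypothetical helper)."""
    # Take everything between the first '&rst' and the '/' that closes it.
    # DOTALL lets the entry span several lines.
    block = re.search(r"&rst(.*?)/", rst_text, flags=re.DOTALL | re.IGNORECASE)
    if block is None:
        raise ValueError("no &rst entry found")
    # r2 may be followed by spaces, a comma, or a newline; \b keeps 'rk2' from matching.
    match = re.search(r"\br2\s*=\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)", block.group(1))
    if match is None:
        raise ValueError("first &rst entry has no numeric r2")
    return float(match.group(1))


single_line = (
    " umbrella sampling restraints on phi/psi torsions\n"
    " &rst iat=17,15,9,7 r1=0.0 r2=60.0 r3=60.0 r4=120.0 rk2=65.656127 rk3=65.656127 /\n"
)
multi_line = (
    "us only on first rst\n"
    " &rst\n"
    "  iat=1646,1647\n"
    "  r1=0 , r2=12.5 , r3=12.5 , r4=200 ,\n"
    "  rk2=0.1, rk3=0.1,\n"
    " /\n"
    " &rst\n"
    "  iat=1646,1534\n"
    "  r1=0 , r2=16.5 , r3=16.5 , r4=50 ,\n"
    "  rk2=50.0, rk3=50.0,\n"
    " /\n"
)

print(first_restraint_r2(single_line))
print(first_restraint_r2(multi_line))
```

The numeric values 60.0 and 12.5 in the sample strings are made up for the illustration; only the layout mirrors the files quoted above.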

haoyuanchen commented 8 years ago

This system needs more than one restraint, although we're only doing umbrella sampling on the first one. That's why the RST file has more entries, but only the first restraint has @val1@ placeholders.
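As a rough sketch of what this means for template handling (an illustration only, not how RepEx does it), plain string substitution of the @val1@ placeholder changes the first entry and leaves the fixed restraints as they are. The helper name fill_placeholders and the value 12.5 are assumptions made for the example.

```python
def fill_placeholders(template, values):
    """Replace each @name@ placeholder with its value (hypothetical helper)."""
    filled = template
    for name, value in values.items():
        filled = filled.replace("@%s@" % name, str(value))
    return filled


template = (
    " &rst\n"
    "  iat=1646,1647\n"
    "  r1=0 , r2=@val1@ , r3=@val1@ , r4=200 ,\n"
    "  rk2=0.1, rk3=0.1,\n"
    " /\n"
    " &rst\n"
    "  iat=1646,1534\n"
    "  r1=0 , r2=16.5 , r3=16.5 , r4=50 ,\n"
    "  rk2=50.0, rk3=50.0,\n"
    " /\n"
)

# 12.5 is an arbitrary umbrella center chosen for the example.
filled = fill_placeholders(template, {"val1": 12.5})
print(filled)
```

Entries without placeholders pass through unchanged, so the same substitution works for templates with one restraint or many.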

antonst commented 8 years ago

Fixed now in devel. Please try :-)

marksantcroos commented 8 years ago

nohup repex-amber --input='tsu_remd_dna_gold.json' --rconfig='stampede.json' 2 > foo.log &

Where is this from if I may ask?

haoyuanchen commented 8 years ago

@marksantcroos I just modified it from the existing run.sh file.

marksantcroos commented 8 years ago

Ah, ok, thanks. I thought that was a command line of the execution :)

haoyuanchen commented 8 years ago

@antonst It's good now. Thanks!