glennhickey / progressiveCactus

Distribution package for the Progressive Cactus multiple genome aligner. Dependencies are linked as submodules.

Example goes idle and does not progress #39

Open l-schrader opened 9 years ago

l-schrader commented 9 years ago

Hi, I am currently struggling to get progressiveCactus to run. I managed to install it on my Torque/PBS cluster after manually installing the argparse and importlib modules with easy_install. I then try to run the example with:

source ~/software/progressiveCactus/environment
cd ~/software/progressivecactus
bin/runProgressiveCactus.sh examples/blanchette00.txt ./work ./work/b00.hal

progressiveCactus starts nicely without complaints.

However, the program does not make any progress, and I don't see any related processes doing any work. The log file contains only a single line:

Beginning the alignment.

I reran the example in debug mode and here’s what the detailed logfile contained:


Logging to file: cac1.log
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/HUMAN
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/CHIMP
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/BABOON
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/MOUSE
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/RAT
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/DOG
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/CAT
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/PIG
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/COW
Running the command: rm -rf ./work/jobTree
Running the command: mkdir ./work/sequenceData
Running the command: cactus_createMultiCactusProject.py "/data/home/user010/software/progressiveCactus/work/expTemplate.xml" "./work/progressiveAlignment" --fixNames=0
Running the command: rm -f ./work/cactus.log
Running the command: . bin/../src/../environment && cactus_progressive.py --jobTree ./work/jobTree --maxThreads "2" ./work/progressiveAlignment/progressiveAlignment_project.xml --overwrite >> ./work/cactus.log 2>&1

From here on I don't see any progress and the script just hangs. Eventually I have to kill the process; the longest I've kept it running was two days.

In addition, I see some idle processes, which were likely started by progressiveCactus:

43435 user010    20   0  9204 1404 1132 S  0.0  0.0   0:00.00 runProgressiveC
43441 user010    20   0  184m  18m 3704 S  0.0  0.0   0:00.40 python
43461 user010    20   0  187m  18m 3676 S  0.0  0.0   0:00.32 python
43462 user010    20   0     0    0    0 Z  0.0  0.0   0:00.00 python
43463 user010    20   0  121m  15m  764 S  0.0  0.0   0:00.00 python

Does anyone have an idea what is causing this behaviour and whether there is any chance I can get progressiveCactus running on this system?

l-schrader commented 9 years ago

I posted the same issue at seqanswers, in case anyone wants to discuss it there.

http://seqanswers.com/forums/showthread.php?t=61372

joelarmstrong commented 9 years ago

Hi,

Sorry for the delayed reply, and sorry to hear that things aren't working correctly. It's interesting that this isn't progressing at all and the Python processes seem totally stalled.

Maybe this is an issue with the version of Python you're using? argparse is part of the standard library from Python 2.7 onwards, so having to install it manually suggests an older interpreter.

Can you run python --version? We've been running it on the 2.7.x series (not 3.x, sadly). I'm not sure anyone's tested it on earlier Python versions.
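
For example, checking from within the progressiveCactus environment should show which interpreter the scripts actually pick up (a minimal sketch, reusing the environment file from your original command):

source ~/software/progressiveCactus/environment
which python      # path of the interpreter that will actually be used
python --version  # should report 2.7.x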

l-schrader commented 9 years ago

Hi, thanks for your response. You are right: my cluster runs Python 2.6.6 by default. I recompiled progressiveCactus in a Biopython virtualenv running Python 2.7.8. After make finished, I ran the example again and got a proper error this time:


Logging set at level: DEBUG
Logging to file: cac1.log
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/HUMAN
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/CHIMP
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/BABOON
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/MOUSE
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/RAT
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/DOG
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/CAT
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/PIG
Running the command: cactus_analyseAssembly ./submodules/cactusTestData/blanchettesSimulation/00.job/COW
Beginning Alignment
Running the command: rm -rf ./work/jobTree
Running the command: mkdir ./work/sequenceData
Running the command: cactus_createMultiCactusProject.py "/data/home/mpx044/software/progressiveCactus/work/expTemplate.xml" "./work/progressiveAlignment" --fixNames=0
Running the command: rm -f ./work/cactus.log
Running the command: . bin/../src/../environment && cactus_progressive.py --jobTree ./work/jobTree --maxThreads "8" ./work/progressiveAlignment/progressiveAlignment_project.xml --overwrite >> ./work/cactus.log 2>&1
Running the command: jobTreeStatus --failIfNotComplete --jobTree ./work/jobTree > /dev/null 2>&1 
Error: Command: jobTreeStatus --failIfNotComplete --jobTree ./work/jobTree > /dev/null 2>&1  exited with non-zero status 1
Temporary data was left in: ./work
More information can be found in ./work/cactus.log

Here's the log file:


2015-07-25 14:39:32.262933: Beginning Progressive Cactus Alignment
The job seems to have left a log file, indicating failure: /data/home/mpx044/software/progressiveCactus/work/jobTree/jobs/job
Reporting file: /data/home/mpx044/software/progressiveCactus/work/jobTree/jobs/log.txt
log.txt:        ---JOBTREE SLAVE OUTPUT LOG---
log.txt:        Traceback (most recent call last):
log.txt:          File "/data/home/mpx044/software/progressiveCactus/submodules/jobTree/src/jobTreeSlave.py", line 271, in main
log.txt:            defaultMemory=defaultMemory, defaultCpu=defaultCpu, depth=depth)
log.txt:          File "/data/home/mpx044/software/progressiveCactus/submodules/jobTree/scriptTree/stack.py", line 153, in execute
log.txt:            self.target.run()
log.txt:          File "/data/home/mpx044/software/progressiveCactus/submodules/cactus/progressive/cactus_progressive.py", line 202, in run
log.txt:            schedule.compute()
log.txt:          File "/data/home/mpx044/software/progressiveCactus/submodules/cactus/progressive/schedule.py", line 131, in compute
log.txt:            self.enforceMaxParallel()
log.txt:          File "/data/home/mpx044/software/progressiveCactus/submodules/cactus/progressive/schedule.py", line 168, in enforceMaxParallel
log.txt:            for node in NX.bfs_tree(tree, root):
log.txt:        AttributeError: 'module' object has no attribute 'bfs_tree'
log.txt:        Exiting the slave because of a failed job on host dn126
log.txt:        Due to failure we are reducing the remaining retry count of job /data/home/mpx044/software/progressiveCactus/work/jobTree/jobs/job to 0
log.txt:        We have set the default memory of the failed job to 2147483648.0 bytes
Job: /data/home/mpx044/software/progressiveCactus/work/jobTree/jobs/job is completely failed
2015-07-25 14:39:32.604414: Finished Progressive Cactus Alignment

The log file points to a failed job; the details are in progressiveCactus/work/jobTree/jobs/log.txt.

Here's the jobTree logfile:


---JOBTREE SLAVE OUTPUT LOG---
Traceback (most recent call last):
  File "/data/home/mpx044/software/progressiveCactus/submodules/jobTree/src/jobTreeSlave.py", line 271, in main
    defaultMemory=defaultMemory, defaultCpu=defaultCpu, depth=depth)
  File "/data/home/mpx044/software/progressiveCactus/submodules/jobTree/scriptTree/stack.py", line 153, in execute
    self.target.run()
  File "/data/home/mpx044/software/progressiveCactus/submodules/cactus/progressive/cactus_progressive.py", line 202, in run
    schedule.compute()
  File "/data/home/mpx044/software/progressiveCactus/submodules/cactus/progressive/schedule.py", line 131, in compute
    self.enforceMaxParallel()
  File "/data/home/mpx044/software/progressiveCactus/submodules/cactus/progressive/schedule.py", line 168, in enforceMaxParallel
    for node in NX.bfs_tree(tree, root):
AttributeError: 'module' object has no attribute 'bfs_tree'
Exiting the slave because of a failed job on host dn126
Due to failure we are reducing the remaining retry count of job /data/home/mpx044/software/progressiveCactus/work/jobTree/jobs/job to 0
We have set the default memory of the failed job to 2147483648.0 bytes
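
If I read the traceback correctly, the AttributeError comes from schedule.py calling NX.bfs_tree, so it looks as if the networkx module being picked up does not provide that function (presumably because it is too old). A quick check of which networkx is in use, run from the same virtualenv, would be something like:

python -c "import networkx; print(networkx.__version__)"          # installed networkx version
python -c "import networkx; print(hasattr(networkx, 'bfs_tree'))" # False would reproduce the error above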
Tetrajf commented 8 years ago

I had the same problem with the Python install. Thanks to your advice, I did the git pull in a virtual environment with Python 2.7, and the script seems to be working, aside from some other problems I'm having.
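
Roughly, the sequence I used looked like the following (the paths are placeholders and the exact steps may differ on other setups):

virtualenv -p python2.7 ~/pcEnv   # ~/pcEnv is just an example location
source ~/pcEnv/bin/activate
cd ~/software/progressiveCactus
git pull                          # refresh the checkout before rebuilding
make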