lpnunez opened this issue 3 years ago
Technically, I am not sure this is a phyluce error. Pilon, which does the phasing, uses a lot of RAM and seems to be getting too little of it to work. You might try running this on a node with lots of RAM, but tell phyluce to use only one core (versus 30). Then all of the RAM on the node should be dedicated to pilon when the code reaches that step. I'd try that with a single sample to see what happens. That said, you could still have a problem with too little RAM depending on your HPC setup (e.g., there may still be too little RAM per node).
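For a single-sample test, the invocation might look something like the following (a sketch only; the config and output names are placeholders, and I'm assuming the --cores option that phyluce_workflow passes through to snakemake):

phyluce_workflow --config config_file_phasing.conf \
    --output phasing_single_sample \
    --workflow phasing \
    --cores 1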
Thank you for the quick response!
I did as you suggested, but unfortunately, the error persists. It occurs right when the Pilon jobs start, as you said. For the record, the HPC cluster that I'm using has 256 GB per node. Here is my setup when I run a job script:
#!/bin/bash
#PBS -V
#PBS -q batch
#PBS -S /bin/bash
#PBS -N Phase_Test
#PBS -e /home/lnunez/nas5/UCE/Temp/phase_e
#PBS -o /home/lnunez/nas5/UCE/Temp/phase_o
#PBS -l nodes=1
#PBS -l ncpus=56
#PBS -l walltime=99:00:00
#PBS -l mem=100GB
As you mentioned, this is probably a logistical issue on my end. Here is the log file for the test run: phase_test.log
You could try to run the pilon command on its own to see what happens (it will likely die, but that can help diagnose the problem). You may need the help of a sysadmin to diagnose the issue further.
It might also be reasonable to try to install pilon outside of phyluce, and see if this step will work with a different installation. I do know that the process works, because phyluce runs software tests against this and other programs, and those work ok. What I don't know is exactly what is causing the RAM allocation error on your system (and diagnosing that is almost impossible for me to do).
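If you go that route, one option is a throwaway conda environment (a sketch; the environment name here is just illustrative):

conda create -n pilon-test -c conda-forge -c bioconda pilon
conda activate pilon-test
pilon --version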
UPDATE: oops - sorry - you should be able to run
pilon --threads 1 --vcf --changes --fix snps,indels --minqual 10 --mindepth 5 --genome /home/lnunez/nas5/UCE/spades_assemblies/contigs/Adelophis_foxi_LSUMZ_H8263.contigs.fasta --bam bams/Adelophis_foxi_LSUMZ_H8263.0.bam --outdir fastas --output Adelophis_foxi_LSUMZ_H8263.0
outside of phyluce.
I actually see another thing that could cause a problem. See if you can try:
pilon -Xmx100g --threads 1 --vcf --changes --fix snps,indels --minqual 10 --mindepth 5 --genome /home/lnunez/nas5/UCE/spades_assemblies/contigs/Adelophis_foxi_LSUMZ_H8263.contigs.fasta --bam bams/Adelophis_foxi_LSUMZ_H8263.0.bam --outdir fastas --output Adelophis_foxi_LSUMZ_H8263.0
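(If pilon was installed from bioconda, the wrapper script should intercept a -Xm* argument like this and use it in place of its default JVM memory options.)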
Alternatively, activate the phyluce environment, run which pilon to locate the wrapper script, then open that path with a text editor and edit the line:
default_jvm_mem_opts = ['-Xms512m', '-Xmx1g']
to read
default_jvm_mem_opts = ['-Xms512m', '-Xmx100g']
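Equivalently, assuming that default string appears verbatim in the wrapper, a one-line edit would be:

sed -i "s/-Xmx1g/-Xmx100g/" "$(which pilon)"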
Ok, I changed the default_jvm_mem_opts in pilon and that seems to have fixed the issue. Thank you very much for the help!
Super. I'll see if I can add an easier way to configure this to phyluce.
For what it's worth, I ran into the same issue and increasing the RAM allocated to pilon resolved the issue for me too. Thanks as always for clear solutions to our problems, Brant!
Instead of altering the pilon wrapper, we've been setting the Java memory options via the _JAVA_OPTIONS environment variable before we call phyluce_workflow. It seems to work 😸
export _JAVA_OPTIONS="-Xms1024m -Xmx55g"
phyluce_workflow --config config_file_phasing.conf \
--output phasing_all \
--workflow phasing
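As a sanity check, the JVM prints a "Picked up _JAVA_OPTIONS: ..." line to stderr at startup, so you can confirm in the job log that the setting took effect.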
Hello,
I'm having an issue using phyluce_workflow for phasing. The workflow will run fine until around 3/4 completion, then it fails with a java.lang.OutOfMemoryError like the one below:
After this error, the job exits. It still produces the BAM files in the bam output directory, but not the FASTA output.
Initially, I tried to resolve the issue by allocating more memory to the job and whittling down the number of samples, but I still get the same error consistently, so I feel like I've hit a wall. I'm running this job on an HPC cluster and the job is submitted through a PBS script.
Here is the log file in case it helps: phase_e.log