Closed christinafliege closed 4 years ago
Can you post the config file for the run?
The config file is the same as the one you helped us with before in the input field; however, the data set changed. The example data I was working with was too small and was failing at some other steps.
/projects/mgc/Project_2/HLHS_BasilAniseVC/Pamir
raw-data:
  /inputs
reference:
  /projects/bioinformatics/DataPacks/human/gatk_bundle_Oct_2017/gatk_bundle_hg38/Homo_sapiens_assembly38.fasta
population:
  population
input:
  "003-HLH-001_all_lanes_merged":
    - 003-HLH-001_all_lanes_merged.sorted.realigned.bam
  "003-HLH-003_all_lanes_merged":
    - 003-HLH-003_all_lanes_merged.sorted.realigned.bam
  "003-HLH-004_all_lanes_merged":
    - 003-HLH-004_all_lanes_merged.sorted.realigned.bam
centromeres:
  /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamir/inputs/centro.meres
analysis-base:
  analysis
Can you try the small example given in the README and tell us if it finishes successfully?
curl -L https://ndownloader.figshare.com/files/22813988 --output example.tar.gz
tar xzvf example.tar.gz
cd example
chmod +x configure.sh
./configure.sh
pamir.sh -j16 --configfile config.yaml
This will help us determine the root cause of the problem.
Thanks
Thank you.
When running the data set as shown above I get the following error.
Error in rule minia_all:
jobid: 87
output: /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamir/inputs/example/analysis/small-pop/002-minia/contigs.fasta
shell:
cd /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamir/inputs/example/analysis/small-pop/002-minia/ && minia -verbose 0 -in /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamir/inputs/example/analysis/small-pop/002-minia/reads.fofn -kmer-size 64 -abundance-min 5 -max-memory 250000 -nb-cores 16 && mv reads.contigs.fa contigs.fasta
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /usr/local/apps/bioapps/pamir/pamir-2.1.0/pamir/.snakemake/log/2020-06-02T095928.686689.snakemake.log
It is a bug with setting maximum memory for minia. @joshfactorial, can you switch to the master branch, do a clean installation of pamir, and run the small example again?
I got this message after installing:
$ pamir.sh
Running Pamir (f49e0b8)
snakemake(5.17.0)... Failed, requiring == 5.9.1
We check the version requirements from our conda environment.yaml. Some of these requirements are strictly required for the intended behaviour of pamir, but for Snakemake it should also work on future versions. As a temporary workaround, can you change the snakemake line in environment.yaml to ">=" and run make clean and make again? That will pass the check.
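A minimal sketch of that workaround, assuming the pin in environment.yaml reads snakemake==5.9.1 (the stand-in file below is created only for illustration; in a real install you would edit pamir's own environment.yaml):

```shell
# Create a stand-in environment.yaml for illustration only.
printf 'dependencies:\n  - snakemake==5.9.1\n' > environment.yaml
# Relax the exact pin "==" to a minimum-version requirement ">=".
sed -i 's/snakemake==/snakemake>=/' environment.yaml
grep 'snakemake' environment.yaml
# Then rebuild pamir: make clean && make
```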
If you are interested, we can set up a Zoom call to help you set up pamir.
@christinafliege and @joshfactorial, Any updates from your end?
The fix above worked. We're just running into trouble at the moment installing RepeatMasker on our cluster; I'm trying to work through some of the Perl issues. Once we get that up and running, I can let you know if we hit any other Pamir issues.
Okay, I think RepeatMasker installed correctly, however, when I run pamir, this is the result:
$ pamir.sh
Running Pamir (f49e0b8)
snakemake(5.17.0)... OK
samtools(1.9)... OK
bedtools(2.29.2)... OK
mrsfast(3.4.1)... OK
bwa(0.7.17)... OK
repeatmasker(4.1.0)... OK, Version not checked.
where it simply hangs.
Did you provide the config file while running pamir? Like the following:
pamir.sh --configfile [Config-Path] -j [Number-Of-Cores]
Okay, so we've run into a number of problems running it with a config.
Processing partitions between 1 and 286 with 15 threads
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
(the same std::bad_alloc message was repeated and interleaved across the worker threads)
Attached is the full log LudasLog.txt
A quick update. The bwa issue (2) turned out to be a problem on our end.
Does bad_alloc error (3) still persist?
As far as that goes, we think we have a way to run it, but we need to do a little more testing. Basically, we have to copy pamir.sh out to the run folder, delete the -d option that tries to create the log within the installation directory, and run it there. @christinafliege is going to try that today (or soon, anyway) with the correct configuration files and hopefully get it working. We did have a successful completion using this method over the weekend on some test data.
After copying pamir.sh to a run folder and editing it to remove the -d option, it successfully ran on the example data for multiple users!
However, when running on our input data we are receiving the same error as listed above, although this time the error log says that it is 13% done instead of 5%. The SAM file that it is creating does not look like it is truncated. Elsewhere in the error log it states that "Big Queue is not cleared", although we are uncertain whether that is relevant. The config file generating this error is the same as the original config file above. I have restarted the job and it picked up where it left off but generated the same error.
Thank you!
[W::sam_read1] Parse error at line 25369458
samtools sort: truncated file. Aborting
[Tue Jun 9 09:15:11 2020]
Error in rule sam_sort:
jobid: 93
output: /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/003-HLH-004_all_lanes_merged/003-HLH-004_all_lanes_merged.anchor.sorted.sam
shell:
samtools sort /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/003-HLH-004_all_lanes_merged/003-HLH-004_all_lanes_merged.anchor.sam -m 8G -@ 1 -O SAM -o /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/003-HLH-004_all_lanes_merged/003-HLH-004_all_lanes_merged.anchor.sorted.sam
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Additionally, we are using -j16 for our cluster configuration in the qsub script and running on a queue with 2 × Intel Xeon E5 2690 v3 CPUs (24 cores/node) and 256 GB of RAM.
Thanks!
Can you run the following so we can see what is wrong with that line? Thank you.
cat /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/003-HLH-004_all_lanes_merged/003-HLH-004_all_lanes_merged.anchor.sam | head -n 25369458 | tail -n1
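As an aside, the same line can be extracted with a single sed command instead of a head/tail pipe; the demo file below is a stand-in for the real .anchor.sam path:

```shell
# Stand-in file for illustration; substitute the real .anchor.sam path.
printf 'line1\nline2\nline3\n' > demo.sam
# Print only line 2 and stop reading (use 25369458 for the real file).
sed -n '2{p;q}' demo.sam    # -> line2
```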
Here is the output. Thanks!
R0230412_0160:8:1312:18333:50748#0/1 16 HLA-B*27:24 HLA01504 796 255 100M * 0 0 CTACGATGGCAAGGATTACATCGCCCTGAACGAGGACCTGCACTCCTGGACCGCCGCGAACACAGCGGCTCAGATCTCCCAGCACAAGTGGGAAGCGGAC ?22BB@4>:>4((DDA@8@BB<<8+(??82@AA@A8BBAB<3???>>9;3;/8A4;(BD?D?6EFFFECGG>GFIGAFIEEHC:@GBFBDFF@:@A?8:= NM:i:7 MD:Z:6C34G16G4G19G9G4C1
@christinafliege have you installed mrsfast through bioconda or directly from github?
@joshfactorial can you answer this?
We installed it directly from github.
@joshfactorial please update mrsfast to the latest version, v3.4.2, from github. This will fix the samtools truncation error. You also need to re-build the mrsfast index after updating mrsfast. You should also update the version in environment.yaml to 3.4.2 so the script can pass the version check. As always, please make clean && make.
@joshfactorial Most of these changes (multi-user snakemake; version updates) are reflected in the current master. I would encourage you to do a full clone.
@christinafliege after @joshfactorial does the updates, you need to remove either /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/
or the full analysis folder. Since there will be an index change, I am leaning toward suggesting you remove the full analysis folder.
@christinafliege If everything goes smoothly, we can close this thread.
@fhach, after @joshfactorial did the reinstall, I deleted the entire PamirAnalysis folder and started the job up again. It ran for 15 hours before erroring with the same message.
samtools sort: truncated file. Aborting
[Fri Jun 12 03:31:19 2020]
Error in rule sam_sort:
jobid: 93
output: /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/003-HLH-004_all_lanes_merged/003-HLH-004_all_lanes_merged.anchor.sorted.sam
shell:
samtools sort /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/003-HLH-004_all_lanes_merged/003-HLH-004_all_lanes_merged.anchor.sam -m 8G -@ 1 -O SAM -o /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/003-HLH-004_all_lanes_merged/003-HLH-004_all_lanes_merged.anchor.sorted.sam
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
My initial guess would be that you have not rebuilt the mrsfast index after obtaining v3.4.2.
Regardless, that truncation error should have a line number. Can you use @f0t1h's command to extract that line from the SAM file?
I'm sorry, somehow that didn't get copied out of the error file:
[W::sam_read1] Parse error at line 25369466
Okay, just so we're clear, we need to run this command:
./mrsfast --index genome.fa
?
cat /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/003-HLH-004_all_lanes_merged/003-HLH-004_all_lanes_merged.anchor.sam | head -n 25369458 | tail -n1
R0230412_0160:5:2110:4957:26090#0/1 16 chrUn_JTFH01001998v1_decoy 955 255 100M * 0 0 TATAATACATGCTTTGGGTACTTTGATATTTTTTGTACAGTATAGAATATATACCTTGGGTACTTTGATATTTTATGTACAGTATATAATATATAGTTTG EEFEFECFFFFFHHHHHIHEJJJIJJJGGJJJJJJJIJJJJJJJJJIJIIHIGJJIIJIHGHIIJJIIJJJJIIIJICJJJJJJJJJHHHHHFFFFFCCC NM:i:7 MD:Z:10A21A4G11C4T36C3C4
@christinafliege in head -n, replace the number with 25369466.
@joshfactorial, yes, that is correct.
cat /projects/mgc/Project_2/HLHS_BasilAniseVC/Pamiranalysis/population/005-pamir-oea-processing/003-HLH-004_all_lanes_merged/003-HLH-004_all_lanes_merged.anchor.sam | head -n 25369466 | tail -n1
R0230412_0160:8:1312:18333:50748#0/1 16 HLA-B*27:24 HLA01504 796 255 100M * 0 0 CTACGATGGCAAGGATTACATCGCCCTGAACGAGGACCTGCACTCCTGGACCGCCGCGAACACAGCGGCTCAGATCTCCCAGCACAAGTGGGAAGCGGAC ?22BB@4>:>4((DDA@8@BB<<8+(??82@AA@A8BBAB<3???>>9;3;/8A4;(BD?D?6EFFFECGG>GFIGAFIEEHC:@GBFBDFF@:@A?8:= NM:i:7 MD:Z:6C34G16G4G19G9G4C1
@christinafliege the index should be rebuilt. The issue is coming from comments in the reference fasta that are separated by a TAB; mrsfast's new index should take care of TAB-separated comments.
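To illustrate the failure mode described above (a hypothetical sketch, not pamir's actual code path): if a FASTA header carries a TAB-separated comment, such as HLA-B*27:24&lt;TAB&gt;HLA01504, and the full header leaks into the SAM RNAME column, the record gains an extra TAB-separated field and samtools can no longer parse it:

```shell
# A SAM alignment line has 11 mandatory TAB-separated fields before any
# optional tags. If RNAME itself contains a TAB (leaked from a FASTA
# header comment), the field count is thrown off:
printf 'read1\t16\tHLA-B*27:24\tHLA01504\t796\t255\t4M\t*\t0\t0\tACGT\tIIII\n' \
  | awk -F'\t' '{print NF " fields"}'    # -> 12 fields
```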
When @joshfactorial is finished indexing, please delete the folders starting with 005, 006, ..., 012. Only keep the folders starting with 001, 002, 003, 004, and you can rerun. It will resume from stage 005.
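A sketch of that cleanup, assuming the stage folders follow the NNN- prefix naming seen in the paths above (the demo-analysis directory and folder names below are stand-ins for illustration):

```shell
# Stand-in analysis folder with dummy stage directories.
mkdir -p demo-analysis/001-fastq demo-analysis/004-prep \
         demo-analysis/005-pamir-oea-processing demo-analysis/012-report
# Remove stages 005 through 012 so snakemake resumes at stage 005.
rm -rf demo-analysis/00[5-9]-* demo-analysis/01[0-2]-*
ls demo-analysis    # only 001-fastq and 004-prep remain
```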
We're re-running now.
While still running, it looks like this fixed the current error. Thanks!
@christinafliege if you have a complete run on your data, would you please close this issue.
The data is still running; it has been at "rule pamir_assemble_full_new:" for the past 46 hours! I will close it when it is complete! Thanks! :)
Good Afternoon,
Pamir has started running on this project, and I get 5% through before it aborts with the following error. I have run that shell command separately and receive the same error. The file created by the previous step does not appear to be actually truncated. Thank you!