marbl / parsnp

Parsnp was designed to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours. Input can be both draft assemblies and finished genomes, and output includes variant (SNP) calls, core genome phylogeny and multi-alignments. Parsnp leverages contextual information provided by multi-alignments surrounding SNP sites for filtration/cleaning, in addition to existing tools for recombination detection/filtration and phylogenetic reconstruction.

parsnp v1.5 command failed: parsnp_core on parsnpAligner.ini #81

Closed bwanderson3 closed 4 years ago

bwanderson3 commented 4 years ago

I downloaded the latest compiled version of parsnp (https://github.com/marbl/parsnp/releases/download/v1.5.2/Parsnp-Linux64-v1.5.2.tar.gz) on Windows Subsystem for Linux (WSL). When I execute:

$ ./parsnp -g /mnt/c/wsl/B_fragilis_genomes/GCF_000025985.1_ASM2598v1_genomic.gbff -d /mnt/c/wsl/B_fragilis_genomes/B_fragilis_fasta/ -c

I get:

|--Parsnp 1.5.2--|
For detailed documentation please see --> http://harvest.readthedocs.org/en/latest

16:16:53 - INFO -

SETTINGS:
|-refgenome: /mnt/c/wsl/B_fragilis_genomes/GCF_000025985.1_ASM2598v1_genomic.gbff
|-genomes:
  /mnt/c/wsl/B_fragilis_genomes/B_fragilis_fasta/GCF_000009925.1_ASM992v1_genomic.fna
  /mnt/c/wsl/B_fragilis_genomes/B_fragilis_fasta/GCF_000025985.1_ASM2598v1_genomic.fna
  ...162 more file(s)...
  /mnt/c/wsl/B_fragilis_genomes/B_fragilis_fasta/GCF_902374115.1_MGYG-HGUT-01337_genomic.fna
  /mnt/c/wsl/B_fragilis_genomes/B_fragilis_fasta/GCF_903181445.1_NZ_Bacteroides_fragilis_8E3_BL_hyb_genomic.fna
|-aligner: muscle
|-outdir: /mnt/c/wsl/src/parsnp_linux/P_2020_07_18_161652992197
|-OS: Linux
|-threads: 1

16:16:53 - INFO - <>
16:16:58 - INFO - Running Parsnp multi-MUM search and libMUSCLE aligner...
16:16:58 - CRITICAL - The following command failed:

$ /mnt/c/wsl/src/parsnp_linux/bin/parsnp_core /mnt/c/wsl/src/parsnp_linux/P_2020_07_18_161652992197/parsnpAligner.ini

Please veryify input data and restart Parsnp. If the problem persists please contact the Parsnp development team.

  STDOUT:

  STDERR:

I should note that I get the same error on parsnp v1.2. I also tried the conda compilation with identical results. This issue appears to be similar to the closed issues #35 and #59.

I have tried building from source, but I hit a different problem during compilation: /usr/bin/ld cannot find -lMUSCLE-3.7 (I used the updated directions for this: https://github.com/marbl/parsnp/blob/master/README.md/).

bkille commented 4 years ago

@bwanderson3 thanks for reporting this issue. Very odd that the script gets to the core parsnp binary then fails without any output. Parsnp does not currently support GenBank flat files (I'll have to look into why it got through the parsing step). My suggestion would be to convert to a .gbk file and see if the error persists.

bwanderson3 commented 4 years ago

@bkille Thanks for your quick response!

I tried a .gbk file with the same result. I also tried a .fna file for the reference genome (with the -r flag) with the same result.

I downloaded the test dataset and got the same output (see below). This seems to be specific to my machine.

|--Parsnp 1.5.2--|
For detailed documentation please see --> http://harvest.readthedocs.org/en/latest

21:03:01 - INFO -

SETTINGS:
|-refgenome: /mnt/c/wsl/mers_virus/ref/England1.gbk
|-genomes:
  /mnt/c/wsl/mers_virus/genomes/Al-Hasa_12_2013.fna
  /mnt/c/wsl/mers_virus/genomes/Al-Hasa_15_2013.fna
  ...45 more file(s)...
  /mnt/c/wsl/mers_virus/genomes/Taif_1_2013.fna
  /mnt/c/wsl/mers_virus/genomes/Wadi-Ad-Dawasir_1_2013.fna
|-aligner: muscle
|-outdir: /mnt/c/wsl/src/parsnp_linux/P_2020_07_19_210301717448
|-OS: Linux
|-threads: 1

21:03:01 - INFO - <>
21:03:02 - WARNING - Genome sequence /mnt/c/wsl/mers_virus/genomes/Jeddah_2014_C7149.fna seems to be aligned! Skip!
21:03:02 - WARNING - Genome sequence /mnt/c/wsl/mers_virus/genomes/Jeddah_2014_C7569.fna seems to be aligned! Skip!
21:03:02 - WARNING - Genome sequence /mnt/c/wsl/mers_virus/genomes/Jeddah_2014_C7770.fna seems to be aligned! Skip!
21:03:02 - INFO - Recruiting genomes...
21:03:02 - CRITICAL - The following command failed:

$ /mnt/c/wsl/src/parsnp_linux/bin/parsnp_core /mnt/c/wsl/src/parsnp_linux/P_2020_07_19_210301717448/all_mumi.ini

Please veryify input data and restart Parsnp. If the problem persists please contact the Parsnp development team.

  STDOUT:

  STDERR:

bkille commented 4 years ago

Hmmm.. which Linux distribution is your WSL running on? Is it WSL 2?

bwanderson3 commented 4 years ago

It was WSL 1. I updated it to WSL 2, but this did not fix the error.

I also tried it on a different computer running WSL 2 with the same result. If it helps, the parsnp_core command fails regardless of the -c flag. Without the flag, it fails while trying to modify all_mumi.ini.
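
For anyone checking which WSL generation a shell is actually running under, the kernel release string from `uname -r` is usually enough. The sketch below is not part of Parsnp: `classify_kernel` is a hypothetical helper, and its substring patterns are an assumption based on commonly seen WSL kernel names (for example `4.4.0-19041-Microsoft` on WSL 1 and names containing `microsoft-standard` on WSL 2). From the Windows side, `wsl.exe -l -v` lists each installed distribution with its WSL version.

```shell
#!/usr/bin/env bash
# classify_kernel is a hypothetical helper, not a Parsnp or WSL tool.
# Assumption: WSL 2 kernels contain lower-case "microsoft" and WSL 1
# kernels contain capitalised "Microsoft" in their release string.
classify_kernel() {
    case "$1" in
        *microsoft*) echo "WSL2" ;;
        *Microsoft*) echo "WSL1" ;;
        *)           echo "native" ;;
    esac
}

# Classify the kernel this shell is running on.
classify_kernel "$(uname -r)"
```

If the result disagrees with what `wsl.exe -l -v` reports, trust the Windows-side listing, since the string patterns here are only a heuristic.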

bkille commented 4 years ago

Can you attach the output of a run that fails while writing all_mumi.ini?

bwanderson3 commented 4 years ago

This is one I tried with the conda parsnp (still v1.2):

|--Parsnp v1.2--|
For detailed documentation please see --> http://harvest.readthedocs.org/en/latest

SETTINGS:
|-refgenome: /mnt/c/wsl/mers_virus/ref/England1.fna
|-aligner: libMUSCLE
|-seqdir: /mnt/c/wsl/mers_virus/genomes/
|-outdir: /mnt/c/wsl/parsnp_linux/P_2020_07_20_115250705676
|-OS: Linux
|-threads: 32

<>

-->Reading Genome (asm, fasta) files from /mnt/c/wsl/mers_virus/genomes/..
  |->[OK]
-->Reading Genbank file(s) for reference (.gbk) ..
  |->[WARNING]: no genbank file provided for reference annotations, skipping..
-->Calculating MUMi..

ERROR
The following command failed:

/tmp/_MEIuQ5gHk/parsnp /mnt/c/wsl/parsnp_linux/P_2020_07_20_115250705676/all_mumi.ini

Please veryify input data and restart Parsnp. If the problem persists please contact the Parsnp development team.
ERROR

bkille commented 4 years ago

Can you try running the failed command and attach the output here? (Sorry, this should be done by parsnp, I'll add that in the next release).

/tmp/_MEIuQ5gHk/parsnp /mnt/c/wsl/parsnp_linux/P_2020_07_20_115250705676/all_mumi.ini
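
When re-running a failed command by hand like this, it can help to keep both its output and its exit status, because a crash may print nothing at all. The following is a minimal sketch, not something Parsnp provides: `run_and_report` is a hypothetical helper name. It relies on the shell convention that an exit status above 128 means the process was terminated by signal number (status - 128); on Linux, signal 11 is SIGSEGV, so a segmentation fault shows up as status 139.

```shell
#!/usr/bin/env bash
# run_and_report is a hypothetical helper, not part of Parsnp.
# Usage: run_and_report LOGFILE COMMAND [ARGS...]
run_and_report() {
    local log="$1"; shift
    # Send both stdout and stderr of the command to the log file.
    "$@" >"$log" 2>&1
    local status=$?
    if [ "$status" -gt 128 ]; then
        # Shell convention: status above 128 means death by signal (status - 128).
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
    return "$status"
}
```

For example, `run_and_report core.log <failed command and its .ini argument>` leaves any output in `core.log` and prints one line describing how the process ended; the log file name here is arbitrary.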

bwanderson3 commented 4 years ago

I understand now. The output is:

Segmentation fault

bkille commented 4 years ago

Which OS is your WSL running on?

bwanderson3 commented 4 years ago

I'm running Ubuntu 20.04 LTS (on Windows 10 build 19041).

bkille commented 4 years ago

@bwanderson3 After some debugging, it seems the error is specific to WSL. The binary works on native Ubuntu 20 and macOS, but we have been able to replicate your issue on WSL. We are going to try to rebuild the binary to be compatible with WSL, but in the meantime you may want to try building from source.

sunctx commented 4 years ago

Hi, I ran into the same error when running parsnp v1.2 on a native Linux OS (Linux version 2.6.32-642.el6.x86_64).

ERROR
The following command failed:

/tmp/61195.1.all.q/_MEIWcirmK/parsnp /disk1/parsnpAligner.ini

Please veryify input data and restart Parsnp. If the problem persists please contact the Parsnp development team.
ERROR

SETTINGS:
|-refgenome: autopick
|-aligner: libMUSCLE
|-seqdir:
|-outdir:
|-OS: Linux
|-threads: 16


bkille commented 4 years ago

@sunctx That looks like a CentOS distribution of Linux. I haven't tested on that, but I will try to make a binary that works for it (and for WSL 2). In the meantime, you can try building from source and see if that fixes the issue.

sunctx commented 4 years ago

@bkille thanks for your efforts.

sunctx commented 4 years ago

I understand now. The output is:

Segmentation fault

I am wondering whether this results from running out of memory.

bkille commented 4 years ago

@sunctx If memory is an issue, running with 1 thread may help. You can also view memory usage using top or htop.
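
top and htop show live usage; to record a peak figure for one run, one option is to poll the process with `ps` while it executes. This is only a rough sketch under stated assumptions: `peak_rss_kb` is a hypothetical helper, it samples the resident set size of the launched process alone (not any children it spawns), it polls once per second so short spikes can be missed, and it discards the command's own output.

```shell
#!/usr/bin/env bash
# peak_rss_kb is a hypothetical helper, not part of Parsnp.
# Usage: peak_rss_kb COMMAND [ARGS...]
# Prints the largest resident set size (KiB) observed for COMMAND.
peak_rss_kb() {
    local peak=0 rss
    # Start the command in the background and remember its PID.
    "$@" >/dev/null 2>&1 &
    local pid=$!
    # Poll once per second for as long as the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        rss=$(ps -o rss= -p "$pid" 2>/dev/null | tr -d ' ')
        if [ -n "$rss" ] && [ "$rss" -gt "$peak" ]; then
            peak=$rss
        fi
        sleep 1
    done
    wait "$pid"
    echo "$peak"
}
```

Comparing the printed figure with the RAM available on the node gives a quick indication of whether a run is likely to be killed for exceeding memory.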

sunctx commented 4 years ago

Dear bkille, Thank you very much for the software. I am trying to run parsnp on ~1700 Klebsiella pneumoniae genomes on a server cluster (ten nodes, 16 threads and 126 GB RAM per node), and parsnp 1.2 always fails with the error from this issue (#81). Smaller samples of up to 500 genomes align without difficulty. When running on ~1700 genomes, I found that RAM usage increased gradually to 126 GB and the job was then killed (out of memory). I therefore added -P 35000000 (and also tried -P 35000, -P 90000, -P 90, -P 35 and -P 35GB), but none of these stopped memory usage from climbing to 126 GB, which got the job killed and produced the #81 error.

So, I would appreciate it if you could tell me:

Thanks in advance and have a nice weekend.

/Sun

bkille commented 4 years ago

@bwanderson3 We have fixed the WSL issue in the most recent build of parsnp. The conda package is still waiting to be pulled; however, you're welcome to try the packaged release.

bwanderson3 commented 4 years ago

@bkille This build worked on the test dataset (mers_virus). Thank you very much!