Open marcasriv opened 3 years ago
Hi Marina,
thank you for trying out PhyloCSF++ and opening an issue! I made a fix and pushed it to the master branch. Can you try running it again with the latest commit? Let me know if you need help building PhyloCSF++ from source; I can also upload a statically linked binary here.
If the fix works for you, we will make a new release, update it on bioconda and distribute new binaries.
Christopher
Hi Christopher,
Thanks so much for your help and fix! I rebuilt PhyloCSF++ with the latest commit and it now runs smoothly past the earlier error. Unfortunately, I've bumped into a new problem. The program now crashes at (I believe) line 422 in the file _phylocsf++annotate_withmmseqs.hpp (same parameters/files as in my previous post):
mmseqs result2dnamsa conservation//cds/cds.index conservation//genomesDB/genbankseqs /conservation//aln/aln_all_tophit conservation//aln/msa --threads 40
MMseqs Version: 42bf6438fec1e1b987f46d8f6d4b09926ecfc019
Skip query false
Threads 40
Compressed 0
Verbosity 3
Query database size: 99405 type: Nucleotide
Target database size: 410501 type: Nucleotide
[=================================================================] 100.00% 99.40K 7m 13s 889ms
Time for merging to msa: 0h 0m 0s 216ms
Time for processing: 0h 7m 15s 116ms
MMseqs2: Score aligned CDS ...
terminate called after throwing an instance of 'std::length_error'
terminate called recursively
terminate called recursively
terminate called recursively
terminate called recursively
terminate called recursively
terminate called recursively
terminate called recursively
terminate called recursively
terminate called recursively
what():
terminate called recursively
terminate called recursively
terminate called recursively
Aborted (core dumped)
Thanks again,
Marina
Can you give me the list of assemblies you used, so that we can try to reproduce this error?
Hi Christopher,
Sorry for the late reply. This is the list of fasta files I use:
https://hgdownload.soe.ucsc.edu/goldenPath/criGri1/bigZips/criGri1.fa.gz
https://hgdownload.soe.ucsc.edu/goldenPath/mm39/bigZips/mm39.fa.gz
https://hgdownload.soe.ucsc.edu/goldenPath/rn6/bigZips/rn6.fa.gz
https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz
https://hgdownload.soe.ucsc.edu/goldenPath/hetGla2/bigZips/hetGla2.fa.gz
https://hgdownload.soe.ucsc.edu/goldenPath/cavPor3/bigZips/cavPor3.fa.gz
https://hgdownload.soe.ucsc.edu/goldenPath/speTri2/bigZips/speTri2.fa.gz
https://hgdownload.soe.ucsc.edu/goldenPath/oryCun2/bigZips/oryCun2.fa.gz
https://hgdownload.soe.ucsc.edu/goldenPath/ochPri3/bigZips/ochPri3.fa.gz
and reference GTF:
https://hgdownload.soe.ucsc.edu/goldenPath/criGri1/bigZips/genes/criGri1.refGene.gtf.gz
Thanks,
Marina
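For anyone trying to reproduce this: the FASTA links above all follow one UCSC goldenPath pattern, so the list can be regenerated from the assembly names. Below is a minimal sketch in POSIX shell. The function names are illustrative and not part of PhyloCSF++, and the commented download step assumes wget is installed.

```shell
#!/bin/sh
# Assembly names taken from the list above; criGri1 is the reference.
ASSEMBLIES="criGri1 mm39 rn6 hg38 hetGla2 cavPor3 speTri2 oryCun2 ochPri3"
BASE="https://hgdownload.soe.ucsc.edu/goldenPath"

# Print the FASTA download URL for one assembly name.
fasta_url() {
    printf '%s/%s/bigZips/%s.fa.gz\n' "$BASE" "$1" "$1"
}

# Print all FASTA URLs, one per line.
list_fasta_urls() {
    for asm in $ASSEMBLIES; do
        fasta_url "$asm"
    done
}

list_fasta_urls

# To download everything (assumes wget is available):
# list_fasta_urls | wget -i -
```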
Hi Marina,
thank you, we were able to reproduce the error and added a fix to the master branch. Before you run it again, please make sure to delete any temporary files in the output directory from the previous runs.
Christopher
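One cautious way to make sure no leftover files from an earlier run are picked up is to move the whole previous output directory aside rather than deleting individual files. This is only a sketch: the helper name set_aside_output is hypothetical, and it assumes the output directory (here "conservation", as passed to --output) contains nothing you still need in place.

```shell
#!/bin/sh
# Move a previous output directory aside so the next run starts from scratch.
# Prints the backup path, or nothing if there was no previous output.
set_aside_output() {
    out=$1
    [ -d "$out" ] || return 0
    backup="$out.old.$(date +%Y%m%d%H%M%S)"
    mv "$out" "$backup" && printf '%s\n' "$backup"
}
```

Calling set_aside_output conservation before re-running would rename the old directory with a timestamp suffix, so the old files stay available for comparison.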
Hi Christopher,
Thanks so much for your reply. I've removed the previous installation of PhyloCSF++, cloned the latest PhyloCSF++ version and re-installed it, and removed all files from the previous runs, but I'm still getting the same error at the same line of code. I've also tried changing the location of the output directory, but unfortunately no luck so far. Could there be anything on my system overriding the new install?
Marina
Hi Marina,
I tried it on another system and it works for me with the latest commit and the data set that you listed above. You don't have to "install" PhyloCSF++ on your system; after make you can just call the binary directly in the build directory with ./phylocsf++ to make sure that you really use the latest build and not an outdated binary that might still be in the PATH.
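To check whether an older copy is being picked up, it can help to list every executable named phylocsf++ that the shell can find on the PATH. A minimal sketch in POSIX shell follows; the helper name list_path_binaries is made up for illustration and is not part of PhyloCSF++.

```shell
#!/bin/sh
# Print every executable file called "$1" found in the PATH directories,
# in lookup order. The first line is the one the shell would run.
list_path_binaries() {
    name=$1
    old_ifs=$IFS
    IFS=:
    set -f                      # no glob expansion while splitting PATH
    for dir in $PATH; do
        candidate="${dir:-.}/$name"
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
        fi
    done
    set +f
    IFS=$old_ifs
}
```

If list_path_binaries phylocsf++ prints a path outside the build directory, that copy is what a plain phylocsf++ call would run, whereas ./phylocsf++ inside the build directory always runs the fresh build.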
Hi,
I'm interested in running PhyloCSF++ with annotate-with-mmseqs on Chinese hamster, but I am getting an error when it reaches the mmseqs createsubdb step:
./phylocsf++ annotate-with-mmseqs --threads 35 --output conservation species.txt 58mammals criGri1.refGene.gtf
This is what the input species.txt file looks like:
And I have downloaded the reference GTF file and fasta files from https://hgdownload.soe.ucsc.edu/goldenPath/criGri1/bigZips/genes/criGri1.refGene.gtf.gz and https://hgdownload.soe.ucsc.edu/goldenPath/criGri1/bigZips/criGri1.fa.gz
Thanks so much,
Marina