soedinglab / plass

sensitive and precise assembly of short sequencing reads
https://plass.mmseqs.com
GNU General Public License v3.0

Final output file size 0 #22

Open taylorreiter opened 4 years ago

taylorreiter commented 4 years ago

Expected Behavior

I expect plass to output a FASTA file with amino acid sequences.

Current Behavior

plass runs, but the output file is empty (size 0) and contains no amino acid sequences.

Steps to Reproduce (for bugs)

Please make sure to execute the reproduction steps with newly recreated and empty tmp folders.

wget -O SRS476121_69.fna.cdbg_ids.reads.fa.gz https://osf.io/p7fqc/download
plass assemble SRS476121_69.fna.cdbg_ids.reads.fa.gz SRS476121_69.cdbg_ids.reads.plass.faa tmp

Plass Output (for bugs)

Log file: 11388349399477705273_log.txt
File sizes in tmp for plass run: 11388349399477705273_file_sizes.txt

Context

I am assembling reads that I think are derived from a single organism in a metagenome (e.g. reads from a spacegraphcats query). The reads are 101 bases long. The read file is 2.2 GB, and I am treating it as single-end.

Your Environment

I ran plass using conda, with the following environment:

channels:
   - conda-forge
   - bioconda
   - defaults
dependencies:
   - plass=3.764a3
   - cd-hit=4.8.1
   - paladin=1.4.6
   - samtools=1.10
   - salmon=0.15.0

I am on a Linux computer and ran plass with 128 GB of RAM and 8 CPUs (Ubuntu 18.04.4 LTS (GNU/Linux 4.15.0-70-generic x86_64)).

milot-mirdita commented 4 years ago

Could you try decreasing the minimum translated ORF length with the --min-length parameter? Something between 25 and 30 should work fine. The default minimum translated fragment length of 45 amino acids corresponds to 135 nucleotides, which is too long to fit into the 101 bp reads.

Update: I tried it out locally, the aa_6f_long database has a reasonable size (instead of 0) if I pass a shorter min-length.
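
For anyone who wants to check the arithmetic for their own read length, here is a minimal sketch. The read length, the default of 45, and the 25 to 30 range come from this thread. The value 30 is just one pick from that range, and placing --min-length after the positional arguments is an assumption about the command line. The plass command is printed rather than executed, so the snippet runs without plass installed.

```shell
#!/bin/sh
# Values taken from this thread.
read_len=101          # bases per read
default_min_len=45    # default minimum translated fragment length (amino acids)

# Each amino acid is encoded by 3 nucleotides, so a read can hold at most
# read_len / 3 complete codons (integer division).
max_codons=$(( read_len / 3 ))
needed_bases=$(( default_min_len * 3 ))

echo "longest possible translation per read: ${max_codons} aa"
echo "bases needed to reach the default minimum: ${needed_bases}"

# One value from the suggested 25-30 range (an illustrative choice).
min_len=30
if [ "$min_len" -le "$max_codons" ]; then
    # Printed, not run. Option placement after the positional arguments is assumed.
    echo "plass assemble SRS476121_69.fna.cdbg_ids.reads.fa.gz SRS476121_69.cdbg_ids.reads.plass.faa tmp --min-length ${min_len}"
fi
```

With 101-base reads this gives at most 33 codons per read, while the default would need 135 bases, so any minimum above 33 filters out every translated fragment.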

We should handle this case better somehow :/

By the way, if you want a set of stickers (see https://twitter.com/thesteinegger/status/1201076220957315074), send me your address to milot at mirdita de.

taylorreiter commented 4 years ago

Thank you so much for the quick response! I'll give this a try and report back.

Just saw your update -- thank you for testing this out!

@luizirber received two sticker sets and gave one to me since he knows I'm a plass enthusiast. Thank you!