Closed mildlyhuman closed 2 years ago
Hi, we didn't publish the assemblies made in the manuscript, as they're of draft quality.
One thing we didn't make clear in the article, and sorry about that, is that we didn't use magic_simplify
on metagenomes, but a modified version instead. Running https://github.com/ekimb/rust-mdbg/blob/master/utils/magic_simplify_meta instead of magic_simplify
should reproduce our results, modulo version differences.
It is a bit odd that you're getting '*'s instead of sequences. If it is still the case after using magic_simplify_meta, let us know.
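The swap suggested above could be sketched roughly as follows. This assumes magic_simplify_meta is invoked the same way as magic_simplify, with the rust-mdbg output prefix as its only argument; that assumption has not been checked against the script itself. The function name run_simplify_meta and the clone path are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the metagenome variant of the simplification script.
# Assumption: magic_simplify_meta takes the rust-mdbg output prefix as its
# only argument, like magic_simplify does.
run_simplify_meta() {
    local repo_dir="$1"   # path to a local clone of rust-mdbg
    local prefix="$2"     # prefix used for the rust-mdbg run, e.g. asm2
    local script="$repo_dir/utils/magic_simplify_meta"

    # Fail early with a clear message if the script is not where we expect it.
    if [ ! -x "$script" ]; then
        echo "error: $script not found or not executable" >&2
        return 1
    fi
    "$script" "$prefix"
}
```

For example, run_simplify_meta ~/rust-mdbg asm2 would run the script on the asm2 outputs in the current directory, where ~/rust-mdbg stands in for wherever the repository was cloned.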
Thank you very much for the clarification. magic_simplify_meta
output the sequences correctly, and the graph looked better for Zymo D6331.
Hello,
I tried to replicate the Zymo D6331 assembly described in the mDBG manuscript. The assembly and the graph simplification finished successfully, but the sequences were all placeholders. I do get correct output for the example case, so I wonder what I did wrong.
A related question: I did not seem to find the assemblies from the manuscript. Could you please point me to them?
Thanks in advance; relevant info listed below.
expected result
Raw assembly graph, simplified assembly graph populated with sequences, and a fasta file of the simplified graph.
what did not work
The raw assembly graph is produced. The simplified graph is produced too, but without sequences (all "*"). The fasta file also contains "*" instead of contig sequences.
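A quick way to quantify the symptom is to count how many segments in the GFA carry "*" in place of a sequence. The sketch below relies only on the GFA convention that S lines are tab-separated, with the sequence in the third field; the function name count_placeholder_segments is hypothetical.

```shell
#!/usr/bin/env bash
# Print "<placeholders> <total>" for the S (segment) lines of a GFA file.
# An S line is tab-separated: "S", segment name, sequence, optional tags.
# A sequence of "*" means the sequence is not stored in the file.
count_placeholder_segments() {
    awk -F'\t' '
        $1 == "S" { total++; if ($3 == "*") missing++ }
        END       { print missing + 0, total + 0 }
    ' "$1"
}
```

Running count_placeholder_segments asm2.msimpl.gfa prints two numbers; if they are equal, every segment in the simplified graph is a placeholder.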
environment
OS: Ubuntu Linux
mDBG installation: mamba install -c bioconda rust-mdbg, plus a git clone for the graph simplification scripts.
mDBG version: bioconda points to 1.0.1, but rust-mdbg --version prints rust-mdbg 0.1.0. I am positive that there is only one rust-mdbg executable in my $PATH (checked with command -v).
Other tools:
input data
Downloaded SRR13128014, then ran:
seqtk seq -UA thereads.fq.gz | pigz -p32 - > input.fa.gz
zcat input.fa.gz | head -4 | less -S gives:
the run
... which is stored in a file and executed via . job.sh. (I have also tried -k 21 -l 14 --density 0.003, which was used for the ATCC dataset.)
output files
ls | grep fa$ gives:
ls | grep asm2 | grep sequences | wc -l gives: 48
head -6 asm2.msimpl.gfa gives:
head -6 asm2.msimpl.fa says:
dbg_STDOUT says:
dbg_STDERR says: