We are running a hybrid assembly with MaSuRCA. We have launched it twice, changing some parameters in the configuration file, and both times it stopped with the same failure while creating the k-unitigs for the PE super reads.
We monitored the run in case it was a memory problem, but it was not. Do you know what could be causing it?
Thanks in advance,
Daniel R.
Below is the output with the error (our MaSuRCA version is v4.0.5):
[mié 25 may 2022 08:44:31 CEST] Processing pe library reads
[mié 25 may 2022 10:23:52 CEST] Average PE read length 144
[mié 25 may 2022 10:23:53 CEST] Using kmer size of 49 for the graph
[mié 25 may 2022 10:23:53 CEST] MIN_Q_CHAR: 33
[mié 25 may 2022 10:23:53 CEST] Creating mer database for Quorum
[mié 25 may 2022 12:19:18 CEST] Error correct PE
[mié 25 may 2022 20:41:29 CEST] Estimating genome size
[mié 25 may 2022 22:05:36 CEST] Estimated genome size: 12498287844
[mié 25 may 2022 22:05:36 CEST] Creating k-unitigs with k=49
[jue 26 may 2022 14:49:56 CEST] Computing super reads from PE
[jue 26 may 2022 14:49:56 CEST] Using CABOG from /home/soporte/BIOINFOR/softwares/masurca/MaSuRCA-4.0.7/bin/../CA8/Linux-amd64/bin
[jue 26 may 2022 14:49:56 CEST] Assembly stopped or failed, see .log
./assemble.sh: line 125: 38232 Done create_k_unitigs_large_k -c $(($KMER-1)) -t 75 -m $KMER -n $(($ESTIMATED_GENOME_SIZE*2)) -l $KMER -f `perl -e 'print 1/'$KMER'/1e5'` pe.cor.fa
38233 | grep --text -v '^>'
38235 Segmentation fault (core dumped) | perl -ane '{$seq=$F[0]; $F[0]=~tr/ACTGactg/TGACtgac/;$revseq=reverse($F[0]); $h{($seq ge $revseq)?$seq:$revseq}=1;}END{$n=0;foreach $k(keys %h){print ">",$n++," length:",length($k),"\n$k\n"}}' > guillaumeKUnitigsAtLeast32bases_all.fasta.tmp
[jue 26 may 2022 14:49:56 CEST] super reads file not found or size zero, you can try deleting work1 folder and re-generating assemble.sh, also check if guillaumeKUnitigsAtLeast32bases_all.fasta is not empty
I'm running into a similar error! I am using the latest version of MaSuRCA (4.1.1). The run fails at the k-unitig creation step with a segmentation fault.
@DRomeroPV, were you able to solve this?
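The last log line suggests checking whether guillaumeKUnitigsAtLeast32bases_all.fasta is empty. A minimal sketch of that check, assuming it is run from the assembly directory and that the file names match those in the log above. The helper name check_fasta is hypothetical and not part of MaSuRCA; it only reports the state of each file and does not diagnose the segmentation fault itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MaSuRCA): report whether a FASTA file
# is missing, empty, or present, and count its records if present.
check_fasta() {
    local f="$1"
    if [ ! -e "$f" ]; then
        echo "missing"
    elif [ ! -s "$f" ]; then
        echo "empty"
    else
        # Count FASTA header lines (those starting with '>').
        echo "ok $(grep -c '^>' "$f") records"
    fi
}

# File names taken from the log above; adjust the paths if your layout differs.
for f in pe.cor.fa \
         guillaumeKUnitigsAtLeast32bases_all.fasta.tmp \
         guillaumeKUnitigsAtLeast32bases_all.fasta; do
    printf '%s: %s\n' "$f" "$(check_fasta "$f")"
done
```

If pe.cor.fa has records but the .tmp file is missing or empty, the failure happened inside the pipeline shown on line 125 of assemble.sh rather than in the earlier error-correction step.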