alekseyzimin / masurca

GNU General Public License v3.0

error extracting reads for scaffolding #244

Open gitcruz opened 3 years ago

gitcruz commented 3 years ago

Dear Aleksey,

I ran the latest version on the grid (using Slurm) to create the mega-reads. After the create_megarreads job array finished, I simply reran ./assemble.sh and obtained a Flye assembly (mega-reads coverage 21x). However, the post-assembly scaffolding step failed with this error:

what(): basic_ios::clear
/home/devel/fcruz/bin/programs/MaSuRCA-4.0.4/bin/masurca_scaffold.sh: line 112: 13973 Aborted $MYPATH/ufasta extract -f <(awk '{print $NF}' $REFN.$QRYN.coords) $QRY > $REFN.$QRYN.reads.fa.tmp

I think the reason is that the input raw reads were given in FASTQ format instead of FASTA.

Usage: masurca_scaffold.sh -r -q -t -m <minimum matching length, default:5000> -o <maximum overhang, default:1000>

Are you planning to adjust this script so that it also accepts fastq and fastq.gz?

For now, I will convert the original input raw reads to FASTA and rerun the scaffolding script. I guess this is the right thing to do, since using the mega-reads for scaffolding does not seem right.
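The conversion could look something like the sketch below. It is only an illustration, not part of MaSuRCA: the function name `fastq_to_fasta` and the file names in the usage comment are hypothetical, and it assumes every FASTQ record is exactly four lines (no wrapped sequences). It handles both plain and gzip-compressed input.

```shell
# Hypothetical helper (not part of MaSuRCA): convert FASTQ to FASTA.
# Assumes strict 4-line FASTQ records: header, sequence, "+", quality.
# Reads the file given as $1 (plain or .gz) and writes FASTA to stdout.
fastq_to_fasta() {
  # Decompress only when the file name ends in .gz
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac |
  # Line 1 of each record: swap the leading "@" for ">".
  # Line 2 of each record: the sequence, printed unchanged.
  # Lines 3 and 4 (the "+" separator and quality string) are dropped.
  awk 'NR % 4 == 1 { print ">" substr($0, 2) }
       NR % 4 == 2 { print }'
}

# Example (file names are placeholders):
#   fastq_to_fasta raw_reads.fastq.gz > raw_reads.fa
```

The resulting FASTA file can then be given to the scaffolding script in place of the FASTQ reads.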

Thanks, Fernando

alekseyzimin commented 3 years ago

The input files for the MaSuRCA scaffolder must be in FASTA format. Good point, I will add automatic conversion of FASTQ to FASTA in a future version.

gitcruz commented 3 years ago

Hi Aleksey,

Adding it would be nice because it would make the script more flexible.

By the way, any hints on how to solve the problem with find_repeats.pl? It was raised in issue #242, "masurca_scaffold.sh error".

Thanks, Fernando