Open avilella opened 11 months ago
Hi,
We've curated a region of about 2Mb of a mammalian genome but there is still a gap in it, of unknown length, which could be 200kb or longer.
The current input assembly is a single FASTA sequence with a run of Ns where the gap is located.
We have PacBio HiFi reads for the genome, and I am attempting to use gapless to fill in this gap in this locus.
I have 10 fastq.gz files, so I created a named pipe so it looks like there is only 1 fastq file:
mkfifo my_named_pipe
zcat file1.gz file2.gz ... fileN.gz > yourfile.txt < my_named_pipe
cat file1.gz > my_named_pipe
cat file2.gz > my_named_pipe
# Continue with other files...
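For illustration, here is a minimal, self-contained sketch of the named-pipe idea: several gzipped files are decompressed in order into one pipe, and a reader sees them as a single stream. The scratch directory, the demo file names, and the use of `wc -l` as a stand-in for the real consumer are all assumptions made for the example, not the actual setup.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo data: two tiny gzipped FASTQ files in a scratch directory.
workdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/file1.fastq.gz"
printf '@r2\nTTGA\n+\nIIII\n' | gzip -c > "$workdir/file2.fastq.gz"

# Create the named pipe that stands in for a single uncompressed FASTQ file.
fifo="$workdir/input.fastq"
mkfifo "$fifo"

# Writer: decompress all inputs, in order, into the pipe in the background.
# Opening the pipe for writing blocks until a reader attaches.
gzip -dc "$workdir"/file*.fastq.gz > "$fifo" &

# Reader: any tool that reads the path sequentially; wc -l stands in for it here.
n_lines=$(wc -l < "$fifo" | tr -d ' ')
wait  # make sure the background writer has finished

echo "lines read through pipe: $n_lines"

# Clean up the scratch directory, including the pipe.
rm -rf "$workdir"
```

A named pipe delivers its data only once: a program that opens the same path a second time gets nothing new unless the writer is started again, so a regular concatenated file is the safer choice when the input may be read more than once.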
Then I am running gapless like this:
bash -x gapless.sh -j 30 -i Cat_IgH_20230613_AM.fa.gz -t pb_hifi input.fastq
It fails at the extend step:
++ seqtk subseq /data2/assembly_work/input.fastq pass1/gapless_extending_reads.lst
+ minimap2 -t 30 -x asm20 --min-occ-floor=0 -X -m100 -g10000 --max-chain-skip 25 /dev/fd/63 /dev/fd/62
++ seqtk subseq /data2/assembly_work/input.fastq pass1/gapless_extending_reads.lst
+ gapless.py extend -p pass1/gapless pass1/gapless_extending_reads.paf
+ '[' -f pass1/gapless_extending_reads.paf ']'
+ rm -f pass1/gapless_extended_scaffold_paths.csv
+ '[' '!' -f pass1/gapless_extended_scaffold_paths.csv ']'
+ echo 'pipeline crashed: extend'
pipeline crashed: extend
+ exit 1
What parameters could I tune to obtain a decent result? Thanks.