VGP / vgp-assembly

VGP repository for the genome assembly working group

mitovgp trimmer failing #55

Closed Astahlke closed 3 years ago

Astahlke commented 3 years ago

Hello!

I'm looking for help with the trimmer module of the mitovgp pipeline. The first trimmer script is not completing, and as far as I can tell no error is generated. mummer appears to run successfully twice, but the expected ${FNAME}_polish2_10x1_trim1.fasta is not generated at the end, so map10x2 cannot proceed. Any suggestions to resolve this?

Here's the relevant standard output:

BEGIN1: 5
BEGIN2: 18010
END1: 15213

1: PREPARING DATA
2,3: RUNNING mummer AND CREATING CLUSTERS
# reading input file "Pectinophora_gossypiella_2/Pgos/assembly_MT_rockefeller/intermediates/trimmed/Pgos.tig00000001_polish2_10x1.ntref" of length 18007
# construct suffix tree for sequence of length 18007
# (maximum reference length is 536870908)
# (maximum query length is 4294967295)
# CONSTRUCTIONTIME /home/amanda.stahlke/.conda/envs/mitoVGP_pacbio/opt/mummer-3.23/mummer Pectinophora_gossypiella_2/Pgos/assembly_MT_rockefeller/intermediates/trimmed/Pgos.tig00000001_polish2_10x1.ntref 0.00
# reading input file "/90daydata/project/ag100pest/Pgos/RawData/MT_Contig/Pectinophora_gossypiella_2/Pgos/assembly_MT_rockefeller/intermediates/trimmed/Pgos.tig00000001_polish2_10x1_new.fasta" of length 18006
# matching query-file "/90daydata/project/ag100pest/Pgos/RawData/MT_Contig/Pectinophora_gossypiella_2/Pgos/assembly_MT_rockefeller/intermediates/trimmed/Pgos.tig00000001_polish2_10x1_new.fasta"
# against subject-file "Pectinophora_gossypiella_2/Pgos/assembly_MT_rockefeller/intermediates/trimmed/Pgos.tig00000001_polish2_10x1.ntref"
# COMPLETETIME /home/amanda.stahlke/.conda/envs/mitoVGP_pacbio/opt/mummer-3.23/mummer Pectinophora_gossypiella_2/Pgos/assembly_MT_rockefeller/intermediates/trimmed/Pgos.tig00000001_polish2_10x1.ntref 0.01
# SPACE /home/amanda.stahlke/.conda/envs/mitoVGP_pacbio/opt/mummer-3.23/mummer Pectinophora_gossypiella_2/Pgos/assembly_MT_rockefeller/intermediates/trimmed/Pgos.tig00000001_polish2_10x1.ntref 0.03
4: FINISHING DATA

++++ running: map10x2 ++++

Species: -s Pectinophora_gossypiella_2

Species ID: -i Pgos

Contig number: -n tig00000001

Number of threads: -t 30

Working directory: Pectinophora_gossypiella_2/Pgos/assembly_MT_rockefeller/intermediates

--Generate sorted alignment:

Align...
Error: could not open Pectinophora_gossypiella_2/Pgos/assembly_MT_rockefeller/intermediates/trimmed/Pgos.tig00000001_polish2_10x1_trim1.fasta
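
The failure in the log above only surfaces when map10x2 tries to open the missing file. As a minimal sketch (not part of mitoVGP; the function name `fasta_is_usable` is hypothetical), a check like the following could be run between steps to confirm that the trimmed FASTA exists, is non-empty, and begins with a header line:

```python
from pathlib import Path


def fasta_is_usable(path):
    """Return True if `path` is an existing, non-empty file whose first line is a FASTA header.

    Hypothetical helper for illustration only; it does not validate the sequence itself.
    """
    p = Path(path)
    # A missing or zero-byte file means the previous step did not write its output.
    if not p.is_file() or p.stat().st_size == 0:
        return False
    with p.open() as handle:
        first_line = handle.readline()
    # FASTA records start with a '>' header line.
    return first_line.startswith(">")
```

Calling it on the expected `_trim1.fasta` path before launching the next step would turn the silent trimmer failure into an explicit one.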

I've attached a couple of other log files here - not sure what's most helpful. Thanks in advance for any ideas. I've posted this same issue at https://github.com/gf777/mitoVGP/issues/1

Pgos_mtDNApipe_20210624-174115.log Pgos_trimmer_20210624-222455.log R-mtpipe.6056486.err.log R-mtpipe.6056486.out.log

Amanda

gf777 commented 3 years ago

Hi Amanda,

Can you generate a self-alignment of the canu contig selected by mitoVGP (or attach the FASTA from canu)? This is usually very informative about repeats and other structures that may cause problems for mummer. If the problem is just finding the overlap within a nasty repeat, we can increase the processivity of mummer with the -z option (e.g. 5000 could be a good value).
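
For a quick look at what the trimmer is searching for, here is a rough sketch that finds the longest exact overlap between the start and end of a contig, as you would expect when a circular molecule is assembled into a linear sequence. This is not how mitoVGP or mummer work internally: it uses exact string matching only, so it will miss overlaps that contain sequencing errors, and the function name `terminal_overlap` and the `min_len` default are assumptions chosen for illustration.

```python
def terminal_overlap(seq, min_len=20):
    """Return the length of the longest suffix of `seq` that exactly equals a prefix of `seq`.

    Only overlaps of at least `min_len` bases are reported; 0 means none was found.
    Illustrative only: exact matching, no tolerance for mismatches or indels.
    """
    seq = seq.upper()
    n = len(seq)
    # Try the longest candidate overlap first and stop at the first exact match.
    for k in range(n - 1, min_len - 1, -1):
        if seq[:k] == seq[n - k:]:
            return k
    return 0


if __name__ == "__main__":
    # Toy contig: the first 14 bases are repeated at the end.
    toy = "ACGTACGTTTGACC" + "GGGGCATCAT" + "ACGTACGTTTGACC"
    print(terminal_overlap(toy, min_len=5))
```

If this reports 0 on the real contig while a proper self-alignment still shows matching ends, the overlap is probably imperfect or sits inside a repeat, which is the case a more permissive mummer setting is meant to handle.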

gf777 commented 3 years ago

I am closing this since it is a duplicate.