schmeing / gapless

Gapless provides combined scaffolding, gap-closing and assembly correction with long reads
MIT License

pipeline crashed: extend #13

Open sogriffin98 opened 11 months ago

sogriffin98 commented 11 months ago

Hello,

I had assembled my genome using Illumina and PacBio reads and then I wanted to gap fill using Oxford Nanopore.

I was trying to run the following: gapless.sh -j 30 -i HybridSPAdes_PacBioIllumina_contigs.fasta -t nanopore MF1.fastq.gz

In the extend log I got this error message:

0:00:09.482191 Preparing data from files
0:00:12.074014 Searching for extensions
Traceback (most recent call last):
  File "/home/sgriffin/miniconda3/envs/gapless/bin/gapless.py", line 13327, in <module>
    main(sys.argv[1:])
  File "/home/sgriffin/miniconda3/envs/gapless/bin/gapless.py", line 13193, in main
    GaplessExtend(args[0], prefix, min_length_contig_break)
  File "/home/sgriffin/miniconda3/envs/gapless/bin/gapless.py", line 9610, in GaplessExtend
    scaffold_paths, polishing_reads, extension_info, gap_scaffolds = ExtendScaffolds(scaffold_paths, polishing_reads, extensions, hap_merger, new_scaffolds, mappings, min_num_reads, max_mapping_uncertainty, min_scaf_len, ploidy, polishing_coverage)
                                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sgriffin/miniconda3/envs/gapless/bin/gapless.py", line 9515, in ExtendScaffolds
    scaffold_paths = scaffold_paths.append( extending_reads[['scaf','pos','type']+[f'{n}{h}' for h in range(ploidy) for n in ['phase','name','start','end','strand']]+['sdist_left','sdist_right']] )
                     ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sgriffin/miniconda3/envs/gapless/lib/python3.11/site-packages/pandas/core/generic.py", line 6204, in __getattr__
    return object.__getattribute__(self, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?

Not sure if you're able to help?
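
For readers who hit the same traceback: `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so any call to it raises this AttributeError on a recent pandas. The snippet below is a minimal sketch of the usual replacement, `pd.concat`, on toy frames. The two frames are stand-ins that borrow the `scaf` and `pos` column names from the traceback; this is not a patch to gapless.py, and whether other parts of gapless need similar changes is not something the thread establishes.

```python
import pandas as pd

# Toy stand-ins for the two frames named in the traceback (contents are made up).
scaffold_paths = pd.DataFrame({"scaf": [0, 1], "pos": [0, 0]})
extending_reads = pd.DataFrame({"scaf": [2], "pos": [0]})

# pandas < 2.0:  scaffold_paths = scaffold_paths.append(extending_reads)
# pandas >= 2.0: DataFrame.append no longer exists; pd.concat stacks the rows
# in the same way and, like append, keeps each frame's original index labels.
combined = pd.concat([scaffold_paths, extending_reads])

print(combined)
```

Installing a pandas release below 2.0 in the conda environment would be the other obvious route, assuming nothing else in the environment requires a newer pandas.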

YocelynG commented 10 months ago

Hi, it looks like some files were not generated in the previous steps. I fixed it by running the pipeline step by step with some modifications:

gapless.py split -o gapless_split.fa assembly_hifiasm_ctg.fasta
minimap2 -t 30 -DP -k19 -w19 -m200 gapless_split.fa gapless_split.fa > gapless_split_repeats.paf
minimap2 -t 30 -x map-hifi -c -N 5 --secondary=no gapless_split.fa hifi_reads.default.filt.fastq.gz > gapless_reads.paf
gapless.py scaffold -p gapless -s gapless_stats.pdf gapless_split.fa gapless_reads.paf gapless_split_repeats.paf
minimap2 -t 30 -x map-hifi <(seqtk subseq hifi_reads.default.filt.fastq.gz gapless_extending_reads.lst) <(seqtk subseq hifi_reads.default.filt.fastq.gz gapless_extending_reads.lst) > gapless_extending_reads.paf
gapless.py extend -p gapless gapless_extending_reads.paf
seqtk subseq hifi_reads.default.filt.fastq.gz gapless_used_reads.lst > temp_finish.fastq
gapless.py finish -o gapless_raw.fa -H 0 -s gapless_extended_scaffold_paths.csv -p gapless_polishing.csv gapless_split.fa temp_finish.fastq
minimap2 -t 30 -x map-hifi gapless_raw.fa hifi_reads.default.filt.fastq.gz > gapless_consensus.paf
racon -t 30 hifi_reads.default.filt.fastq.gz gapless_consensus.paf gapless_raw.fa > gapless.fa
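
When running the steps by hand like this, it can help to confirm that the intermediate files exist before starting the next command. The following is a rough sketch of such a check. The file names are copied from the commands above, and the assumption that they should all be present before the `finish` step is inferred from how those commands use them. The helper `missing_outputs` and the list name are hypothetical and not part of gapless.

```python
from pathlib import Path

# File names as they appear in the commands above; assumed to exist
# once the extend step has completed successfully.
EXPECTED_BEFORE_FINISH = [
    "gapless_extended_scaffold_paths.csv",
    "gapless_polishing.csv",
    "gapless_used_reads.lst",
]


def missing_outputs(names, directory="."):
    """Return the names that are absent or empty in `directory` (hypothetical helper)."""
    base = Path(directory)
    return [
        name
        for name in names
        if not (base / name).is_file() or (base / name).stat().st_size == 0
    ]


if __name__ == "__main__":
    missing = missing_outputs(EXPECTED_BEFORE_FINISH)
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All expected intermediate files are present.")
```

An empty file is treated as missing here because a step that crashes partway through can leave a zero-byte output behind.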