GATB / gatb-minia-pipeline

GATB Minia assembly pipeline

BESST: error: argument -orientation is required #31

Open Inexperiencedresearcher opened 3 years ago

Inexperiencedresearcher commented 3 years ago

Hello!

I am trying to use GATB-minia-pipeline to assemble a genome. Because my storage is limited, I am running a single k=41 assembly instead of multi-k.

The assembly with minia works without issues. However, when scaffolding with BESST I ran into the following issue:

(2021-08-10 11:41:36) Finished Multi-k GATB-Pipeline at k=41

(2021-08-10 11:41:38) Execution of 'python BESST/scripts/reads_to_ctg_map.py'. Command line: python /home/sp1615/gatb-minia-pipeline/BESST/scripts/reads_to_ctg_map.py --tmp_path /home/sp1615/Desktop/T51_genome_assembly/BESST_tmp --threads 40 T51_S8_R1_001.fastq.gz T51_S8_R2_001.fastq.gz assembly_k41.contigs.fa assembly.lib_0

pe1_path: T51_S8_R1_001.fastq.gz
pe2_path: T51_S8_R2_001.fastq.gz
genome_path: assembly_k41.contigs.fa
output_path: assembly.lib_0
tmp_path: /home/sp1615/Desktop/T51_genome_assembly/BESST_tmp
bwa path: bwa
number of threads: 40
Remove temp SAM and BAM files: No
Use bwa aln and sampe instead of bwa mem: No
Do not rebuild bwa index if already exists in tmp dir: No

Start processing.

Aligning with bwa mem.
Temp directory: /home/sp1615/Desktop/T51_genome_assembly/BESST_tmp
Output path: assembly.lib_0
Stderr file: assembly.lib_0.bwa.1
Make bwa index... Done.
Align with bwa mem... Done.
Time elapsed for bwa index and mem: 5:43:13.571824

Convert SAM to BAM... Done. Time elapsed for SAM to BAM conversion: 2:09:30.058622

Sort BAM... Done. Time elapsed for BAM sorting: 4:17:52.790747

Index BAM... Done. Time elapsed for BAM indexing: 0:14:52.160132

Remove temp files... Done. Time elapsed for temp files removing: 0:00:16.807376

Processing is finished.

(2021-08-11 00:07:30) Execution of 'python BESST/runBESST'. Command line: /home/sp1615/gatb-minia-pipeline/tools/memused python /home/sp1615/gatb-minia-pipeline/BESST/runBESST -c assembly_k41.contigs.fa -f assembly.lib_0.bam -o assembly_besst --orientation fr --iter 10000

usage: BESST [-h] -c CONTIGFILE -f BAMFILES [BAMFILES ...] -orientation ORIENTATION [ORIENTATION ...] [-r READLEN [READLEN ...]] [-m MEAN [MEAN ...]] [-s STDDEV [STDDEV ...]] [-z COVCUTOFF] [-z_min LOWER_COVCUTOFF] [-T THRESHOLD [THRESHOLD ...]] [-e EDGESUPPORT [EDGESUPPORT ...]] [-k MINSIZE [MINSIZE ...]] [-filter_contigs CONTIG_FILTER_LENGTH] [--min_mapq MIN_MAPQ] [--iter PATH_THRESHOLD] [--score_cutoff SCORE_CUTOFF] [--max_extensions MAX_EXTENSIONS] [-a HAPLRATIO] [-b HAPLTHRESHOLD] [-K KMER] [-M MMER] [-g] [-o OUTPUT] [-d] [-y] [-q] [--no_score] [-devel] [-plots] [--separate_repeats] [--NO_ILP] [--FASTER_ILP] [--print_scores] [--dfs_traversal] [--bfs_traversal] [-max_contig_overlap MAX_CONTIG_OVERLAP] [--version]
BESST: error: argument -orientation is required
maximal memory used: 6 MB

(2021-08-11 00:07:31) Execution of 'python BESST/runBESST' failed. Command line: /home/sp1615/gatb-minia-pipeline/tools/memused python /home/sp1615/gatb-minia-pipeline/BESST/runBESST -c assembly_k41.contigs.fa -f assembly.lib_0.bam -o assembly_besst --orientation fr --iter 10000

I noticed in https://github.com/GATB/gatb-minia-pipeline/issues/25 that there seems to be an issue with how the gatb script calls BESST, which can be fixed by changing line 469 from -orientation to --orientation.

However, this is line 469 in the version of the gatb script that I have:

cmd = ['-c', contigs, '-f'] + bam_files + ['-o', prefix + '_besst'] + ['--orientation'] + orientations + ['--iter', besst_iter]

It already has the change from -orientation to --orientation, yet the usage message above shows that my BESST expects -orientation.

Should I change it back to -orientation?

Thank you for your help! S

rchikhi commented 3 years ago

Hi, yes, please try this change. Also, if you have only paired-end reads, you could just skip scaffolding, or pre-merge the pairs if they overlap. The assembly result shouldn't be drastically different.
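For anyone wondering why the extra dash matters: argparse treats -orientation and --orientation as two different option strings. The sketch below is only a stand-in, not BESST's actual parser; build_parser and accepts are hypothetical names, and the parser defines just the one option seen in the usage message. It shows the single-dash spelling being accepted and the double-dash spelling being rejected because the required option is then missing. The exact wording of the error depends on the Python version.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for BESST's parser: it declares only the
    # single-dash option that appears in the usage message above.
    parser = argparse.ArgumentParser(prog="BESST")
    parser.add_argument("-orientation", nargs="+", required=True)
    return parser


def accepts(argv):
    """Return True if the parser accepts argv, False if argparse exits with an error."""
    try:
        build_parser().parse_args(argv)
        return True
    except SystemExit:
        # argparse prints a usage/error message to stderr and exits on bad input.
        return False


print(accepts(["-orientation", "fr"]))   # spelling the parser declares
print(accepts(["--orientation", "fr"]))  # spelling the gatb script passed
```

So the gatb script and the bundled BESST have to agree on the spelling; which one is right depends on the BESST version that is installed.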

rchikhi commented 3 years ago

Your assembly is already in the file assembly_k41.contigs.fa, and you can consider it the final result of the pipeline (without scaffolding).

Inexperiencedresearcher commented 3 years ago

Thank you for your reply. I tried changing it to -orientation and it works.

Also, I tried multi-k assembly + scaffolding (I managed to increase my storage) and it all works, but I have yet to check whether scaffolding with BESST has improved the assemblies at the various k values.
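Since the correct spelling seems to differ between BESST versions (issue #25 needed the opposite change), one way to avoid editing line 469 by hand is to read the spelling from the usage message. This is a rough sketch, not part of the gatb script: orientation_flag is a hypothetical helper, and it assumes you have already captured the usage text yourself (for example from the BESST help output).

```python
import re


def orientation_flag(usage_text):
    """Return the orientation flag spelling found in a BESST usage message."""
    # Match '-orientation' or '--orientation' as a standalone token;
    # the lookbehind stops the match from starting in the middle of '--'.
    match = re.search(r"(?<![\w-])(--?orientation)\b", usage_text)
    if match is None:
        raise ValueError("no orientation flag found in usage text")
    return match.group(1)


# Usage text excerpt copied from the log in the first post.
usage = (
    "usage: BESST [-h] -c CONTIGFILE -f BAMFILES [BAMFILES ...] "
    "-orientation ORIENTATION [ORIENTATION ...]"
)

flag = orientation_flag(usage)

# Same argument layout as line 469 of the gatb script, with the detected flag.
cmd = ["-c", "assembly_k41.contigs.fa", "-f", "assembly.lib_0.bam",
       "-o", "assembly_besst", flag, "fr", "--iter", "10000"]
print(cmd)
```

With the usage text from this thread the helper picks -orientation, which matches the change that worked here.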