MaestSi / MetONTIIME

A Meta-barcoding pipeline for analysing ONT data in QIIME2 framework
GNU General Public License v3.0

import fasta files #98

Closed anilchauhanhp9 closed 5 months ago

anilchauhanhp9 commented 5 months ago

Hi,

Thank you for making a pipeline for nanopore read analysis.

I have some FASTA sequence files derived from nanopore sequencing that I want to analyse with the pipeline. I assembled nanopore amplicon FASTQ reads into FASTA assemblies using the Flye assembler. I would now like to supply those FASTA sequences to MetONTIIME, but I could not find a way to do so. If this is possible, please show me how.

Thank you.

MaestSi commented 5 months ago

Hi, if you have seqtk installed, the following commands should produce fastq.gz files with fake quality scores that you can feed into MetONTIIME.

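FASTA_DIR="/path/to/dir/with/flye/assemblies"
# Convert every .fasta file into a gzipped FASTQ with a constant fake quality ('#')
find "$FASTA_DIR" -type f -name "*.fasta" | while read -r f; do
  sn=$(basename "$f" .fasta)   # sample name = file name without the .fasta extension
  seqtk seq -F '#' "$f" | gzip > "$sn.fastq.gz"
done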

Let me know if this works. Best, SM
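For readers without seqtk, the conversion can be sketched with awk alone. This is only an illustration of the record layout a fake-quality FASTQ is expected to have (header, sequence on a single line, `+`, and a quality string of the same length); the function name `fasta_to_fake_fastq` is hypothetical and is not part of seqtk or MetONTIIME, and it is not guaranteed to match seqtk's output byte for byte.

```shell
# fasta_to_fake_fastq: hypothetical helper (an assumption, not a seqtk/MetONTIIME command).
# Reads FASTA on stdin and writes 4-line FASTQ records on stdout.
# Optional first argument: the quality character to repeat (default '#').
fasta_to_fake_fastq() {
  awk -v q="${1:-#}" '
    # Print the record collected so far, if any.
    function flush_record(   qual, i) {
      if (!have) return
      qual = ""
      for (i = 0; i < length(seq); i++) qual = qual q
      printf "@%s\n%s\n+\n%s\n", name, seq, qual
    }
    # A new FASTA header closes the previous record and starts a new one.
    /^>/ { flush_record(); name = substr($0, 2); seq = ""; have = 1; next }
    # Sequence lines are joined so that each record has exactly one sequence line.
    { gsub(/[ \t\r]/, ""); seq = seq $0 }
    END { flush_record() }
  '
}
```

A possible invocation, assuming an input file named `assembly.fasta`, would be `fasta_to_fake_fastq < assembly.fasta | gzip > assembly.fastq.gz`.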

anilchauhanhp9 commented 5 months ago

[Attachment: fin_libB2_fly_medaka_consensus.fastq.gz]

Hi,

Thank you for the reply.

I ran the command you suggested and it completed successfully. However, the pipeline then failed at the importFastq step with the following error:

ERROR ~ Error executing process > 'importFastq'

Caused by: Process importFastq terminated with an error exit status (1)

Command error: There was a problem importing /home/leek/MetONTIIME/fin_libB2/importFastq/manifest.txt:

/tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-fjt2yagk/fin_libB2_fly_medaka_consensus_0_L001_R1_001.fastq.gz is not a(n) FastqGzFormat file:

Header on line 5 is not FASTQ, records may be misaligned. 

Could you please take a look and help?

MaestSi commented 5 months ago

Hi, please check that the fastq.gz file in <resultsDir>/downsampleFastq is not empty and that it contains all consensus sequences. Because I assigned a fake quality score (#) to every base, you may need to set the minimum sequencing quality (minQual parameter) to 0. I tried importing the fastq.gz file from inside the MetONTIIME Singularity image with the following commands:

# Build a tab-separated QIIME2 manifest: sample-id, absolute-filepath
echo -e "sample-id\tabsolute-filepath" > manifest.txt
fq=./fin_libB2_fly_medaka_consensus.fastq.gz
s=$(basename "$fq" .fastq.gz)   # sample id = file name without .fastq.gz
echo -e "$s\t$fq" >> manifest.txt

# Open a shell inside the MetONTIIME Singularity image
singularity run --bind /mnt metontiime_latest.sif /bin/bash

# Inside the container: import the reads listed in the manifest
/opt/conda/envs/MetONTIIME_env/bin/qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path ./manifest.txt --input-format 'SingleEndFastqManifestPhred33V2' --output-path ./sequences.qza

And got:

Imported ./manifest.txt as SingleEndFastqManifestPhred33V2 to ./sequences.qza

So, there seem to be no issues with the file. SM
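The check suggested above (that the fastq.gz file is not empty and its records are intact) can be sketched as a small shell function. This is a minimal sketch, not part of MetONTIIME or QIIME2: the name `check_fastq_gz` is hypothetical, and it assumes plain 4-line FASTQ records, so a file with wrapped sequence lines would be reported as malformed.

```shell
# check_fastq_gz: hypothetical helper (an assumption, not a MetONTIIME/QIIME2 command).
# Verifies that a gzipped FASTQ is non-empty and made of well-formed 4-line records.
# Prints the record count on success; prints the first problem and returns 1 otherwise.
check_fastq_gz() {
  gzip -cd "$1" | awk '
    NR % 4 == 1 && !/^@/ { printf "line %d: header does not start with @\n", NR; bad = 1; exit }
    NR % 4 == 2 { seqlen = length($0) }
    NR % 4 == 3 && !/^\+/ { printf "line %d: separator does not start with +\n", NR; bad = 1; exit }
    NR % 4 == 0 && length($0) != seqlen { printf "line %d: quality length differs from sequence length\n", NR; bad = 1; exit }
    END {
      if (bad) exit 1
      if (NR == 0) { print "file is empty"; exit 1 }
      if (NR % 4 != 0) { print "truncated record at end of file"; exit 1 }
      printf "%d records OK\n", NR / 4
    }
  '
}
```

For example, `check_fastq_gz fin_libB2_fly_medaka_consensus.fastq.gz` prints the number of records, which can then be compared with the number of headers in the source assembly (`grep -c '^>'` on the FASTA file) to confirm that all consensus sequences are present.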

anilchauhanhp9 commented 5 months ago

The importDB step is working fine

Thank you