Shamir-Lab / SCAPP

SCAPP is a plasmid assembly tool. This tool is described in our paper: https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-021-01068-z
MIT License

pipeline #39

Closed cabraham03 closed 8 months ago

cabraham03 commented 8 months ago

Hi, sorry for the basic question. I'm running my first analysis of shotgun metagenomic data and want to try SCAPP to detect plasmids. I made an assembly with MEGAHIT and then generated the fastg file. How should I generate the BAM file for a metagenomic assembly? Or do you have a pipeline you could share that I can follow?

This is mine. First I trimmed the fastq files and then generated the assembly:

megahit -1 File_R1_trim.fastq -2 File_R2_trim.fastq --min-contig-len 1000 -m 0.8 -t 2 -o megahit

To create the fastg file, I used the last intermediate contigs (k119):

megahit_toolkit contig2fastg 99 intermediate_contigs/k119.contigs.fa > k119.contigs.fastg

My problem is generating the BAM file. I was looking for a pipeline to build the index for bwa. With genomic data this is easy for me, because I just take a reference genome, but with shotgun metagenomic data I'm lost!

Once I have the index, this is my tentative command. Please let me know whether it looks OK:

bwa mem -t 80 bwa_index File_R1_trim.fastq File_R2_trim.fastq | samtools view -bS - > bwa.bam

Any advice is appreciated, thanks.

dpellow commented 8 months ago

If you pass in the reads SCAPP will do the alignment for you. Is that what you are asking about?

cabraham03 commented 8 months ago

I'm just confused. First, I want to know how to generate an appropriate fastg file. Can I use a fastg generated with MEGAHIT or any other program? And would I then run SCAPP with something like:

scapp -g k119.contigs.fastg -o Results_SCAPP -r1 file_R1.fastq -r2 file_R1.fastq

Is that OK? Or how do I generate an appropriate fastg file? Thanks.

dpellow commented 8 months ago

That command looks fine in general, but I would note the following issues.

When you generate the fastg you need to use the correct value of k. Assuming k is 119, since you are using k119.contigs.fa, the contig2fastg command should be:

megahit_toolkit contig2fastg 119 intermediate_contigs/k119.contigs.fa > k119.contigs.fastg

i.e. you need to use 119 as the value of k, not 99.

SCAPP also needs to know the value of the maximum k, so the command would be:

scapp -g k119.contigs.fastg -o Results_SCAPP -r1 file_R1.fastq -r2 file_R2.fastq -k 119

(Also note that in the command you gave above you used file_R1.fastq twice instead of -r2 file_R2.fastq.)
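One way to avoid a mismatch between the k used for the fastg and the k passed to SCAPP is to derive it once from the intermediate contig file name. The following is a minimal sketch, assuming MEGAHIT's `k<N>.contigs.fa` naming as seen in this thread. It only prints the two commands so they can be inspected before running; the read file names are the ones used above.

```shell
#!/bin/sh
# Sketch: take k from the MEGAHIT intermediate contig file name so the same
# value reaches both contig2fastg and SCAPP's -k option.
contigs="intermediate_contigs/k119.contigs.fa"   # path used in this thread

name=${contigs##*/}   # drop the directory      -> k119.contigs.fa
k=${name#k}           # drop the leading "k"    -> 119.contigs.fa
k=${k%%.*}            # drop everything from the first "." -> 119

fastg="k${k}.contigs.fastg"

# Build the commands as strings and print them instead of executing them.
fastg_cmd="megahit_toolkit contig2fastg ${k} ${contigs} > ${fastg}"
scapp_cmd="scapp -g ${fastg} -o Results_SCAPP -r1 file_R1.fastq -r2 file_R2.fastq -k ${k}"

echo "$fastg_cmd"
echo "$scapp_cmd"
```

If the printed commands look right, they can be pasted into the terminal or run with `sh -c`.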

cabraham03 commented 8 months ago

Thanks so much, the R1/R2 mix-up was my mistake, sorry! I will try it as you describe with k = 119. Just one more question: if I generate the BAM file using bwa and samtools, can it be used instead of r1 and r2? Thanks for everything!

dpellow commented 8 months ago

That is good. You will have to regenerate the fastg using 119 and add -k 119 as an option to SCAPP. If you already have a BAM file mapping the reads to the contigs, you can pass it with -b instead of -r1 and -r2, but I suggest using the reads and letting SCAPP generate the BAM file.
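For readers who still want to build the BAM by hand, the point for metagenomic data is that the assembly itself is the reference, so no external genome is needed. The sketch below is an illustration under assumptions, not SCAPP's own alignment procedure. It assumes the reference FASTA carries the same sequence names as the nodes in the fastg (the file `graph_nodes.fasta` is hypothetical), that a coordinate-sorted, indexed BAM is wanted, and the output name `reads_vs_graph.bam` is a placeholder. SCAPP's internal step may filter alignments differently, which is one reason to prefer passing the reads.

```shell
#!/bin/sh
# Sketch: map the trimmed reads back to the assembly and produce a sorted,
# indexed BAM. File names marked "hypothetical" are not from the thread.
ref="graph_nodes.fasta"     # hypothetical: FASTA whose names match the fastg nodes
r1="File_R1_trim.fastq"
r2="File_R2_trim.fastq"
bam="reads_vs_graph.bam"    # hypothetical output name
threads=2

index_cmd="bwa index ${ref}"
map_cmd="bwa mem -t ${threads} ${ref} ${r1} ${r2} | samtools sort -@ ${threads} -o ${bam} -"
bai_cmd="samtools index ${bam}"
scapp_cmd="scapp -g k119.contigs.fastg -o Results_SCAPP -b ${bam} -k 119"

# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes them.
for cmd in "$index_cmd" "$map_cmd" "$bai_cmd" "$scapp_cmd"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        sh -c "$cmd" || exit 1
    fi
done
```

Compared with the command in the original question, this indexes the assembly rather than an external genome, and it sorts and indexes the output instead of writing an unsorted BAM with `samtools view`.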