sjdlabgroup / SAHMI

No paired FASTA for Single-cell k-mer analysis #14

Closed: ZhangDengwei closed this issue 1 year ago

ZhangDengwei commented 1 year ago

Hi there,

Thanks for developing this amazing tool! I tried to apply it to paired-end single-cell sequencing reads, following the tutorial, with the commands below:

Rscript /data3/zhangdw/01.Toolkit/SAHMI/functions/run_kraken.r --sample SRR8088290 --fq1 ../../../02.FASTQ/scRNA_GSE121638/SRR8088290/SRR8088290_S1_L001_R1_001.fastq --fq2 ../../../02.FASTQ/scRNA_GSE121638/SRR8088290/SRR8088290_S1_L001_R2_001.fastq --out_path out. --Kraken2Uniq_path /data/zhangdw/01.Toolkit/anaconda3/envs/py37/bin/kraken2 --kraken_database_path /data3/zhangdw/02.Database/01.Genome/10.Kranken2/03.Default_plus_human/kdb_human/ --kreport2mpa_path /data3/zhangdw/01.Toolkit/SAHMI/functions/kreport2mpa.py --paired T
Rscript /data3/zhangdw/01.Toolkit/SAHMI/functions/extract_microbiome_reads.r --sample_name SRR8088290 --fq out.SRR8088290_1.fq --fq out.SRR8088290_2.fq --kraken_report out.SRR8088290.kraken.report.txt --mpa_report out.SRR8088290.kraken.report.mpa.txt --out_path ./
Rscript /data3/zhangdw/01.Toolkit/SAHMI/functions/extract_microbiome_output.r --sample_name SRR8088290 --output_file out.SRR8088290.kraken.output.txt --kraken_report out.SRR8088290.kraken.report.txt --mpa_report out.SRR8088290.kraken.report.mpa.txt --out_path ./ 

Before running the k-mer analysis, the generated files are:

-rw-rw-r--. 1 dwzhang dwzhang    736720185 Oct 16 20:03 SRR8088290.fa
-rw-rw-r--. 1 dwzhang dwzhang    506639410 Oct 16 20:10 SRR8088290.microbiome.output.txt
-rw-rw-r--. 1 dwzhang dwzhang  39468201318 Oct 16 19:35 out.SRR8088290.kraken.output.txt
-rw-rw-r--. 1 dwzhang dwzhang      1747721 Oct 16 19:35 out.SRR8088290.kraken.report.mpa.txt
-rw-rw-r--. 1 dwzhang dwzhang       618413 Oct 16 19:35 out.SRR8088290.kraken.report.std.txt
-rw-rw-r--. 1 dwzhang dwzhang       695614 Oct 16 19:35 out.SRR8088290.kraken.report.txt
-rw-rw-r--. 1 dwzhang dwzhang 132120472618 Oct 16 19:35 out.SRR8088290_1.fq
-rw-rw-r--. 1 dwzhang dwzhang 132120472618 Oct 16 19:35 out.SRR8088290_2.fq

I wonder why only one file, SRR8088290.fa, was generated from the paired reads. It contains reads from out.SRR8088290_2.fq only, and none from out.SRR8088290_1.fq. However, sckmer.r can take two paired FASTA files as input. Is there anything wrong with my commands or with my understanding of the tutorial?

Any suggestions would be greatly appreciated!

liyongzheng1 commented 10 months ago

I ran into the same problem. How did you solve it? @ZhangDengwei

ZhangDengwei commented 10 months ago

Run extract_microbiome_reads.r separately for read 1 and read 2, giving each run its own sample name:

Rscript extract_microbiome_reads.r --sample_name ${q}_1 --fq out.${q}_1.fq --kraken_report out.${q}.kraken.report.txt --mpa_report out.${q}.kraken.report.mpa.txt --out_path ./
Rscript extract_microbiome_reads.r --sample_name ${q}_2 --fq out.${q}_2.fq --kraken_report out.${q}.kraken.report.txt --mpa_report out.${q}.kraken.report.mpa.txt --out_path ./
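
The two commands above differ only in the mate number, so they can be generated in a loop. The sketch below is a minimal illustration, not part of SAHMI: `print_extract_cmds` and `count_fasta_records` are hypothetical helper names, and the flags are copied from the commands in this thread. It only prints the commands so they can be inspected before running. The check at the end assumes the script writes its output to `<sample_name>.fa`, which is inferred from the `SRR8088290.fa` file in the listing above.

```shell
# Print (do not execute) one extract_microbiome_reads.r command per mate.
# $1 is the sample ID, e.g. SRR8088290 (the ${q} variable used above).
print_extract_cmds() {
  q=$1
  for mate in 1 2; do
    printf 'Rscript extract_microbiome_reads.r --sample_name %s_%s --fq out.%s_%s.fq --kraken_report out.%s.kraken.report.txt --mpa_report out.%s.kraken.report.mpa.txt --out_path ./\n' \
      "$q" "$mate" "$q" "$mate" "$q" "$q"
  done
}

# Count FASTA records (header lines starting with '>') in a file.
count_fasta_records() {
  grep -c '^>' "$1"
}

# Show the two commands for one sample.
print_extract_cmds SRR8088290

# To actually run them, pipe the output to a shell:
#   print_extract_cmds SRR8088290 | sh
# Afterwards, assuming outputs are named <sample_name>.fa, compare mates:
#   count_fasta_records SRR8088290_1.fa
#   count_fasta_records SRR8088290_2.fa
```

Each run passes a single `--fq` value, which avoids the situation in the original post where only the second file ended up in the output.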

liyongzheng1 commented 10 months ago

OK, I had mistakenly thought I needed to pass the --fq parameter twice. Thank you!