PierreBSC / Viral-Track

MIT License

Multiple paired-end files as input #3

Closed by zji90 4 years ago

zji90 commented 4 years ago

Hi,

The example 'Files_to_process.txt' only includes one file:

/home/pbost/Desktop/Viral_Track/Data_COVID/C147_S1_L001_R2_001.fastq.gz

If I have multiple paired-end reads from multiple lanes (which is typical for 10x experiments), such as:

sample_S1_L001_R1_001.fastq.gz
sample_S1_L001_R2_001.fastq.gz
sample_S2_L001_R1_001.fastq.gz
sample_S2_L001_R2_001.fastq.gz

How should I specify Files_to_process.txt? Thank you!

PierreBSC commented 4 years ago

Hi Zhicheng,

To process 10X data, you only have to list the R2 files, because the R1 files contain only the barcode. However, if you wish to perform single-cell demultiplexing, you have to label the reads using the umi_tools whitelist and extract commands. This is described extensively on this webpage: https://github.com/CGATOxford/UMI-tools/blob/master/doc/Single_cell_tutorial.md

So here, Files_to_process.txt should look like:

sample_S1_L001_R2_001.fastq.gz
sample_S2_L001_R2_001.fastq.gz

Hope this helps!
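With many samples, the list can be generated instead of typed by hand. Below is a minimal Python sketch, not part of Viral-Track: the function name `write_files_to_process` is hypothetical, it assumes all FASTQ files sit in one directory and follow the `_R2_` naming shown above, and it writes absolute paths only because the example file uses one.

```python
from pathlib import Path


def write_files_to_process(fastq_dir, out_file="Files_to_process.txt"):
    """Write the path of every R2 FASTQ in fastq_dir to out_file, one per line.

    Hypothetical helper for illustration; Viral-Track does not ship it.
    """
    # Assumption: read-2 files are named like sample_S1_L001_R2_001.fastq.gz,
    # so R1 (barcode) files are skipped by the "_R2_" pattern.
    r2_files = sorted(Path(fastq_dir).glob("*_R2_*.fastq.gz"))

    # Assumption: absolute paths, mirroring the example Files_to_process.txt.
    lines = [str(path.resolve()) for path in r2_files]

    Path(out_file).write_text("".join(line + "\n" for line in lines))
    return lines
```

Calling `write_files_to_process("/path/to/fastq_dir")` would write the list to `Files_to_process.txt` in the current working directory and return the paths it wrote.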

Best

Pierre

zji90 commented 4 years ago

Thank you!