sdparekh / zUMIs

zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
GNU General Public License v3.0

Processing RNA-Seq data with UMIs from SMARTer® Stranded Total RNA-Seq Kit v3 - Pico Input Mammalian #376

Closed seifudd closed 8 months ago

seifudd commented 9 months ago

Hi, I need some help defining the inputs for processing plate-based RNA-Seq data (with UMIs) generated using the SMARTer® Stranded Total RNA-Seq Kit v3 - Pico Input Mammalian from Takara.

Description of sequencing protocol is here.

I have R1 and R2 per sample.

Here is the YAML file that I am using for running zUMIs for 1 sample - BMS073_retry4.zUMIs_config_formated.txt

Here is the barcode file - barcode_BMS073.txt

After running zUMIs, no expression files are generated.

The read headers in the FASTQ files (both R1 and R2) contain the barcode (included in the barcode file above), e.g.:

@lh00172:46:GW23091633rd:5:1101:3145:1064 2:N:0:ACTAAGAT+CCGCGGTT
TTTGATCACCCGGGCACCTAGAGTGCGGAGTTGAGCACCTCAGAAGGGAGGTGTGGCCCTCAGGACATTGTCAGCATGTGCGATGATAGTGGAGGCCATGGATTTGAGAACATTTCAGAGAATATTTGATGGAGGGGAACGCCAAGGGGA
+
FFFFFFFFFFFFFFFFF-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFF5F-FFFFF-
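For reference, the index pair in a record like this sits in the last colon-separated field of the header comment. A minimal Python sketch of pulling it out, assuming the usual Illumina comment layout (`read:filtered:control:index`); the function name `parse_illumina_header` is hypothetical and not part of zUMIs:

```python
def parse_illumina_header(header):
    """Split an Illumina-style FASTQ header into read ID, read number, and index sequences."""
    # The read ID and the comment are separated by the first space.
    read_id, comment = header.lstrip("@").split(" ", 1)
    # Assumed comment layout: <read number>:<is filtered>:<control number>:<index>
    read_number, is_filtered, control, index = comment.split(":", 3)
    # Dual indexes are written as <index1>+<index2>; index2 is empty for single-index runs.
    index1, _, index2 = index.partition("+")
    return {
        "read_id": read_id,
        "read_number": int(read_number),
        "filtered": is_filtered == "Y",
        "index1": index1,
        "index2": index2,
    }


record = parse_illumina_header(
    "@lh00172:46:GW23091633rd:5:1101:3145:1064 2:N:0:ACTAAGAT+CCGCGGTT"
)
print(record["index1"], record["index2"])  # prints: ACTAAGAT CCGCGGTT
```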

What am I missing? Any help will be appreciated.

Thanks, Fayaz

cziegenhain commented 9 months ago

Hi Fayaz,

zUMIs does not support barcodes stored in read headers, nor does it support running on already demultiplexed data. A BC field in the config file is mandatory. If you don't have access to the data prior to demultiplexing, I suggest checking here: https://github.com/sdparekh/zUMIs/wiki/Starting-from-demultiplexed-fastq-files
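As a rough illustration of what supplying the barcode as sequence could look like, the sketch below writes a synthetic barcode read for each record of a demultiplexed FASTQ file, using the index found in the header. This is an assumption-laden sketch, not necessarily the approach the wiki page describes: the helper names `open_fastq` and `write_barcode_reads` are hypothetical, headers are assumed to carry an Illumina-style comment ending in the index, and the quality string is a placeholder (`F` for every base).

```python
import gzip


def open_fastq(path, mode):
    """Open a FASTQ file as text, transparently handling .gz files."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def write_barcode_reads(fastq_in, fastq_out):
    """Write one synthetic barcode read per input record and return the record count.

    The barcode is taken from the last colon-separated field of the header
    comment, with the '+' between the two indexes removed.
    """
    n_records = 0
    with open_fastq(fastq_in, "r") as src, open_fastq(fastq_out, "w") as dst:
        while True:
            header = src.readline().rstrip("\n")
            if not header:
                break  # end of file
            # Skip the sequence, separator, and quality lines of this record.
            src.readline()
            src.readline()
            src.readline()
            # Raises ValueError if the header has no comment (assumption violated).
            name, comment = header.split(" ", 1)
            barcode = comment.rsplit(":", 1)[-1].replace("+", "")
            # Placeholder qualities: the original index qualities are not in the header.
            dst.write(f"{name}\n{barcode}\n+\n{'F' * len(barcode)}\n")
            n_records += 1
    return n_records
```

Calling `write_barcode_reads("sample_R1.fastq.gz", "sample_BC.fastq.gz")` (file names hypothetical) would produce a file with reads in the same order as the input. With 8 + 8 index bases as in the header above, such a file would presumably be listed as an additional read file with a base definition like `BC(1-16)`, but the exact config should be checked against the zUMIs documentation.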

seifudd commented 8 months ago

Thank you. This makes sense. Appreciate your response and help with this.