sdparekh / zUMIs

zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
GNU General Public License v3.0

Sandberg lab zUMIs YAML included find_pattern for UMI? #280

Closed roxyisat-rex closed 3 years ago

roxyisat-rex commented 3 years ago

Hi, I was using zUMIs as part of the Sandberg lab protocol for analysing Smart-seq3 (SS3) data. However, I realised that their directions for setting up the .yaml file differ quite a bit from yours. Your wiki page on setting up the YAML makes no mention of needing a find_pattern for an 11 bp tag sequence, while their main README says the YAML should be set up as below, with find_pattern filled in with an 11 bp tag sequence. I am a little confused as to which to follow. Are there any resources that explain this in more detail? Thanks!

file1:
    name: /mnt/storage2/temp_workdir/Undetermined_S0_L003_R1_001.fastq.gz
    base_definition:
      - cDNA(23-150)
      - UMI(12-19)
    find_pattern: ATTGCGCAATG

cziegenhain commented 3 years ago

Hi,

zUMIs is a general pipeline and Smart-seq3 is of course just one specific use case. You need to use the find_pattern to process Smart-seq3 data. There is some description for various scRNA-seq protocol types in our zUMIs wiki: https://github.com/sdparekh/zUMIs/wiki/Protocol-specific-setup
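
To make the coordinates in the YAML above concrete, here is a minimal Python sketch of how a single read could be split using them. It is only an illustration, not zUMIs code: the function names are hypothetical, it assumes the tag occupies bases 1-11 of the read (as the UMI(12-19) range implies), it uses exact matching only, and treating untagged reads as plain cDNA without a UMI is an assumption made for the example.

```python
# Illustrative sketch only; this is NOT how zUMIs is implemented.
# All names below are hypothetical and chosen for this example.

TAG = "ATTGCGCAATG"      # find_pattern from the YAML (11 bp)
UMI_RANGE = (12, 19)     # UMI(12-19), 1-based inclusive
CDNA_RANGE = (23, 150)   # cDNA(23-150), 1-based inclusive


def slice_1based(seq: str, start: int, end: int) -> str:
    """Return bases start..end (1-based, inclusive), truncated at the read end."""
    return seq[start - 1:end]


def split_read(seq: str) -> dict:
    """Check for the tag at the read start and extract UMI and cDNA segments."""
    if seq.startswith(TAG):  # exact match only (assumption)
        return {
            "has_tag": True,
            "umi": slice_1based(seq, *UMI_RANGE),
            "cdna": slice_1based(seq, *CDNA_RANGE),
        }
    # Assumption for this sketch: an untagged read carries no UMI
    # and is treated as cDNA over its full length.
    return {"has_tag": False, "umi": "", "cdna": seq}


if __name__ == "__main__":
    # Toy read: tag + 8 bp UMI + 3 bp spacer + cDNA
    read = TAG + "ACGTACGT" + "GGG" + "TTTTCCCCAAAA"
    print(split_read(read))
```

Note that bases 20-22 fall between the two ranges, so they are assigned to neither the UMI nor the cDNA segment.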

roxyisat-rex commented 3 years ago

Thank you! I didn't see that previously! Much appreciated!