JiekaiLab / scTE

MIT License

STAR solo include the read 'CR:Z' or 'UR:Z' tags #12

Closed JihedC closed 3 years ago

JihedC commented 3 years ago

Hi, thank you for the nice pipeline. I liked your article a lot!

Could you share an example of a STARsolo mapping of scATAC-seq data that produces the 'CR:Z' or 'UR:Z' tags in the BAM file?

I have tried the following on the 10k PBMC scATAC-seq data (10x example dataset) with STAR 2.7.8a:

STAR  --genomeDir $genomedir \
      --readFilesIn atac_pbmc_10k_v1_S1_L001_R3_001.fastq.gz,atac_pbmc_10k_v1_S1_L002_R3_001.fastq.gz \
      atac_pbmc_10k_v1_S1_L001_R1_001.fastq.gz,/atac_pbmc_10k_v1_S1_L002_R1_001.fastq.gz \
      --runRNGseed 42 --runThreadN 12 --readFilesCommand zcat \
      --outFilterMultimapNmax 100 --winAnchorMultimapNmax 100 --outSAMmultNmax 1 \
      --outSAMtype BAM SortedByCoordinate --twopassMode Basic --outWigType wiggle --outWigNorm RPM \
      --soloType CB_UMI_Simple \
      --soloCBwhitelist 737K-august-2016.txt \
      --soloBarcodeReadLength 0

This is what I could work out from the STARsolo documentation, but it must be wrong, because the reads in the BAM file carry no 'CR:Z' or 'UR:Z' tags:

samtools view Aligned.sortedByCoord.out.bam | head -1
A00519:269:H7FM2DRXX:2:2137:17978:8860  0   chr1    3004633 255 1S48M   *   0   0   GCCTAGAATATTATGCCCAACAAAACTATCTTTCAGAAATGAAGGAGAA   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFF   NH:i:1  HI:i:1  AS:i:33 nM:i:7

Thanks in advance!

Jihed

jphe commented 3 years ago

For scATAC-seq data, scTE does not support STARsolo output. You need to put the barcode into the read name and then map the reads with bowtie2.

The barcode was inserted into the read name so that the mapping could keep track of the cell ID. This yielded read names in the FASTQ such as the following, where CCACGTTGTGGACTGA is the cell barcode:
@CCACGTTGTGGACTGA:A00519:269:H7FM2DRXX:1:1101:1325:1000 1: N:0:AAGCATAA

JihedC commented 3 years ago

Okay thanks for the quick reply! Sorry about the confusion!

How do you add the barcode to the read name?

jphe commented 3 years ago

You need a custom script; I don't have the code on my PC.
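
For readers looking for a starting point, here is a minimal sketch of what such a script could look like in Python. It is not the scTE authors' code. It assumes that the barcode FASTQ lists reads in exactly the same order as the genomic read FASTQ, and that the whole barcode read is the cell barcode. For 10x scATAC-seq data the barcode read is typically the R2 file, but check this against your own data. The function names (`read_fastq`, `add_barcode`, `tag_reads`) and the file names in the usage comment are hypothetical.

```python
import gzip
import sys


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a FASTQ text handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def add_barcode(header, barcode):
    """Prefix the read name with the barcode: '@NAME ...' -> '@BARCODE:NAME ...'."""
    return "@" + barcode + ":" + header[1:]


def tag_reads(read_handle, barcode_handle, out_handle):
    """Write each read with its barcode added to the name; return the read count.

    Assumption: both inputs contain the same reads in the same order.
    """
    count = 0
    pairs = zip(read_fastq(read_handle), read_fastq(barcode_handle))
    for (header, seq, qual), (_, barcode, _) in pairs:
        out_handle.write(f"{add_barcode(header, barcode)}\n{seq}\n+\n{qual}\n")
        count += 1
    return count


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage:
    #   python add_barcode.py reads_R1.fastq.gz barcodes_R2.fastq.gz tagged_R1.fastq.gz
    with gzip.open(sys.argv[1], "rt") as reads, \
            gzip.open(sys.argv[2], "rt") as barcodes, \
            gzip.open(sys.argv[3], "wt") as out:
        tag_reads(reads, barcodes, out)
```

For paired-end data, the script would be run once per mate file against the same barcode file, so that both mates get the same prefix. The tagged FASTQ files can then be mapped with bowtie2 as described above.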