shizhuoxing / BGI-Full-Length-RNA-Analysis-Pipeline

Full-Length RNA Analysis pipeline developed by BGI RD group.
https://github.com/shizhuoxing/BGI-Full-Length-RNA-Analysis-Pipeline

HOW to do quantitative analysis #2

Closed hmyh1202 closed 4 years ago

hmyh1202 commented 4 years ago

Hi, as in your design you have added UMI tags to your library, so how can quantitative analysis be done at the isoform level? Best!

shizhuoxing commented 4 years ago

Hi, isoform-level expression quantification is still under development. The main process I can suggest here is:

  1. Map all isoforms to the genome; minimap2 is a popular choice for this now.
  2. Collapse the isoforms by genomic locus, for example with cDNA_cupcake.
  3. Based on the collapse result, write your own script to count UMIs within each group (each unique isoform).
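Step 3 above can be sketched roughly as follows. This is only a minimal illustration, not the pipeline's own script. It assumes the collapse step yields a group file with one line per collapsed isoform in the form `group_id<TAB>read1,read2,...`, and that a read-to-UMI lookup (`read_to_umi`, a hypothetical name) has already been built from your own data. The function names and the example IDs are made up for illustration.

```python
def parse_groups(lines):
    """Parse 'group_id<TAB>read1,read2,...' lines into {group_id: [read ids]}.

    The line layout is an assumption about the collapse output.
    """
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split("\t")
        groups[group_id] = members.split(",")
    return groups


def count_umis_per_group(groups, read_to_umi):
    """Count distinct UMIs inside each collapsed isoform group.

    UMIs are deduplicated only within a group, so the same UMI seen in
    two different isoforms is counted once for each isoform.
    """
    counts = {}
    for group_id, reads in groups.items():
        umis = {read_to_umi[r] for r in reads if r in read_to_umi}
        counts[group_id] = len(umis)
    return counts


# Toy input (hypothetical IDs and UMIs).
group_lines = [
    "PB.1.1\tread_a,read_b,read_c",
    "PB.1.2\tread_d",
]
read_to_umi = {
    "read_a": "AACGT",
    "read_b": "AACGT",  # same UMI as read_a, same isoform: counted once
    "read_c": "GGTCA",
    "read_d": "AACGT",  # same UMI, different isoform: counted separately
}

groups = parse_groups(group_lines)
isoform_counts = count_umis_per_group(groups, read_to_umi)
print(isoform_counts)
```

With real data you would pass an open file handle to `parse_groups` instead of the toy list.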

more discussion is welcome.

shizhuoxing commented 4 years ago

Generally, you can't remove reads that share the same UMI after the CCS process, because different isoforms may have the same UMI. As I posted above in step 3, the UMI count value only represents isoform-level expression. If you want the gene-level expression value, you can add an additional Step 4: 1... 2... 3...

  1. Compare each group GFF to the reference GFF using gffcompare, then parse the *.tmap file and sum the isoform expression values from Step 3 for each gene.
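
The gene-level summation described here might look like the sketch below. It is only an illustration under assumptions: it assumes the *.tmap file is tab-separated with a header row containing `ref_gene_id` and `qry_id` columns, that transcripts without a reference match carry `-` as the gene ID, and that the per-isoform UMI counts from Step 3 are available as a dictionary (`isoform_counts`, a hypothetical name). The IDs in the toy data are made up.

```python
import csv
import io
from collections import defaultdict


def gene_level_counts(tmap_handle, isoform_counts):
    """Sum per-isoform UMI counts into per-gene counts using a tmap table."""
    gene_counts = defaultdict(int)
    reader = csv.DictReader(tmap_handle, delimiter="\t")
    for row in reader:
        gene = row["ref_gene_id"]
        if gene == "-":
            # Assumed marker for isoforms with no reference gene; skipped here.
            continue
        gene_counts[gene] += isoform_counts.get(row["qry_id"], 0)
    return dict(gene_counts)


# Toy tmap content with only the columns this sketch needs (hypothetical IDs).
tmap_text = (
    "ref_gene_id\tref_id\tclass_code\tqry_gene_id\tqry_id\n"
    "GENE1\tTX1\t=\tPB.1\tPB.1.1\n"
    "GENE1\tTX2\tj\tPB.1\tPB.1.2\n"
    "-\t-\tu\tPB.2\tPB.2.1\n"
)
isoform_counts = {"PB.1.1": 2, "PB.1.2": 1, "PB.2.1": 5}

gene_counts = gene_level_counts(io.StringIO(tmap_text), isoform_counts)
print(gene_counts)
```

Isoforms with no reference gene are dropped in this sketch; you may prefer to report them separately as novel loci.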

shizhuoxing commented 4 years ago

In Step 3 I wrote "...count result by UMI in each group..."