gpertea / stringtie

Transcript assembly and quantification for RNA-Seq
MIT License

FPKM filter in stringtie --merge #365

Open pasviber opened 2 years ago

pasviber commented 2 years ago

Hi,

I don't understand the FPKM filter that stringtie --merge (version 2) uses to remove low-abundance transcripts. I assembled each of my 28 samples individually, using the known transcripts as reference, and merged the assemblies with -F 0.3 and -T 0. The merged assembly contains 73668 transcripts. I then ran the assembly for each sample again, this time using the merged assembly GTF file as reference, and built a matrix from the results, but the matrix only has information about 73546 transcripts. So the merged assembly has more transcripts than I get by collecting transcript IDs from the individual assembly GTF files. Moreover, if I filter the matrix from the individual assemblies by FPKM >= 0.3 in at least one sample, I keep only 60612 transcripts.
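
To make the comparison above concrete, here is a minimal Python sketch of how the three counts could be reproduced from the GTF files. It assumes that the per-sample GTFs carry an `FPKM` attribute on their `transcript` rows and that the merged GTF carries `transcript_id`. The function names (`transcript_fpkms`, `compare`) and the tiny inline GTF records are hypothetical and only for illustration; this is not how stringtie --merge itself applies the -F filter.

```python
import re

# Matches key "value"; pairs in the GTF attribute column,
# e.g. transcript_id "MSTRG.1.1"; FPKM "0.42";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def transcript_fpkms(gtf_text):
    """Return {transcript_id: FPKM or None} for the 'transcript' rows of a GTF."""
    result = {}
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript-level rows are of interest here
        attrs = dict(ATTR_RE.findall(fields[8]))
        tid = attrs.get("transcript_id")
        if tid is None:
            continue
        fpkm = attrs.get("FPKM")  # absent in a merged GTF (assumption)
        result[tid] = float(fpkm) if fpkm is not None else None
    return result


def compare(merged_gtf, sample_gtfs, threshold=0.3):
    """Split merged transcript IDs by how they appear in the per-sample GTFs."""
    merged_ids = set(transcript_fpkms(merged_gtf))
    per_sample = [transcript_fpkms(text) for text in sample_gtfs]
    seen = set().union(*[set(s) for s in per_sample])
    passing = {
        tid
        for s in per_sample
        for tid, fpkm in s.items()
        if fpkm is not None and fpkm >= threshold
    }
    return {
        # in the merged GTF but in no per-sample GTF
        "missing_from_samples": merged_ids - seen,
        # present in some sample, but never reaching the threshold
        "below_threshold": (merged_ids & seen) - passing,
        # FPKM >= threshold in at least one sample
        "passing": merged_ids & passing,
    }


def _row(tid, fpkm=None):
    """Build one made-up transcript row (illustrative data only)."""
    attrs = f'gene_id "G1"; transcript_id "{tid}";'
    if fpkm is not None:
        attrs += f' FPKM "{fpkm}";'
    return "\t".join(["chr1", "StringTie", "transcript", "1", "100",
                      "1000", "+", ".", attrs])


MERGED = "\n".join([_row("T1"), _row("T2"), _row("T3")])
SAMPLES = [
    "\n".join([_row("T1", 1.0), _row("T2", 0.1)]),
    "\n".join([_row("T1", 0.0), _row("T2", 0.2)]),
]

report = compare(MERGED, SAMPLES)
for name, ids in report.items():
    print(name, sorted(ids))
```

With real data, the sizes of the three sets would correspond to the differences described above: transcripts in the merged GTF that no per-sample GTF reports, transcripts that are reported but never reach FPKM 0.3, and transcripts that pass the filter in at least one sample.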

Maybe I didn't understand the stringtie --merge algorithm. Could someone explain it to me?

Thanks