kvittingseerup / IsoformSwitchAnalyzeR

An R package to Identify, Annotate and Visualize Isoform Switches with Functional Consequences (from RNA-seq data)

Consequences missing from `extractConsequenceSummary` #78

Closed by lvclark 3 years ago

lvclark commented 4 years ago

This is not urgent, but something I noticed that you might want to be aware of. I ran extractConsequenceSummary with returnResult = TRUE and it reported zero genes or isoforms for "Intron structure" and "ORF genomic". However, this contradicts what I see in aSwitchList$switchConsequence[which(aSwitchList$switchConsequence$isoformsDifferent),].

Moreover, if I run

constab <- aSwitchList.all$switchConsequence[which(aSwitchList.all$switchConsequence$isoformsDifferent),]
cons_genes2 <- t(tapply(constab$gene_id, list(constab$condition_1, constab$featureCompared), function(x) length(unique(x))))

I get slightly different numbers from

cons_summ <- extractConsequenceSummary(aSwitchList.all, plot = FALSE, returnResult = TRUE)
cons_genes <- t(tapply(cons_summ$nrGenesWithConsequences, list(cons_summ$Comparison, cons_summ$featureCompared), sum))

(condition_1 is unique to each comparison in my case.)

Maybe some genes have changes in both directions?

kvittingseerup commented 3 years ago

I'm really sorry it has taken me so long to get back to you. For some reason, I missed this issue. Sorry about that.

You see a difference because a gene can contain multiple isoform switches, which can give rise to different consequences. So the reason you get a higher number of genes when you sum up the extractConsequenceSummary() data is simply that you double-count a few genes which are present in multiple categories.
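
The double-counting can be reproduced on a toy table. This is only a sketch: the column names gene_id, condition_1, featureCompared and isoformsDifferent are taken from the code earlier in this thread, while the switchConsequence column, its values, and the gene names are made up for illustration, and the per-category tally is an assumption about how the summary is organised rather than the package's actual implementation.

```r
# Toy stand-in for aSwitchList$switchConsequence (all values are hypothetical).
# geneA has two switches with opposite consequences for the same feature.
constab <- data.frame(
  gene_id           = c("geneA", "geneA", "geneB", "geneC"),
  condition_1       = "ctrl",
  featureCompared   = "intron_retention",
  switchConsequence = c("gain", "loss", "gain", "loss"),
  isoformsDifferent = TRUE,
  stringsAsFactors  = FALSE
)
constab <- constab[which(constab$isoformsDifferent), ]

# Direct count: unique genes per compared feature.
direct <- tapply(constab$gene_id, constab$featureCompared,
                 function(x) length(unique(x)))

# Summary-style count: unique genes per (feature, consequence category),
# then summed over the categories, as in the cons_genes calculation above.
per_category <- tapply(constab$gene_id,
                       list(constab$featureCompared, constab$switchConsequence),
                       function(x) length(unique(x)))
summed <- rowSums(per_category, na.rm = TRUE)

print(direct["intron_retention"])  # 3 genes: geneA, geneB, geneC
print(summed["intron_retention"])  # 4, because geneA is counted under both categories
```

In this toy case the summed figure exceeds the direct one by exactly the number of genes that appear in more than one consequence category.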

Thanks for pointing it out.