Closed AGordonRobertson closed 2 years ago
Hi @AGordonRobertson,
thanks for reporting this -- it seems nobody has ever tried the totalcells option in immunedeconv before.
It certainly isn't implemented correctly in immunedeconv. This functionality also requires an additional input file, as shown in the original quanTIseq documentation.
If you really need that feature, I have to refer you to the original quanTIseq pipeline for now.
@federicomarini is working on an improved R version of quanTIseq. @federicomarini: will it support the totalcells feature?
Hi @AGordonRobertson,
this is definitely a bug that @federicomarini and I are fixing in the new quanTIseq code. We will post an update soon.
Sorry for the inconvenience, Francesca
Thanks. Practically, it's extremely helpful to have different methods available within 'immunedeconv'.
'btotalcells' is described in help docs for deconvolute_quantiseq_default - "btotalcells = compute cell densities instead of fractions Default: FALSE"
I'm uncertain whether I need 'btotalcells'. I may not need it. But I'm uncertain about the 'units' of the results that different methods give -- so thought I'd try 'btotalcells = TRUE', and would compare its outputs across a set of expression subtypes, where I have purity values across the subtypes from a different team. I've made this comparison for other deconvolution methods within immunedeconv.
If it seems practical to you to fix 'btotalcells', please fix it. A less costly alternative would be to NOT fix it... I do NOT know that I need it.
The current implementation here in immunedeconv looks for a file that is not there - a file that would also contain information about the original images from which you could compute the cell densities.
@FFinotello and I can chip in here once we have that running for quantiseqr 😉
Hello @AGordonRobertson and apologies for my late reply.
We are working on an improved version of the quanTIseq R code that should be ready soon and will, of course, also fix this bug.
Regarding your doubts about the "totalcells" argument, it was designed to pass to quanTIseq information on total cell densities estimated from images of tumor tissue-slides (e.g. H&E). This information is needed only if you want to scale the cell fractions (i.e. referred to total cells in a sample) to cell densities (i.e. cell counts per area as in pathology images), as explained in Fig. 1a from the original quanTIseq paper.
If you want to quantify the cellular composition of a sample using different methods applied only to transcriptomics data, you do not need (and do not have) cell densities. So, I would suggest keeping the default settings, regardless of the current bug. This way, you will get cell fractions that you can compare with the cell fractions or scores from the other methods.
Nevertheless, please remember that the outputs from the various methods can be by definition very different, and only EPIC and quanTIseq provide cell fractions referred to the overall cellular content of a sample (see also Table 1 from immunedeconv paper).
I hope this helps and please feel free to reach out again if you have any doubts.
Best, Francesca
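To make the fraction/density distinction above concrete, here is a minimal sketch in R. It assumes a hypothetical matrix of cell fractions (cell types in rows, samples in columns) and a hypothetical per-sample vector of total cell densities estimated from images, in cells per mm2. The helper name `fractions_to_densities` and all numbers are made up for illustration; this is not the quanTIseq implementation.

```r
# Hypothetical helper (not part of quanTIseq or immunedeconv):
# scale cell fractions to cell densities.
# fractions: matrix, cell types in rows, samples in columns.
# total_cells_per_mm2: named numeric vector, one image-derived total per sample.
fractions_to_densities <- function(fractions, total_cells_per_mm2) {
  stopifnot(is.matrix(fractions),
            !is.null(colnames(fractions)),
            all(colnames(fractions) %in% names(total_cells_per_mm2)))
  # Align the densities to the sample order of the fraction matrix.
  totals <- total_cells_per_mm2[colnames(fractions)]
  # Multiply every column (sample) by its total cell density.
  sweep(fractions, 2, totals, "*")
}

# Made-up example data, for illustration only.
fractions <- matrix(
  c(0.10, 0.05, 0.85,
    0.20, 0.10, 0.70),
  nrow = 3,
  dimnames = list(c("T.cells.CD8", "B.cells", "Other"),
                  c("sample1", "sample2"))
)
total_cells_per_mm2 <- c(sample2 = 5000, sample1 = 2000)

densities <- fractions_to_densities(fractions, total_cells_per_mm2)
densities
```

Without an image-derived total per sample there is nothing to multiply by, which is why transcriptomics-only analyses stay with fractions.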
Thank you. It's very helpful to have your comments on the outputs of different methods. I'll use the current version and fractions.
@federicomarini, is quantiseqR ready and should we switch to it in immunedeconv?
I think so, quantiseqr is pretty much ready for prime time.
- it's on Bioconductor
- has already a couple of speedup optimizations
- can handle matrix, eSets and also SummarizedExperiment objects
Especially this third point is something that can be worth having immunedeconv-wise, and so for the next projects. I think there's a nice advantage in handling the metadata in integrated containers that even take care of the matches, subsettings, and so on.
@FFinotello - (y)our call 😉
@federicomarini, ExpressionSets are already supported: https://github.com/icbi-lab/immunedeconv/blob/c70539f2b08901687561dca755337fc6a5130440/R/immune_deconvolution_methods.R#L331-L333
I've never worked with SummarizedExperiments, but I guess it should be trivial to support them.
Hello!
I agree that we are pretty ready.
Maybe an idea would be to systematically analyze our set of validation datasets with quantiseqR and immunedeconv-quanTIseq and check if we find any inconsistencies. Would that be easily doable for you, Federico?
Cheers, Francesca
Thank you for keeping me informed!
Gordon
All,
Quantiseqr’s v1.2.0 vignette says “you should use TPMs”.
Gordon
I cannot see the question in your message, @AGordonRobertson - it looks to me like you are just reporting the part from the Q&A in the vignette?
Federico,
You’re right, I simply reported the “TPM” item from the FAQ in the quantiseqr vignette.
Questions might be, if quantiseqr will be included in immunedeconv:
- Will this guidance be highlighted so that a user cannot easily miss it? If so, how will it be highlighted (e.g. would the vignette show how a user could transform FPKMs to TPMs)?
- Which of the other deconvolution methods in immunedeconv should have TPMs as input, rather than some other RNA-seq unit (e.g. FPKM)?
Gordon
immunedeconv - I am "only" a contributor of the quanTIseq porting 😉
I agree this should be documented somewhere - in the end, we want the users to use the package properly! Wild guess: if you use data normalized in some other way, I don't think you'd be too far away from the expected results, but you'd be using something in a sub-optimal way. And in full honesty, if one can, one should avoid it.
+1
This is documented here (https://icbi-lab.github.io/immunedeconv/articles/immunedeconv.html#input-data) but it keeps coming up. Where do you think we should add it to make it more visible? We were thinking about an FAQ section at some point: https://github.com/icbi-lab/immunedeconv/issues/61
I’m on a telecon now, will respond when free. Gordon
I could argue it is well visible already, but maybe we could
Details
section for that
I mean, in the end it is up to the user to read the fantastic manual 😬
The 'Getting started with immunedeconv’ documentation is good!
The DOI seems correct, and the publication web page says that the details should be: Cancer Immunology, Immunotherapy (https://link.springer.com/journal/262), volume 67, pages 1031–1040.
It might be helpful to tell a reader how to transform RSEM values, FPKMs, or read counts into TPMs. What I’m hoping for is something very simple and clear. E.g. this gives substantial (too much!?) detail. https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/
It says: "If you have FPKM, you can easily compute TPM: ..."
Gordon
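As a minimal sketch of the FPKM-to-TPM conversion asked about above: within each sample, TPM is the FPKM value divided by the sum of all FPKM values in that sample, times one million. The function name `fpkm_to_tpm` and the toy matrix are hypothetical and not part of immunedeconv or quantiseqr. Raw read counts cannot be converted this way without gene or transcript lengths.

```r
# Hypothetical helper: convert a genes x samples FPKM matrix to TPM.
# Per sample (column): TPM_g = FPKM_g / sum(FPKM) * 1e6.
fpkm_to_tpm <- function(fpkm) {
  stopifnot(is.matrix(fpkm), is.numeric(fpkm), all(fpkm >= 0, na.rm = TRUE))
  # Divide each column by its own total, then rescale to one million.
  sweep(fpkm, 2, colSums(fpkm, na.rm = TRUE), "/") * 1e6
}

# Made-up FPKM values, for illustration only.
fpkm <- matrix(
  c(10, 30, 60,
     5,  5, 40),
  nrow = 3,
  dimnames = list(c("GENE_A", "GENE_B", "GENE_C"), c("s1", "s2"))
)

tpm <- fpkm_to_tpm(fpkm)
colSums(tpm)  # every sample sums to one million after the conversion
```

The conversion only rescales each sample, so the ranking of genes within a sample is unchanged.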
At the bottom of the article at: https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/
"I’ve included some R code below for computing effective counts, TPM, and FPKM…. "
    countToTpm <- function(counts, effLen) {
      # Work in log space: per-transcript read rate, rescaled to sum to 1e6.
      rate <- log(counts) - log(effLen)
      denom <- log(sum(exp(rate)))
      exp(rate - denom + log(1e6))
    }

    countToFpkm <- function(counts, effLen) {
      N <- sum(counts)
      exp(log(counts) + log(1e9) - log(effLen) - log(N))
    }

    fpkmToTpm <- function(fpkm) {
      exp(log(fpkm) - log(sum(fpkm)) + log(1e6))
    }

    countToEffCounts <- function(counts, len, effLen) {
      counts * (len / effLen)
    }

    ################################################################################
    # An example
    ################################################################################
    cnts <- c(4250, 3300, 200, 1750, 50, 0)
    lens <- c(900, 1020, 2000, 770, 3000, 1777)
    countDf <- data.frame(count = cnts, length = lens)
    countDf

    # assume a mean(FLD) = 203.7
    countDf$effLength <- countDf$length - 203.7 + 1
    countDf$tpm <- with(countDf, countToTpm(count, effLength))
    countDf$fpkm <- with(countDf, countToFpkm(count, effLength))
    # TPM computed from counts should match TPM computed from FPKM.
    with(countDf, all.equal(tpm, fpkmToTpm(fpkm)))
    countDf$effCounts <- with(countDf, countToEffCounts(count, length, effLength))
    countDf
    plot(countDf$fpkm, countDf$tpm)
I do not understand what this is -
countDf$effLength <- countDf$length - 203.7 + 1
Gordon
I am guessing, but the FLD is the fragment length distribution? And that would be something you'd subtract from the length to "make it the effective length". Apart from that: it can be that in some samples the length of the transcript might not really be the same. If using TPM quantifications from salmon or kallisto (as recommended, basically through the original concept of quanTIseq as a whole pipeline), then this aspect is already taken care of.
Federico,
FLD is very likely 'fragment length distribution'.
From what I recall, a mean fragment length of ~200 bp was considered appropriate for Illumina sequencers.
I don’t know why we’d subtract this from the length, then add one, to get an 'effective length'. Possibly this corrects for read coverage decreasing near the 5’ and 3’ ends of a gene.
countDf$effLength <- countDf$length - 203.7 + 1
Gordon
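One way to read the "length - 203.7 + 1" line: a fragment of mean length m can start at l - m + 1 distinct positions in a transcript of length l, and that count is the approximate effective length. The sketch below only illustrates that arithmetic. The function name `effective_length` is hypothetical, and returning NA for transcripts shorter than the mean fragment is an assumption of this sketch. Quantifiers such as salmon or kallisto work from the full fragment length distribution rather than its mean.

```r
# Hypothetical helper: approximate effective length from the mean fragment length.
# A fragment of length m fits at (len - m + 1) start positions in a transcript.
effective_length <- function(len, mean_fragment_length) {
  stopifnot(is.numeric(len), length(mean_fragment_length) == 1)
  eff <- len - mean_fragment_length + 1
  # Assumption of this sketch: transcripts too short to hold an average
  # fragment get NA instead of a zero or negative effective length.
  eff[eff < 1] <- NA_real_
  eff
}

# Transcript lengths from the example in this thread, plus one short transcript.
lens <- c(900, 1020, 2000, 770, 3000, 1777, 150)
eff <- effective_length(lens, mean_fragment_length = 203.7)
eff
```

For long transcripts the correction is small, but for short ones it changes the length-normalized value noticeably.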
I'd suggest you have a look at this very informative post by @rob-p for an explanation of the effective length 😉
http://robpatro.com/blog/?p=235#efflen
Thanks! This is informative. G
@federicomarini, can this be closed?
Guess so - ended up spanning a couple of topics in the end, but seems done to me
I just ran immunedeconv v2.0.3 on R v4.0.3 (macOS 10.15.7) on RNA-seq data for ~20k genes and ~250 tumor samples.
MCP-counter gave no problems.
    mcp_counter_res <- deconvolute_mcp_counter(
      as.matrix(merged.gencode.RNAseq.unique_gene_names),
      feature_types = "HUGO_symbols"
    )
    dim(mcp_counter_res)
    10 256
quanTIseq gave no problems with the default btotalcells = FALSE.
    quanTiseq_res <- deconvolute_quantiseq(
      as.matrix(merged.gencode.RNAseq.unique_gene_names),
      tumor = TRUE,
      arrays = FALSE,
      scale_mrna = TRUE
    )
    Running quanTIseq deconvolution module
    Gene expression normalization and re-annotation (arrays: FALSE)
    Removing 17 noisy genes
    Removing 15 genes with high expression in tumors
    Signature genes found in data set: 137/138 (99.28%)
    Mixture deconvolution (method: lsei)
    Deconvolution sucessful!
But if I set btotalcells = TRUE, I get an error.
    quanTiseq_res <- deconvolute_quantiseq(
      as.matrix(merged.gencode.RNAseq.unique_gene_names),
      tumor = TRUE,
      arrays = FALSE,
      scale_mrna = TRUE,
      btotalcells = TRUE
    )
    Running quanTIseq deconvolution module
    Gene expression normalization and re-annotation (arrays: FALSE)
    Removing 17 noisy genes
    Removing 15 genes with high expression in tumors
    Signature genes found in data set: 137/138 (99.28%)
    Mixture deconvolution (method: lsei)
    Deconvolution sucessful!
    Error in system.file("extdata", "quantiseq", "totalcells.txt", package = "immunedeconv", : no file found
Am I doing something silly?
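For anyone hitting the same message: "no file found" is what base R's `system.file()` raises when it is called with `mustWork = TRUE` and the requested file is not shipped with the package. The sketch below reproduces that behaviour using the `base` package and a deliberately non-existent file name, so it runs without immunedeconv installed. The helper `has_packaged_file` is hypothetical and only shows one way to check for a bundled file before relying on it.

```r
# system.file() returns "" when the requested file is not in the package ...
missing_path <- system.file("extdata", "no_such_file.txt", package = "base")
nzchar(missing_path)

# ... and raises an error instead when mustWork = TRUE is set.
result <- tryCatch(
  system.file("extdata", "no_such_file.txt", package = "base", mustWork = TRUE),
  error = function(e) conditionMessage(e)
)
result

# Hypothetical helper: TRUE only if the package ships the given file.
has_packaged_file <- function(pkg, ...) {
  nzchar(system.file(..., package = pkg))
}

# Returns FALSE if the file (or the package itself) is not available.
has_packaged_file("immunedeconv", "extdata", "quantiseq", "totalcells.txt")
```

In other words, the error points at a file missing from the installed package rather than at anything wrong with the call.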