AGordonRobertson commented 3 years ago

I just ran immunedeconv v2.0.3 on R v4.0.3 on RNAseq data for ~20k genes and ~250 tumor samples (macOS 10.15.7).

MCP-counter gave no problems.

    mcp_counter_res <- deconvolute_mcp_counter(
      as.matrix(merged.gencode.RNAseq.unique_gene_names),
      feature_types = "HUGO_symbols"
    )
    dim(mcp_counter_res)
    # 10 256

quanTIseq gave no problems with the default, btotalcells = FALSE.

    quanTiseq_res <- deconvolute_quantiseq(
      as.matrix(merged.gencode.RNAseq.unique_gene_names),
      tumor = TRUE,
      arrays = FALSE,
      scale_mrna = TRUE
    )

    Running quanTIseq deconvolution module
    Gene expression normalization and re-annotation (arrays: FALSE)
    Removing 17 noisy genes
    Removing 15 genes with high expression in tumors
    Signature genes found in data set: 137/138 (99.28%)
    Mixture deconvolution (method: lsei)
    Deconvolution sucessful!

But if I set btotalcells = TRUE, I get an error.

    quanTiseq_res <- deconvolute_quantiseq(
      as.matrix(merged.gencode.RNAseq.unique_gene_names),
      tumor = TRUE,
      arrays = FALSE,
      scale_mrna = TRUE,
      btotalcells = TRUE
    )

    Running quanTIseq deconvolution module
    Gene expression normalization and re-annotation (arrays: FALSE)
    Removing 17 noisy genes
    Removing 15 genes with high expression in tumors
    Signature genes found in data set: 137/138 (99.28%)
    Mixture deconvolution (method: lsei)
    Deconvolution sucessful!
    Error in system.file("extdata", "quantiseq", "totalcells.txt", package = "immunedeconv", : no file found

Am I doing something silly?

grst commented 3 years ago

Hi @AGordonRobertson,

thanks for reporting this -- it seems nobody ever tried the totalcells option in immunedeconv before.

It certainly isn't implemented correctly in immunedeconv. Also this functionality requires an additional input file as shown in the original quantiseq documentation.

If you really need that feature I have to refer you to the original quantiseq pipeline for now.

@federicomarini is working on an improved R version of quanTIseq. @federicomarini: will it support the totalcells feature?

FFinotello commented 3 years ago

Hi @AGordonRobertson,

this is definitely a bug that @federicomarini and I are fixing in the new quanTIseq code. We will post an update soon.

Sorry for the inconvenience, Francesca

AGordonRobertson commented 3 years ago

Thanks. Practically, it's extremely helpful to have different methods available within 'immunedeconv'.

'btotalcells' is described in help docs for deconvolute_quantiseq_default - "btotalcells = compute cell densities instead of fractions Default: FALSE"

I'm uncertain whether I need 'btotalcells'. I may not need it. But I'm uncertain about the 'units' of the results that different methods give -- so thought I'd try 'btotalcells = TRUE', and would compare its outputs across a set of expression subtypes, where I have purity values across the subtypes from a different team. I've made this comparison for other deconvolution methods within immunedeconv.

AGordonRobertson commented 3 years ago

If it seems practical to you to fix 'btotalcells', please fix it. A less costly alternative would be to NOT fix it... I do NOT know that I need it.

federicomarini commented 3 years ago

The current implementation here in immunedeconv looks for a file that is not there. That file is also meant to hold the information from the original images from which you could compute the cell densities. @FFinotello and I can chip in here once we have that running for quantiseqr 😉
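For anyone wondering where the "no file found" message comes from: it is what base R's `system.file()` raises when `mustWork = TRUE` and the requested file is not shipped with the package. A minimal sketch that reproduces the behaviour, using the `stats` package purely as a stand-in (an assumption for illustration, it is not what immunedeconv does internally):

```r
# 'stats' is only a stand-in package that certainly ships no such file.
# By default, system.file() returns an empty string for a missing file.
path <- system.file("extdata", "quantiseq", "totalcells.txt", package = "stats")
path == ""

# With mustWork = TRUE the same lookup raises an error instead.
msg <- tryCatch(
  system.file("extdata", "quantiseq", "totalcells.txt",
              package = "stats", mustWork = TRUE),
  error = function(e) conditionMessage(e)
)
msg
```

So the error in the report is consistent with a lookup for a bundled file that simply is not part of the installed package.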

FFinotello commented 3 years ago

Hello @AGordonRobertson and apologies for my late reply.

We are working on an improved version of the quanTIseq R code that should be ready soon and will, of course, also fix this bug.

Regarding your doubts about the "totalcells" argument, it was designed to pass to quanTIseq information on total cell densities estimated from images of tumor tissue-slides (e.g. H&E). This information is needed only if you want to scale the cell fractions (i.e. referred to total cells in a sample) to cell densities (i.e. cell counts per area as in pathology images), as explained in Fig. 1a from the original quanTIseq paper.
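To make the fraction-versus-density distinction concrete, here is a small sketch. It assumes the scaling is a plain per-sample multiplication of the cell fractions by a total cell density taken from images. All numbers and cell-type names are made up for illustration, and this is not the quanTIseq code itself.

```r
# Hypothetical cell fractions (rows: cell types, columns: samples).
fractions <- matrix(
  c(0.10, 0.05, 0.85,
    0.20, 0.10, 0.70),
  nrow = 3,
  dimnames = list(c("T.cells.CD8", "B.cells", "Other"),
                  c("sample1", "sample2"))
)

# Hypothetical total cell densities (cells per mm^2) estimated from images.
total_cells <- c(sample1 = 4000, sample2 = 2500)

# Scale each sample's fractions by that sample's total cell density.
densities <- sweep(fractions, 2, total_cells[colnames(fractions)], `*`)
densities
```

Without the image-derived totals there is nothing to multiply by, which is why fractions are the natural output for transcriptomics-only analyses.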

If you want to quantify the cellular composition of a sample using different methods applied only to transcriptomics data, you do not need (and do not have) cell densities. So, I would suggest keeping the default settings, regardless of the current bug. In this way, you will get cell fractions that you can compare with the cell fractions or scores from the other methods.

Nevertheless, please remember that the outputs from the various methods can be by definition very different, and only EPIC and quanTIseq provide cell fractions referred to the overall cellular content of a sample (see also Table 1 from immunedeconv paper).

I hope this helps and please feel free to reach out again if you have any doubts.

Best, Francesca

AGordonRobertson commented 3 years ago

Thank you. It's very helpful to have your comments on the outputs of different methods. I'll use the current version and fractions.

grst commented 2 years ago

@federicomarini, is quantiseqR ready and should we switch to it in immunedeconv?

federicomarini commented 2 years ago

I think so, quantiseqr is pretty much ready for prime time.

- it's on Bioconductor
- has already a couple of speedup optimizations
- can handle matrix, eSets and also SummarizedExperiment objects

Especially this third point is something that can be worth having immunedeconv-wise, and so for the next projects. I think there's a nice advantage in handling the metadata in integrated containers that even take care of the matches, subsettings, and so on.

@FFinotello - (y)our call 😉

grst commented 2 years ago

@federicomarini, expressionSets are already supported: https://github.com/icbi-lab/immunedeconv/blob/c70539f2b08901687561dca755337fc6a5130440/R/immune_deconvolution_methods.R#L331-L333

I've never worked with SummarizedExperiments, but I guess it should be trivial to support them.

FFinotello commented 2 years ago

Hello!

I agree that we are pretty ready.

Maybe an idea would be to systematically analyze our set of validation datasets with quantiseqR and immunedeconv-quanTIseq and check if we find any inconsistencies. Would that be easily doable for you, Federico?

Cheers, Francesca

AGordonRobertson commented 2 years ago

Thank you for keeping me informed!

Gordon

AGordonRobertson commented 2 years ago

All,

Quantiseqr’s v1.2.0 vignette says “you should use TPMs”.

Q: Do I have to provide my expression data formatted as TPMs? Why is that so?

A: The expression data is indeed expected to be provided as TPM values. quantiseqr might warn you if you are providing a different format (counts, normalized counts) - this does not mean that it will trigger an error as the computation is still able to proceed.

Still: it is not the recommended way. If using a SummarizedExperiment object coming from Salmon’s quantifications, the tximeta/tximport pipeline will provide an assay named “abundance”, which would be handled internally by the se_to_matrix() function - you can simply call quantiseqr() and provide the SummarizedExperiment object as main parameter.
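A quick way to sanity-check an input matrix before running the deconvolution is sketched below. It is only a heuristic of my own (the function name `looks_like_tpm` and the 5% tolerance are assumptions, not part of quantiseqr or immunedeconv): TPM values are non-negative and each sample should sum to roughly one million when the full gene set is present.

```r
# Heuristic check: non-negative values and per-sample sums close to 1e6.
# Note: a matrix subset to a few genes will sum to less and fail this check.
looks_like_tpm <- function(expr, tol = 0.05) {
  stopifnot(is.matrix(expr), is.numeric(expr))
  sums <- colSums(expr, na.rm = TRUE)
  non_negative <- all(expr >= 0, na.rm = TRUE)
  non_negative && all(abs(sums - 1e6) / 1e6 <= tol)
}

# Toy gene x sample matrix, rescaled so that every column sums to 1e6.
raw <- matrix(1:20, nrow = 5,
              dimnames = list(paste0("gene", 1:5), paste0("s", 1:4)))
tpm <- t(t(raw) / colSums(raw)) * 1e6

looks_like_tpm(tpm)             # TRUE
looks_like_tpm(log2(tpm + 1))   # FALSE: log-transformed values
```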

Gordon

federicomarini commented 2 years ago

I cannot see the question in your message, @AGordonRobertson - it looks to me like you are just reporting the part from the Q&A in the vignette?

AGordonRobertson commented 2 years ago

Federico

You’re right, I simply reported the “TPM” item from the FAQs from the quantiseqr’s vignette.

Questions might be: if quantiseqr will be included in immunedeconv -

- Will this guidance be highlighted so that a user cannot easily miss it? If so, how will it be highlighted (e.g. would the vignette show how a user could transform FPKMs to TPMs)?
- Which of the other deconvolution methods in immunedeconv should have TPMs as input, rather than some other RNA-seq unit (e.g. FPKM)?

Gordon

federicomarini commented 2 years ago

Makes sense, but I'd argue it is more an aspect to curate in immunedeconv - I am "only" a contributor of the quanTIseq porting 😉
I agree this should be documented somewhere - in the end, we want the users to properly use the package! Wild guess: if you use otherwise normalized data, I don't think you'd be too far away from the expected results, but you'd be using something in a sub-optimal way. And in full honesty, if one can, one should avoid it 👍

grst commented 2 years ago

> I agree this should be documented somewhere - in the end, we want the users to properly use the package! Wild guess: if you use otherwise normalized data, I don't think you'd be too much far away from the expected results, but you'd be using something in a sub-optimal way. And in full honesty, if one can, one should avoid it +1

This is documented here (https://icbi-lab.github.io/immunedeconv/articles/immunedeconv.html#input-data) but it keeps coming up. Where do you think we should add it to make it more visible? We were thinking about an FAQ section at some point: https://github.com/icbi-lab/immunedeconv/issues/61

AGordonRobertson commented 2 years ago

I’m on a telecon now, will respond when free. Gordon

federicomarini commented 2 years ago

I could argue it is well visible already, but maybe we could

- point it out somewhere in the vignette as well (Q&A sounds good)
- in the function documentation of the wrapper, we can have a dedicated Details section for that

I mean, in the end it is up to the user to read the fantastic manual 😬

AGordonRobertson commented 2 years ago

The 'Getting started with immunedeconv’ documentation is good!

One small point: the Finotello et al 2018 reference is given as: Finotello, Francesca, and Zlatko Trajanoski. 2018. “Quantifying tumor-infiltrating immune cells from transcriptomics data.” Cancer Immunology, Immunotherapy 0. https://doi.org/10.1007/s00262-018-2150-z.

The DOI seems correct, and the publication web page says that the details should be: Cancer Immunology, Immunotherapy (https://link.springer.com/journal/262), volume 67, pages 1031–1040.

The 'Run the deconvolution’ section already asks for TPMs:

Input Data
The input data is a gene × sample gene expression matrix. In general values should be
- TPM-normalized
- not log-transformed.

It might be helpful to tell a reader how to transform RSEM values, FPKMs, or read counts into TPMs. What I’m hoping for is something very simple and clear. E.g. this gives substantial (too much!?) detail. https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/
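As an idea of how short such a note could be, here is a sketch for the FPKM case on a whole gene × sample matrix: within each sample, divide by the sample's FPKM sum and multiply by one million. The helper name `fpkm_to_tpm` and the toy numbers are mine, and the sketch assumes the matrix holds the full gene set on the linear (not log) scale.

```r
# Convert a gene x sample FPKM matrix to TPM, one sample (column) at a time.
fpkm_to_tpm <- function(fpkm) {
  stopifnot(is.matrix(fpkm), all(fpkm >= 0), all(colSums(fpkm) > 0))
  sweep(fpkm, 2, colSums(fpkm), `/`) * 1e6
}

# Toy FPKM values for three genes in two samples.
fpkm <- matrix(
  c(10, 30, 60,
     5,  5, 40),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
)

tpm <- fpkm_to_tpm(fpkm)
colSums(tpm)   # each sample now sums to 1e6
```

Read counts are a different story, since they also need gene or transcript lengths before they can be turned into TPMs.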

It says: "If you have FPKM, you can easily compute TPM: ..."

In some runs, EPIC (https://github.com/GfellerLab/EPIC) reports that some samples 'did not converge'. It would be helpful if ‘epic’ in immunedeconv passed such warnings through to the user.

Gordon
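A sketch of what "passing warnings through" could look like in a wrapper, assuming the underlying method raises ordinary R warnings. `run_method` is a hypothetical stand-in for a deconvolution call, not EPIC or immunedeconv code, and `collect_warnings` is an illustrative helper.

```r
# Hypothetical stand-in for a method that warns about some samples.
run_method <- function(samples) {
  for (s in samples) {
    if (s %in% c("S2", "S4")) warning("sample ", s, " did not converge")
  }
  setNames(seq_along(samples), samples)
}

# Evaluate an expression, record every warning message, and return both
# the result and the collected messages so they can be shown to the user.
collect_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = msgs)
}

res <- collect_warnings(run_method(c("S1", "S2", "S3", "S4")))
res$warnings
```

A wrapper could then re-emit `res$warnings` with `warning()` or attach them to the result, whichever suits the package better.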

AGordonRobertson commented 2 years ago

At the bottom of the article at: https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/

"I’ve included some R code below for computing effective counts, TPM, and FPKM…. "

    countToTpm <- function(counts, effLen) {
      rate <- log(counts) - log(effLen)
      denom <- log(sum(exp(rate)))
      exp(rate - denom + log(1e6))
    }

    countToFpkm <- function(counts, effLen) {
      N <- sum(counts)
      exp( log(counts) + log(1e9) - log(effLen) - log(N) )
    }

    fpkmToTpm <- function(fpkm) {
      exp(log(fpkm) - log(sum(fpkm)) + log(1e6))
    }

    countToEffCounts <- function(counts, len, effLen) {
      counts * (len / effLen)
    }

    ################################################################################
    # An example
    ################################################################################
    cnts <- c(4250, 3300, 200, 1750, 50, 0)
    lens <- c(900, 1020, 2000, 770, 3000, 1777)
    countDf <- data.frame(count = cnts, length = lens)
    countDf
    #   count length
    # 1  4250    900
    # 2  3300   1020
    # 3   200   2000
    # 4  1750    770
    # 5    50   3000
    # 6     0   1777

    # assume a mean(FLD) = 203.7
    countDf$effLength <- countDf$length - 203.7 + 1
    countDf$tpm <- with(countDf, countToTpm(count, effLength))
    countDf$fpkm <- with(countDf, countToFpkm(count, effLength))
    with(countDf, all.equal(tpm, fpkmToTpm(fpkm)))
    # TRUE
    countDf$effCounts <- with(countDf, countToEffCounts(count, length, effLength))
    countDf
    #   count length effLength       tpm       fpkm  effCounts
    # 1  4250    900     697.3 456667.22 638213.363 5485.44385
    # 2  3300   1020     817.3 302526.21 422794.247 4118.43876
    # 3   200   2000    1797.3   8337.58  11652.150  222.55606
    # 4  1750    770     567.3 231129.74 323014.407 2375.28644
    # 5    50   3000    2797.3   1339.25   1871.663   53.62314
    # 6     0   1777    1574.3      0.00      0.000    0.00000

    # FPKM and TPM are linear
    plot(countDf$fpkm, countDf$tpm)

I do not understand what this is -

    # assume a mean(FLD) = 203.7
    countDf$effLength <- countDf$length - 203.7 + 1

Gordon

federicomarini commented 2 years ago

I am guessing, but the FLD is the fragment length distribution? And that would be something you'd subtract from the length to "make it the effective length". Apart from that: it can be that in some samples the length of the transcript might not really be the same. If using TPM quantifications from salmon or kallisto (as recommended, basically through the original concept of quanTIseq as a whole pipeline), then this aspect is already taken care of.

AGordonRobertson commented 2 years ago

Federico,

FLD is very likely 'fragment length distribution'.

From what I recall, a mean fragment length of ~200 bp was considered appropriate for Illumina sequencers.

I don’t know why we’d subtract this from length, then add one, to get an 'effective length'. Possibly this corrects for read coverage decreasing near the 5’ and 3’ ends of a gene.

countDf$effLength <- countDf$length - 203.7 + 1

Gordon

federicomarini commented 2 years ago

I'd suggest you can have a look at this very informative post by @rob-p for an explanation on the effective length 😉

http://robpatro.com/blog/?p=235#efflen
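In short, the usual reading of `length - mean(FLD) + 1` is the number of positions at which a fragment of the mean length can start and still fit inside the transcript. A small sketch with the numbers from the example above (the 10/4 toy case and the variable names are mine, for illustration only):

```r
# Toy case: a transcript of length 10 and fragments of length 4.
transcript_len <- 10
frag_len <- 4
starts <- seq_len(transcript_len)
# A start position is valid if the fragment ends inside the transcript.
n_valid <- sum(starts + frag_len - 1 <= transcript_len)
n_valid                              # 7
transcript_len - frag_len + 1        # 7, same count from the formula

# The same formula with the lengths and mean fragment length used above.
lens <- c(900, 1020, 2000, 770, 3000, 1777)
mean_frag_len <- 203.7
eff_len <- lens - mean_frag_len + 1
eff_len
```

The `eff_len` values match the `effLength` column in the table from the blog post.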

AGordonRobertson commented 2 years ago

Thanks! This is informative. G

grst commented 2 years ago

@federicomarini, can this be closed?

federicomarini commented 2 years ago

Guess so - it ended up spanning a couple of topics in the end, but it seems done to me

omnideconv / immunedeconv

get a 'no file found' error for deconvolute_quantiseq() with btotalcells = TRUE #62
