Closed bmumbler closed 6 years ago
Hello,
Seems to be a missing value, like a NA
or Inf
where a number is expected. I could speculate, but it would be more helpful if you could show me the actual results rows for your example gene:
avc_dtu$Genes['Tnc', ]
and
avc_dtu$Transcripts['Tnc', ]
Also, are you using the latest release (0.6.2)?
Thanks! Kimon
Yes, using 0.6.2.
> avc_dtu$Genes['Tnc', ]
parent_id elig sig elig_fx quant_reprod rep_reprod DTU transc_DTU known_transc detect_transc elig_transc maxDprop pval
1: Tnc TRUE TRUE TRUE TRUE TRUE TRUE TRUE 12 10 5 0.3039531 1.54854e-26
pval_corr quant_p_mean quant_p_stdev quant_p_min quant_p_max quant_na_freq quant_dtu_freq rep_p_mean rep_p_stdev
1: 7.161138e-26 4.114993e-21 2.253217e-20 1.012475e-46 1.23415e-19 0 1 2.901242e-29 5.802483e-29
rep_p_min rep_p_max rep_na_freq rep_dtu_freq
1: 7.770457e-51 1.160497e-28 0 1
> avc_dtu$Transcripts['Tnc', ]
target_id parent_id elig_xp elig sig elig_fx quant_reprod rep_reprod DTU gene_DTU meanA meanB
1: gene.4510.0.1-x0-0 Tnc TRUE TRUE FALSE FALSE TRUE TRUE FALSE TRUE 4.1283635 7.66126686
2: gene.4515.0.0-x7-0 Tnc FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE 0.0000000 0.00000000
3: gene.4515.0.1-x7-0 Tnc FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE 4.3204617 1.32414564
4: gene.4515.0.3-x7-0 Tnc TRUE TRUE TRUE FALSE FALSE TRUE FALSE TRUE 16.7635820 24.42903596
5: gene.4824.0.0-x4-0 Tnc TRUE TRUE TRUE TRUE FALSE FALSE FALSE TRUE 26.9109557 3.30507282
6: gene.4824.0.1-x4-0 Tnc FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE NA NA
7: gene.4824.0.2-x4-0 Tnc TRUE TRUE TRUE TRUE FALSE FALSE FALSE TRUE 41.0505430 18.97642905
8: gene.5779.0.0-x3-0 Tnc FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE 0.0000000 0.03805255
9: gene.5779.0.3-x3-0 Tnc TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE 0.1413028 28.44707323
10: gene.6277.0.1-x1-0 Tnc FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE 3.4234235 1.78699362
11: gene.6277.0.5-x1-0 Tnc FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE 0.4867365 3.69281580
12: gene.6277.0.6-x1-0 Tnc FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE 1.9905779 3.49296789
stdevA stdevB sumA sumB log2FC totalA totalB propA propB Dprop pval
1: 5.8383876 2.87065801 8.2567269 15.32253372 0.8920130 198.4319 186.3077 0.041609878 0.0822431556 0.0406332774 9.495871e-02
2: 0.0000000 0.00000000 0.0000000 0.00000000 NaN 198.4319 186.3077 0.000000000 0.0000000000 0.0000000000 NA
3: 5.3925600 1.87262472 8.6409234 2.64829127 -1.7061237 198.4319 186.3077 0.043546041 0.0142146094 -0.0293314317 NA
4: 6.9114891 0.72598810 33.5271641 48.85807193 0.5432665 198.4319 186.3077 0.168960561 0.2622439659 0.0932834044 2.558355e-02
5: 1.7691451 4.34137610 53.8219114 6.61014565 -3.0254397 198.4319 186.3077 0.271236194 0.0354797220 -0.2357564716 1.547250e-11
6: NA NA 0.0000000 0.00000000 NaN 198.4319 186.3077 0.000000000 0.0000000000 0.0000000000 NA
7: 58.0542342 9.66682571 82.1010859 37.95285810 -1.1131928 198.4319 186.3077 0.413749447 0.2037106180 -0.2100388286 7.257556e-06
8: 0.0000000 0.05381444 0.0000000 0.07610511 Inf 198.4319 186.3077 0.000000000 0.0004084915 0.0004084915 NA
9: 0.1399868 4.09014376 0.2826056 56.89414646 7.6533463 198.4319 186.3077 0.001424194 0.3053773106 0.3039531161 2.571871e-21
10: 4.5721341 2.52719059 6.8468470 3.57398723 -0.9379053 198.4319 186.3077 0.034504771 0.0191832495 -0.0153215214 NA
11: 0.6883438 2.79840571 0.9734730 7.38563161 2.9235084 198.4319 186.3077 0.004905829 0.0396421154 0.0347362860 NA
12: 2.8151023 4.24082789 3.9811558 6.98593578 0.8112660 198.4319 186.3077 0.020063084 0.0374967622 0.0174336779 NA
pval_corr quant_p_mean quant_p_stdev quant_p_min quant_p_max quant_na_freq quant_dtu_freq rep_p_mean rep_p_stdev
1: 1.390961e-01 1.960229e-01 2.953422e-01 4.890572e-09 8.908426e-01 0.03333333 0.0000000 2.341085e-01 4.093500e-01
2: NA NA NA NA NA NA NA NA NA
3: NA NA NA NA NA NA NA NA NA
4: 4.374234e-02 2.250759e-01 3.038488e-01 8.652778e-09 8.536735e-01 0.00000000 0.1000000 1.236044e-01 1.986702e-01
5: 2.041982e-10 6.565879e-07 2.067562e-06 3.820437e-22 9.104753e-06 0.00000000 0.8333333 2.373610e-05 4.747219e-05
6: NA NA NA NA NA NA NA NA NA
7: 2.844246e-05 3.501006e-02 1.095140e-01 1.813633e-16 4.546177e-01 0.00000000 0.5000000 6.626238e-08 1.325245e-07
8: NA NA NA NA NA NA NA NA NA
9: 1.423744e-19 1.294536e-13 6.952823e-13 2.221477e-30 3.810496e-12 0.00000000 1.0000000 9.858301e-14 1.971639e-13
10: NA NA NA NA NA NA NA NA NA
11: NA NA NA NA NA NA NA NA NA
12: NA NA NA NA NA NA NA NA NA
rep_p_min rep_p_max rep_na_freq rep_dtu_freq
1: 4.989867e-06 8.448317e-01 0 0.00
2: NA NA NA NA
3: NA NA NA NA
4: 3.073885e-07 4.164730e-01 0 0.00
5: 3.847876e-24 9.494438e-05 0 0.50
6: NA NA NA NA
7: 6.986727e-23 2.650492e-07 0 0.75
8: NA NA NA NA
9: 7.767124e-28 3.943288e-13 0 1.00
10: NA NA NA NA
11: NA NA NA NA
12: NA NA NA NA
I was able to recreate this problem with simulated data. The culprit is the isoforms with 0 expression in both conditions, although I'm not sure why yet.
Thank you for finding and reporting this bug! I'll get right on it.
I'm looking at rows 2 and 6 of your example. Both isoforms are not expressed, but there are differences in how their information is recored:
Row 2 is what I'd expect, with 0
s filled in for the mean. There is no reason why this should not plot.
Row 6, however, has NA
for the means. This would be a problem indeed for plotting. The most likely cause of this, is that the isoform ID in row 6 did not exist in the annotation used to quantify the data. Or if it did exist, kallisto did not report it in the end (but then why did it report the one in row 2?).
Can you verify that you used the identical annotation for both kallisto and rats?
I could (and probably will) patch RATs to explicitly assign 0
over the NA
s, and to generate warnings when this happens, but I'll also need to catch the case of IDs that are present in the quantifications and missing from the annotation given to RATs (currently silently ignored).
Even then, there may still be more, hidden, discrepancies, so mixing up annotations is really something one should never do.
Hello,
Did you verify what annotations were used in your analysis? Were they different?
If you want to try my proposed fix, you can get the development version 0.6.2-2:
devtools::install_github("bartongroup/rats", ref="development")
If it works well for you, I can push it out as a release.
The fix does two things:
reckless=TRUE
in call_DTU()
, in which case NA
counts are explicitly replaced with 0
. For the record, this override is provided for special use cases only. I do not recommend ignoring annotation issues. It is best to go back and ensure your workflow protocol consistently uses the same annotation in all the steps.
Hi, I just redid everything (using the same 0.6.2) to make sure I was using the same annotation. It seems to be working now; will comment again if I run into it again. Thank you!
Okay, then I'll consider this resolved. Feel free to re-open the issue in the future if there's anything you'd like to add to this topic!
Hello, I'm looking at DTU using kallisto output and am having trouble getting plots.
When I try to plot genes that appear in the primary switch list,
What is causing this?
Thank you!