Open tomthun opened 1 year ago
Hi :), can you give me the names of precursors with this behaviour?
Here are some precursors extracted from the curated CustomDf.aq_reformat.tsv.ion_intensities.tsv for case number 1:
And here is a snippet for case 2 (note that these precursors are NOT found in the final output but do exist in the CustomDf.aq_reformat.tsv):
For the latter case I find 599 missing precursors overall.
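For anyone wanting to reproduce the count for case 2, a minimal pandas sketch of the comparison could look like the following. The column names `protein` and `ion`, the ion labels, and the toy values are assumptions for illustration only; the helper `find_missing_precursors` is hypothetical and not part of directLFQ.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the input (aq_reformat) and output (ion_intensities) tables.
# Column names and ion labels are assumed for illustration.
input_df = pd.DataFrame({
    "protein": ["O00268", "O00268", "Q15149"],
    "ion": ["ion_a", "ion_b", "ion_c"],
    "repli1": [1534.2, np.nan, 800.0],
    "repli2": [1999.24, np.nan, 950.0],
    "repli3": [np.nan, 1200.0, 700.0],
})
output_df = pd.DataFrame({
    "protein": ["O00268", "Q15149"],
    "ion": ["ion_b", "ion_c"],
    "repli1": [0.0, 810.0],
    "repli2": [0.0, 940.0],
    "repli3": [1200.0, 705.0],
})


def find_missing_precursors(input_df, output_df, key="ion"):
    """Return the input rows whose key never appears in the output table."""
    missing_mask = ~input_df[key].isin(output_df[key])
    return input_df.loc[missing_mask]


missing = find_missing_precursors(input_df, output_df)
print(len(missing), "precursor(s) missing from the output")
print(missing)
```

With the real files, both tables would be loaded with `pd.read_csv(path, sep="\t")` before calling the helper.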
Hi, thanks for looking into this so deeply!
1) In general, I don't use precursors with only a single intensity value downstream. The reason is that directLFQ is a ratio-based method, and you cannot calculate a ratio from only one value. So even if such precursors are in the .ion_intensities.tsv, they don't end up in the protein-level result. But indeed, I saw that some precursors with only one value are reported in the .ion_intensities.tsv while others are set to 0, as you show. I should fix this, but I don't think it currently affects protein quantification.
2) Indeed, the example for Q15149 is quite strange; these precursors should be in there. I will have to look into this and fix it in a future release. The non-Q15149 examples are fine again, as they have only single intensities. I noticed that Q15149 has a very large number of precursors, so a few missing ones will have virtually no effect on the quantification. Can you check whether there are precursors that a) are missing, b) have more than one intensity value, and c) belong to a protein with fewer than 7 precursors?
Best Constantin
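The three criteria above (missing from the output, more than one intensity value, protein with fewer than 7 precursors) can be checked with a short pandas filter. This is only a sketch under assumptions: the column names `protein` and `ion`, the toy ion labels and most toy values are made up for illustration, a value counts as valid when it is strictly positive (so both NaN and 0 count as missing), and `flag_unexpected_missing` is a hypothetical helper, not a directLFQ function.

```python
import numpy as np
import pandas as pd


def flag_unexpected_missing(input_df, output_df, sample_cols, max_precursors=7):
    """Return input precursors that are absent from the output, have at least
    two valid intensities, and belong to a protein with fewer than
    `max_precursors` precursors."""
    df = input_df.copy()
    # NaN > 0 evaluates to False, so this counts strictly positive values only.
    df["n_valid"] = (df[sample_cols] > 0).sum(axis=1)
    df["n_precursors_in_protein"] = df.groupby("protein")["ion"].transform("count")
    is_missing = ~df["ion"].isin(output_df["ion"])
    has_ratio = df["n_valid"] >= 2
    small_protein = df["n_precursors_in_protein"] < max_precursors
    return df[is_missing & has_ratio & small_protein]


# Toy data: a small protein with 3 precursors and a large one with 8.
small = pd.DataFrame({
    "protein": ["O00268"] * 3,
    "ion": ["ion_a", "ion_b", "ion_c"],
    "repli1": [1534.2, np.nan, np.nan],
    "repli2": [1999.24, np.nan, np.nan],
    "repli3": [np.nan, 500.0, 600.0],
})
large = pd.DataFrame({
    "protein": ["Q15149"] * 8,
    "ion": [f"q_{i}" for i in range(8)],
    "repli1": [100.0] * 8,
    "repli2": [110.0] * 8,
    "repli3": [120.0] * 8,
})
input_df = pd.concat([small, large], ignore_index=True)
# Pretend one precursor of each protein was dropped from the output.
output_df = input_df[~input_df["ion"].isin(["ion_a", "q_0"])]

flagged = flag_unexpected_missing(input_df, output_df, ["repli1", "repli2", "repli3"])
print(flagged[["protein", "ion", "n_valid", "n_precursors_in_protein"]])
```

In this toy setup only the precursor from the small protein is flagged; the one dropped from the 8-precursor protein fails criterion c).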
There are 4 cases with this behaviour: P78527, Q09666, Q14204 and Q15149. However, none of them belongs to a protein with fewer than 7 precursors.
However, there is still the example of O00268, which has 3 precursors overall. Two of those have only one valid value (in replicate 3); the third has two valid values and is the one that is missing:
repli1    repli2     repli3
1534.2    1999.24    nan
I can also send you the list of missing ions if you have not created it yourself by now. Thanks for looking into this! Best,
Tom
Any updates?
Hi Tom,
I will fix it in the next release, where I will also address a few other things. Unfortunately, I cannot tell you exactly when this will be out, as I'm currently busy with a few other projects. I will let you know when it is out.
The problem we are talking about is a problem of filtering, so in a few edge cases a bit too much gets filtered out. This definitely needs to be fixed but is very unlikely to hamper any biological analysis. So I think you can go forward with any biological analysis you are doing.
Best Constantin
Describe the bug
Some row entries are 0 for all replicates in the CustomDf.aq_reformat.ion_intensities output even though there are valid values in the original CustomDf.aq_reformat input dataframe. Furthermore, some row entries are missing when comparing both .tsv files, which I did not expect to happen.
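The first symptom (rows that are all 0 in the output although the input has valid values) can be located with a small pandas check. This is a sketch only: the `ion` key column, the ion labels and most toy values are assumptions for illustration, and `find_zeroed_rows` is a hypothetical helper rather than anything provided by directLFQ.

```python
import numpy as np
import pandas as pd


def find_zeroed_rows(input_df, output_df, sample_cols, key="ion"):
    """Return precursors whose output intensities are all 0 (or NaN)
    although the input holds at least one strictly positive value."""
    merged = input_df.merge(output_df, on=key, suffixes=("_in", "_out"))
    in_cols = [f"{c}_in" for c in sample_cols]
    out_cols = [f"{c}_out" for c in sample_cols]
    all_zero_out = (merged[out_cols].fillna(0) == 0).all(axis=1)
    has_valid_in = (merged[in_cols] > 0).any(axis=1)
    return merged.loc[all_zero_out & has_valid_in, [key] + in_cols]


# Toy tables; real data would come from pd.read_csv(path, sep="\t").
input_df = pd.DataFrame({
    "ion": ["ion_a", "ion_b"],
    "repli1": [1534.2, 100.0],
    "repli2": [1999.24, 200.0],
    "repli3": [np.nan, 300.0],
})
output_df = pd.DataFrame({
    "ion": ["ion_a", "ion_b"],
    "repli1": [0.0, 101.0],
    "repli2": [0.0, 199.0],
    "repli3": [0.0, 305.0],
})

zeroed = find_zeroed_rows(input_df, output_df, ["repli1", "repli2", "repli3"])
print(zeroed)
```

The inner merge only looks at precursors present in both tables, so rows that are missing from the output entirely are not reported by this check.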
To Reproduce
See #16 for input and output files.