MonashProteomics / FragPipe-Analyst

GNU General Public License v3.0

Related issues of missing values and row-order when merging (LFQ and FragPipe Analyst, DEP, DEP2)? #94

Closed: 1Moe closed this issue 3 months ago

1Moe commented 3 months ago

Hi there,

I have been working with DEP and DEP2 and noticed that the FragPipe-Analyst and LFQ-Analyst tools use similar functions. Your test_limma() function looks similar to DEP2::test_diff(), so I was wondering whether the issues linked below have all been addressed here.

https://github.com/mildpiggy/DEP2/issues/3
https://github.com/arnesmits/DEP/issues/29
https://github.com/arnesmits/DEP/issues/2

best

Hailey-Z commented 3 months ago

Thanks for bringing up this potential issue @1Moe. LFQ Analyst performs a customized filter_missval() before imputation and test_limma(). As mentioned in arnesmits/DEP#29, filter_missval() automatically sorts the rownames in alphabetical order, and add_rejections() later sorts on the name column without reordering the rownames. When both steps are performed, the rownames and the name column end up in the same order, which avoids the mismatch. If you don't want to filter out proteins with a high proportion of missing values, you can instead set sort = FALSE (the default is TRUE) in the merge() call used in add_rejections() to avoid the alphabetical sort.
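To make the mismatch concrete, here is a minimal base-R sketch. It is not code from DEP, LFQ Analyst, or FragPipe-Analyst; the data frames `original` and `results` and their columns are hypothetical toy data. It shows that merge() with its default sort = TRUE reorders rows on the `by` column, how reusing the old rownames then mislabels rows, and one way to realign with match(). Note that the base R documentation describes the row order under sort = FALSE as unspecified, so explicit realignment may be the safer choice.

```r
# Toy data (hypothetical): protein names deliberately NOT in alphabetical order.
original <- data.frame(name = c("P3", "P1", "P2"),
                       intensity = c(30, 10, 20),
                       stringsAsFactors = FALSE)
rownames(original) <- original$name

# Hypothetical per-protein test results to attach.
results <- data.frame(name = c("P3", "P1", "P2"),
                      significant = c(TRUE, FALSE, TRUE),
                      stringsAsFactors = FALSE)

# Default merge() uses sort = TRUE, so rows come back ordered by `name`.
merged_sorted <- merge(original, results, by = "name")
order_changed <- !identical(merged_sorted$name, original$name)

# The bug pattern: reusing the old rownames on the re-sorted table.
# Row 1 is now P1, but it gets labelled "P3".
mislabelled <- merged_sorted
rownames(mislabelled) <- rownames(original)
wrong_value <- mislabelled["P3", "intensity"]  # picks up P1's intensity

# One fix: realign the merged table to the original order with match(),
# then set rownames from the name column itself.
realigned <- merged_sorted[match(original$name, merged_sorted$name), ]
rownames(realigned) <- realigned$name
right_value <- realigned["P3", "intensity"]
```

Deriving the rownames from the `name` column after any reordering, rather than carrying them over from an earlier object, keeps the two in step regardless of which sort option was used.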

FragPipe-Analyst doesn't include any auto-sort steps in the relevant functions, so I assume it also avoids the mismatch issue. I ran some tests on that. @hsiaoyi0504 Leo, would you mind verifying that again, please? Thanks.

hsiaoyi0504 commented 3 months ago

Thanks @Hailey-Z @1Moe. We actually haven't used filter_missval() or any function derived from it for a long time. Maybe I should clean up the code a bit. https://github.com/MonashProteomics/FragPipe-Analyst/blob/5c0ebe76ce06779dd5d0f8221eeaa406ee8af38a/server.R#L604-L634