zamboni-lab / SLAW

Scalable and self-optimizing processing workflow for untargeted LC-MS
GNU General Public License v2.0

Annotation step disregards retention time? #28

Closed: meowcat closed this issue 1 year ago

meowcat commented 1 year ago

It seems to me that the annotation step groups together peaks with vastly different retention times (e.g. 20, 25, 15 minutes).

I observe this both with zambonilab/slaw:latest and zambonilab/slaw:dev.

htmonkey commented 1 year ago

I don't remember seeing the problem you mention, but I can imagine it could occur depending on how Alexis implemented it. I'll have to check the code in detail. In principle, there is a simple workaround that won't require adding a new parameter: for the tolerance, we can likely use rt/mz boxes that we calculate for each peak individually.
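To make the box idea concrete, here is a minimal sketch of how per-peak boxes could serve as the RT tolerance. It is only an illustration, not SLAW's code: `PeakBox` and `rt_overlap` are hypothetical names, and it assumes each detected peak comes with its own rt and m/z bounds. Only the rt bounds are compared, since annotated peaks (isotopes, adducts) are expected to differ in m/z.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeakBox:
    """Hypothetical per-peak bounding box (rt in minutes, m/z in Th)."""
    rt_min: float
    rt_max: float
    mz_min: float
    mz_max: float


def rt_overlap(a: PeakBox, b: PeakBox) -> bool:
    """True if the rt intervals of the two boxes intersect."""
    return a.rt_min <= b.rt_max and b.rt_min <= a.rt_max


# Toy peaks: two co-eluting around 15 min, one eluting around 20 min.
peak_a = PeakBox(14.9, 15.1, 200.000, 200.010)
peak_b = PeakBox(15.0, 15.2, 201.003, 201.013)
peak_c = PeakBox(19.9, 20.1, 222.000, 222.010)

print(rt_overlap(peak_a, peak_b))
print(rt_overlap(peak_a, peak_c))
```

With a check like this, the tolerance scales with the width of each peak, so no global RT parameter is needed.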

meowcat commented 1 year ago

Since it's not my data, I cannot yet provide a reprex, but I will see if I can reproduce it with a toy dataset.

htmonkey commented 1 year ago

In the meantime, I implemented everything in our internal version, but I need data for testing. As you pointed out, SLAW builds a correlation matrix across all samples. This is done by tricking msClique into thinking it's looking at EICs while it's actually getting the full data matrix. Hence, the true RT information was neglected. I have now added a parameter that defines a max RT gap in the graph generation.
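For anyone following along, a rough sketch of what a max RT gap in the graph generation could look like. This is not the actual SLAW or msClique code: the function name `correlation_edges` and the parameters `min_corr` and `max_rt_gap` are made up for illustration, and it assumes a plain features-by-samples intensity matrix with one RT (in minutes) per feature.

```python
import numpy as np


def correlation_edges(intensities, rts, min_corr=0.8, max_rt_gap=0.05):
    """Return (i, j, r) edges between features that correlate across
    samples AND elute within max_rt_gap minutes of each other."""
    intensities = np.asarray(intensities, dtype=float)
    rts = np.asarray(rts, dtype=float)
    # Pearson correlation between rows (features) across columns (samples).
    corr = np.corrcoef(intensities)
    # Pairwise absolute RT difference between features.
    rt_gap = np.abs(rts[:, None] - rts[None, :])
    keep = (corr >= min_corr) & (rt_gap <= max_rt_gap)
    # Upper triangle only, so each pair is reported once and self-loops are skipped.
    i_idx, j_idx = np.nonzero(np.triu(keep, k=1))
    return [(int(i), int(j), float(corr[i, j])) for i, j in zip(i_idx, j_idx)]


# Toy data: four features that all track the sample amount,
# but only the first two co-elute.
rts = [15.00, 15.02, 20.00, 25.00]
intensities = [
    [1.0, 2.0, 3.0, 4.0],
    [2.1, 3.9, 6.2, 8.0],
    [1.1, 2.0, 2.9, 4.2],
    [0.9, 2.1, 3.0, 3.9],
]

with_gap = correlation_edges(intensities, rts, max_rt_gap=0.05)
without_gap = correlation_edges(intensities, rts, max_rt_gap=float("inf"))
print(len(with_gap), len(without_gap))
```

In the toy data every feature correlates with every other one, so without the RT constraint all pairs are linked; with it, only the co-eluting pair remains.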

I will need data for testing. I can't remember spotting such a problem in our case, but our experiments are less prone to accidental correlations across the time range. However, I can imagine situations where this might be more likely, e.g. when the sample amount varies dramatically across the study.

Once it's tested, I'll push it to the public dev.