rickhelmus / patRoon

Workflow solutions for mass-spectrometry based non-target analysis.
https://rickhelmus.github.io/patRoon/
GNU General Public License v3.0

Advice for validating non-target analysis with known compounds #113

Open akogler opened 1 month ago

akogler commented 1 month ago

Hi, this is more of a general question than an issue. I am using patRoon for non-target analysis but I spiked some known compounds into all of my samples to validate the non-target analysis. During formula annotation, the formulas corresponding to the known compounds show up as candidate formulas. However, during compound annotation, only one of the known compounds shows up. I have tried to adjust some parameters for compound annotation and MS peak list generation, but none of the changes have improved the output. Do you have additional suggestions for troubleshooting? Thank you!

Here is an example of my compound annotation command: compounds <- generateCompounds(fGroupsSelSusp, mslists, "metfrag", method = "CL", dbRelMzDev = 5, fragRelMzDev = 5, fragAbsMzDev = 0.002, database = "pubchem", topMost = 2500, timeoutRetries = 5, errorRetries = 5, maxCandidatesToStop = 2500)

I have varied dbRelMzDev, fragRelMzDev, and fragAbsMzDev, and I have also tried different combinations of the scoring types allowed for MetFrag, initially with equal weights and then with different weights.

For MS peak list generation, I use the following command: mslists <- generateMSPeakLists(fGroupsSelSusp, algorithm = "mzr", maxMSRtWindow = 5, precursorMzWindow = NULL, topMost = NULL, avgFeatParams = avgMSListParams, avgFGroupParams = avgMSListParams)

I have adjusted clusterMzWindow, topMost, pruneMissingPrecursorMS, minIntensityPre, and minIntensityPost when setting avgMSListParams.

Thanks for any pointers!

rickhelmus commented 1 month ago

Hello @akogler ,

Sorry for the slow response, just returned from holidays :-)

There can be a few reasons for this.

Firstly, as the PubChem database is huge and likely returns many uninteresting isomeric candidates, I would suggest switching to PubChemLite, i.e. database = "pubchemlite".
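As a rough sketch of that change, the call from the original post could be repeated with only the database argument swapped. Two assumptions are worth flagging: MetFrag reads PubChemLite from a local CSV file, so the file path below is purely hypothetical, and depending on how patRoon was installed this option may already be configured, in which case the options() line can be dropped.

```r
library(patRoon)

# Assumption: a PubChemLite CSV was downloaded manually.
# The path is hypothetical; point it at the actual file.
options(patRoon.path.MetFragPubChemLite = "~/PubChemLite_exposomics.csv")

# Same settings as in the original post; only 'database' differs.
compounds <- generateCompounds(
    fGroupsSelSusp, mslists, "metfrag",
    method = "CL",
    dbRelMzDev = 5, fragRelMzDev = 5, fragAbsMzDev = 0.002,
    database = "pubchemlite",
    topMost = 2500,
    timeoutRetries = 5, errorRetries = 5,
    maxCandidatesToStop = 2500
)
```

Because PubChemLite is much smaller than PubChem, each feature group should end up with far fewer candidates, which makes it easier to see whether the spiked compounds are retrieved at all.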

Secondly, it is handy to run a suspect screening workflow with the compounds you spiked. Here the annotateSuspects() function gives more information on how the suspects (i.e. your spiked compounds) compare with the annotation data. For instance, if most of your spiked compounds get ID levels numerically above three (i.e. level 4 or 5, a poorer identification confidence), then something may be going wrong during compound annotation. The annotateSuspects() function also creates log text files that you can inspect for more details.
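A minimal sketch of such a suspect screening check could look like the following. Several names are assumptions rather than anything taken from this thread: fGroups stands for the feature groups, formulas for the formula annotation object, the adduct is assumed to be [M+H]+, and the two suspects are placeholders for the compounds that were actually spiked. It also assumes that mslists, formulas and compounds cover the feature groups matched by the screening.

```r
library(patRoon)

# Placeholder suspect list: replace with the compounds that were actually spiked.
suspects <- data.frame(
    name = c("caffeine", "carbamazepine"),
    SMILES = c("CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
               "C1=CC=C2C(=C1)C=CC3=CC=CC=C3N2C(=O)N"),
    stringsAsFactors = FALSE
)

# 'fGroups' is a hypothetical name for the feature groups object.
# The adduct is an assumption; adjust it to the ionization mode used.
fGroupsSusp <- screenSuspects(fGroups, suspects, adduct = "[M+H]+",
                              onlyHits = TRUE)

# Compare the suspects with the MS peak lists and the formula/compound
# annotations. 'formulas' is a hypothetical name for the formula results.
fGroupsSusp <- annotateSuspects(fGroupsSusp, MSPeakLists = mslists,
                                formulas = formulas, compounds = compounds)

# Inspect the per-suspect results, such as the estimated ID levels.
screenInfo(fGroupsSusp)
```

In the resulting table, columns such as the estimated identification level and the rank of each suspect among the compound candidates should indicate whether a spiked compound was missing from the candidate list or was present but ranked poorly.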

Thanks, Rick