RECETOX / recetox-xMSannotator

This is a custom adaptation of the original version of xMSannotator. It is a complete rewrite of the original functionality, following the same program structure.
GNU General Public License v3.0

Error during theoretical isotopic pattern computation of ions #91

Open hechth opened 2 years ago

hechth commented 2 years ago

See the Galaxy output below.

Loading required package: Rcpp
Joining, by = "peak"
Error in isotopeList[i, 3] <- exactMass :
  number of items to replace is not a multiple of replacement length
Calls: source ... compute_isotopic_pattern -> get.formula -> .cdkFormula.createObject
In addition: Warning messages:
1: executing %dopar% sequentially: no parallel backend registered
2: In compute_peak_modules(peak_intensity_matrix = peak_intensity_matrix, :
  Unable to estimate soft threshold. Using fallback value.
3: In labels2colors(allLabels) :
  labels2colors: Number of labels exceeds number of avilable colors. Some colors will be repeated 2 times.
Execution halted

@maximskorik this seems to be happening when using the HMDB database with the added QC compounds.

maximskorik commented 2 years ago

The computation of the theoretical isotopic pattern fails for ions, i.e. compounds whose formula carries a charge sign (of the form M+ or M-). rcdk's get.formula can't initialize a molecule object from such a string. Fixing this should not be a problem: rcdk computes the pattern from the isotopic abundances of the individual atoms, so simply stripping the {+,-} symbols from the formulae will likely do the job.
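A minimal sketch of the stripping step in base R. The helper name `strip_charge` is hypothetical and not part of recetox-xMSannotator or rcdk; it assumes the charge is written only with `+` and `-` characters in the formula string. The cleaned string would then be what gets passed on to get.formula.

```r
# Hypothetical helper (not an existing function in the package or in rcdk).
# Assumption: charges appear only as "+" / "-" characters in the formula.
strip_charge <- function(formula) {
  # Inside a character class, "+" is literal and a trailing "-" is literal.
  gsub("[+-]", "", formula)
}

formulas <- c("C5H13ClN+", "C6H12O6", "C2H3O2-")
neutral_formulas <- strip_charge(formulas)
print(neutral_formulas)
```

Note that this drops the charge information entirely, so the electron mass (about 0.00055 Da per unit charge) is ignored in whatever is computed from the stripped formula.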

The more significant issue here is how to deal with naturally occurring ions during simple annotation. As discussed with @hechth, if we pass [M+] to the simple annotation as a possible adduct, it will increase the computation time: the match_by_mass function will end up doing one more iteration of matching every peak against every database compound, which is unnecessary because only a small subset of compounds naturally occur as ions. A possible solution would be to add a step to the simple annotation that compares measured peaks only to the ions in the compound database.
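Roughly, such an extra step could look like the sketch below. Everything here is an assumption for illustration: the function name `match_ions_by_mass`, the column names `mz`, `monoisotopic_mass` and `molecular_formula`, the ppm tolerance, and the idea that for a singly charged natural ion the database mass can be compared to the measured m/z directly. It does not reflect the actual match_by_mass signature.

```r
# Hypothetical sketch of an ion-only matching step (base R only).
# Column names and the ppm default are assumptions, not the package's API.
match_ions_by_mass <- function(peaks, compounds, ppm = 5) {
  # Keep only compounds whose formula ends in a charge sign.
  is_ion <- grepl("[+-]$", compounds$molecular_formula)
  ions <- compounds[is_ion, , drop = FALSE]

  # Cartesian product of peaks and ions; small, because few compounds are ions.
  pairs <- merge(peaks, ions, by = NULL)

  # Assumption: singly charged ion, so measured m/z ~ mass of the ion.
  tolerance <- pairs$monoisotopic_mass * ppm * 1e-6
  pairs[abs(pairs$mz - pairs$monoisotopic_mass) <= tolerance, , drop = FALSE]
}

# Toy input for illustration only.
peaks <- data.frame(peak = 1:3, mz = c(104.1071, 180.0634, 250.0000))
compounds <- data.frame(
  molecular_formula = c("C5H14NO+", "C6H12O6"),
  monoisotopic_mass = c(104.1070, 180.0634),
  stringsAsFactors = FALSE
)

hits <- match_ions_by_mass(peaks, compounds)
print(hits)
```

The neutral compound is never compared against any peak here, which is the point: the cost scales with the number of ions in the database rather than with the whole compound table.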

hechth commented 2 years ago

Also, treating all compounds as possible [M+] options will introduce a lot of false positives.

@ElliottJP what's your take on this? How often do we observe [M+] ions of compounds that are not already ions before being ionised in the MS?

hechth commented 2 years ago

@maximskorik Is this handled or fixed by now? I can't remember whether we already addressed this or not.

maximskorik commented 2 years ago

It's not fixed yet, as we have not decided how to deal with natural ions during the simple annotation. A temporary workaround is to remove the "C5H13ClN+" entry from the compound table, since that's the only charged one in there. Stripping the +/- signs from the formulae during the isotope pattern computation should also fix it.
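For the temporary workaround, dropping charged entries before annotation could be done along these lines. This is only a sketch: the table layout and the column name `molecular_formula` are assumptions, and the second row is a placeholder neutral compound.

```r
# Hypothetical compound table; the real table's columns may differ.
compound_table <- data.frame(
  molecular_formula = c("C5H13ClN+", "C6H12O6"),
  stringsAsFactors = FALSE
)

# Flag formulae that end in a charge sign and keep the rest.
is_charged <- grepl("[+-]$", compound_table$molecular_formula)
removed_formulas <- compound_table$molecular_formula[is_charged]
neutral_table <- compound_table[!is_charged, , drop = FALSE]

message("Dropped charged entries: ", paste(removed_formulas, collapse = ", "))
print(neutral_table)
```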