rformassspectrometry / MetaboCoreUtils

Core utilities for metabolomics.
https://rformassspectrometry.github.io/MetaboCoreUtils/index.html
8 stars 6 forks source link

invalid elements/isotopes in countElements #64

Closed sgibb closed 1 year ago

sgibb commented 1 year ago

While looking at #63 I recognize that countElements silently drops invalid elements if at least one valid element was found because of its very complex regular expression (that matches mostly valid element names):

countElements("C1Z")
# $C1Z
# C 
# 1 
countElements("Foo")
# $Foo
# F 
# 1 

In contrast if the invalid element is given without any valid one:

countElements("Z")

# $Z
# named integer(0)
#
# Warning message:
# In (function (xx, rr)  :
#  The following names are not valid and are dropped: 1

Because of that containsElements("C1", c("Z", "C1Z")) yield c(TRUE, TRUE) instead of c(FALSE, FALSE). While it is easy to test if countElements returns an integer(0) (e.g. for "Z") it is not possible to test for dropped elements if there is at least one valid element (e.g. for "C1Z", we could test pasteElements(countElements("C1Z")) == "C1Z"), which would be error prone and slow).

Shouldn't we return NA if at least one element name is invalid (instead of currently silently dropping it or throw a warning if there is at least one valid element)?

jorainer commented 1 year ago

Agree - reporting NA in these cases would make sense (and is in line with the general NA handling in R).

jorainer commented 1 year ago

I'm perfectly fine with your solution! Does that mean we need to revert the fix in #62 ? in any rate, please update version and add NEWS and then merge. Thanks!

sgibb commented 1 year ago

No #62 is still needed (we could maybe drop the is (anyNA(x)) check in .sum_elements but adductFormula doesn't generate a named list and will fail).

Closed by #65