Yiguan / mutation_literature

We summarised the (germline) de novo mutation rates across the eukaryotes (until 21 September 2022).

address possible data loss in conversion of bb$Events #5

Open arlin opened 1 year ago

arlin commented 1 year ago

Line 169 gives a warning because some values in bb$Events that are not plain integers are being coerced. Here are the 5 values containing non-digit characters before the conversion on line 169:


> bb$Events[grep("[^0-9]", bb$Events)]
[1] "28.8"  "1164?" "704?"  "73?"   "27?" 

The call to as.integer will handle the NAs in bb$Events as expected, but it will truncate 28.8 to 28 (rather than rounding to 29) and it will convert the values with question marks to NA:


> # example of what happens
> nonintegers <- bb$Events[grep("[^0-9]", bb$Events)]
> tmp <- c(1, 2, 3, NA, NA, nonintegers)
> tmp
 [1] "1"     "2"     "3"     NA      NA      "28.8"  "1164?" "704?"  "73?"   "27?"  
> as.integer(tmp)
 [1]  1  2  3 NA NA 28 NA NA NA NA
Warning message:
NAs introduced by coercion 

This code will work if you want to round 28.8 to 29 and keep the values with question marks:

> round(as.numeric(gsub("\\?", "", tmp)))
 [1]    1    2    3   NA   NA   29 1164  704   73   27
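
If it matters later which counts were marked as uncertain, one option is to record the question marks before stripping them and then check that the conversion introduced no new NAs. This is only a sketch: `events` is a hypothetical stand-in for bb$Events built from the values shown above, and `uncertain` and `cleaned` are names made up for illustration.

```r
# Hypothetical stand-in for bb$Events, using the values shown in this issue
events <- c("1", "2", "3", NA, NA, "28.8", "1164?", "704?", "73?", "27?")

# Remember which entries carried a question mark (NA entries count as FALSE)
uncertain <- !is.na(events) & grepl("?", events, fixed = TRUE)

# Strip the question marks, round to the nearest whole number, store as integer
cleaned <- as.integer(round(as.numeric(gsub("?", "", events, fixed = TRUE))))

# Stop if the conversion created an NA that was not already in the input
stopifnot(identical(is.na(cleaned), is.na(events)))
```

Keeping `uncertain` as a separate logical vector means the flag could be stored in its own column instead of being dropped. Note that round() in R rounds exact halves to the even digit, which would only matter if a value such as "28.5" turned up.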