yufree / xMSannotator

fork from https://sourceforge.net/projects/xmsannotator/

Bug/ Error with using CustomDB in multilevelannotation #3

Closed hhabra closed 5 years ago

hhabra commented 5 years ago

Hi,

I found a bug that leads to failure of xMSannotator when using a customDB.

According to the instructions for customDB in the help menu for "multilevelannotation":

customDB Custom database. Run: data(custom_db); head(custom_db) to see more details on formatting. Set to NA to turn off this option

I run data(custom_db) and find a dataframe with the column order: "ID", "Name", "MonoisotopicMass", and "Formula"

The order of the columns is very important, because within multilevelannotation(), there is this piece of code on line 383:

mz_search_list <- lapply(1:dim(inputmassmat)[1], function(m){
    adduct_names <- as.character(adduct_names)

    mz_search_list <- get_mz_by_monoisotopicmass(
        monoisotopicmass = as.numeric(as.character(inputmassmat[m,4])),
        dbid = inputmassmat[m, 1],
        name = as.character(inputmassmat[m,2]),
        formula = as.character(inputmassmat[m,3]),
        queryadductlist = adduct_names,
        adduct_table = adduct_table)

    return(mz_search_list)
})

In this bit of code, the function call assumes that the monoisotopic mass is in the 4th column and the formula is in the 3rd column. In custom_db it is the other way around: "MonoisotopicMass" is the 3rd column and "Formula" is the 4th. This error leads to an mz_search_list with all NA for the mz column. Unfortunately, the error is not caught for at least one hundred more lines, and it took many hours to trace the problem back to the code above.
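The failure mode can be reproduced without the package. The sketch below uses a toy data frame standing in for custom_db (the two rows and their values are illustrative assumptions, not the shipped data) and shows that reading column 4 as the mass yields NA when the formula sits there.

```r
# Toy stand-in for custom_db with the column order reported above.
# The rows are illustrative only; they are not taken from the package data.
custom_db <- data.frame(
  ID = c("C001", "C002"),
  Name = c("Glucose", "Alanine"),
  MonoisotopicMass = c(180.06339, 89.04768),
  Formula = c("C6H12O6", "C3H7NO2"),
  stringsAsFactors = FALSE
)

# Positional access as in the quoted code: column 4 is treated as the mass.
# Here column 4 holds the formula, so the numeric conversion gives NA.
mass_by_position <- suppressWarnings(as.numeric(as.character(custom_db[1, 4])))

# Name-based access picks up the real mass regardless of column order.
mass_by_name <- as.numeric(custom_db$MonoisotopicMass[1])

print(mass_by_position)
print(mass_by_name)
```

Because the conversion only emits a warning rather than an error, the NA values can travel silently into later steps.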

I would recommend either one of the following:

1) Switching the order of columns 3 and 4 in the object custom_DB.rda in an updated version of the package
2) Switching 3 and 4 in the get_mz_by_monoisotopicmass() function call in the code above
3) Using the actual names of the columns instead of column numbers
4) Having the software check the legitimacy of the resulting mz_search_list (e.g. whether there are "NA" values in the m/z values it's searching) and exit early if the check fails. As it is, the program executes the rather slow parLapply step on line 491 before finding an empty levelB_res and exiting.
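Suggestions 3 and 4 could be combined into an up-front check, roughly as sketched below. The helper validate_custom_db() is hypothetical and not part of xMSannotator; the column names are the ones reported for custom_db above, and the example rows are illustrative.

```r
# Hypothetical helper (not part of xMSannotator): validate a custom database
# by column name before any slow annotation step runs.
validate_custom_db <- function(db) {
  required <- c("ID", "Name", "Formula", "MonoisotopicMass")
  missing_cols <- setdiff(required, colnames(db))
  if (length(missing_cols) > 0) {
    stop("customDB is missing column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  # Convert the mass column by name and fail early on non-numeric entries.
  mass <- suppressWarnings(as.numeric(as.character(db[["MonoisotopicMass"]])))
  bad_rows <- which(is.na(mass))
  if (length(bad_rows) > 0) {
    stop("customDB has non-numeric MonoisotopicMass in row(s): ",
         paste(bad_rows, collapse = ", "))
  }
  invisible(db)
}

# Illustrative rows, not taken from the package data.
db <- data.frame(
  ID = c("C001", "C002"),
  Name = c("Glucose", "Alanine"),
  MonoisotopicMass = c(180.06339, 89.04768),
  Formula = c("C6H12O6", "C3H7NO2"),
  stringsAsFactors = FALSE
)
validate_custom_db(db)  # passes silently

# Simulate the mix-up: formulas end up in the column labelled as mass.
swapped <- db
names(swapped)[3:4] <- names(swapped)[4:3]
result <- tryCatch(validate_custom_db(swapped),
                   error = function(e) conditionMessage(e))
print(result)
```

A check like this costs almost nothing compared with the parallel step and reports the offending rows directly.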

Thanks!

yufree commented 5 years ago

Hi, I see your point, and it is a bug. I fixed it in this repo using option 2 to keep the change minimal. You can reinstall the package from here with devtools::install_github('yufree/xMSannotator')

I hope @kuppal2 can fix this in the SourceForge repo if possible.

Thanks!

kuppal2 commented 5 years ago

Hi,

Have you looked at the SourceForge website? There is a custom database page with multiple examples: https://sourceforge.net/projects/xmsannotator/files/CustomDatabases/

Please do not update the code with option 2, as it will cause other issues. The problem is not with the code but with the Rda file. It is easy to just switch the columns, as in option 1.

Thank you, Karan
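For anyone preparing their own custom database in the meantime, option 1 amounts to reordering the columns so that the formula is 3rd and the mass is 4th, which is the layout the positional code quoted earlier expects. A minimal sketch, using a toy data frame with illustrative rows rather than the shipped custom_db:

```r
# Toy data frame in the order reported for custom_db (rows are illustrative).
custom_db <- data.frame(
  ID = c("C001", "C002"),
  Name = c("Glucose", "Alanine"),
  MonoisotopicMass = c(180.06339, 89.04768),
  Formula = c("C6H12O6", "C3H7NO2"),
  stringsAsFactors = FALSE
)

# Order implied by the positional indexing: 1 = ID, 2 = name, 3 = formula, 4 = mass.
expected_order <- c("ID", "Name", "Formula", "MonoisotopicMass")
reordered_db <- custom_db[, expected_order]

# Positional lookups now line up with the intended fields.
formula <- as.character(reordered_db[1, 3])
mass <- as.numeric(as.character(reordered_db[1, 4]))

print(colnames(reordered_db))
print(formula)
print(mass)
```

Selecting by name rather than by index keeps the reordering correct even if the input columns arrive in a different order.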

yufree commented 5 years ago

Hi Karan,

Thanks for your reply. I have switched to option 1.

Thanks!