RECETOX / MFAssignR

The MFAssignR package was designed for multi-element molecular formula (MF) assignment of ultrahigh resolution mass spectrometry measurements. A number of tools for internal mass recalibration, MF assignment, signal-to-noise evaluation, and unambiguous formula selections are provided.
GNU General Public License v3.0

Recal: Error... replacement has 17 rows, data has 1050 #47

Closed: KristinaGomoryova closed this issue 2 months ago

KristinaGomoryova commented 2 months ago

This is related to https://github.com/RECETOX/galaxytools/issues/576. Apparently the RECETOX version of MFAssignR has a bug in the Recal function, which works fine in the original version.

The problem is here:

processKnown <- function(rest, known, kmd_col, z_col, num_col, type, step_limit, remove_indices) {
  names(known)[2] <- "base_mass"

  step_result <- merge(rest, known, by.x = c(kmd_col, z_col), by.y = c(kmd_col, z_col))
  step_result[[num_col]] <- round((step_result$Exp_mass - step_result$base_mass) / step_limit)
  step_result[[type]] <- step_result[[type]] + step_result[[num_col]]
  step_result$Type <- type
  # step_result$form <- paste(step_result[c("C", "H", "O", "N", "S", "P", "E", "S34", "N15", "D", "Cl", "Fl", "Cl37", "M", "NH4", "POE", "NOE")], sep = "_")
  step_result$form <- paste(step_result$C, step_result$H, step_result$O, step_result$N, step_result$S,  step_result$P, step_result$E, step_result$S34, step_result$N15, step_result$D, step_result$Cl, step_result$Fl, step_result$Cl37, step_result$M, step_result$NH4, step_result$POE, step_result$NOE, sep = "_")
  step_result <- step_result[abs(step_result[[num_col]]) <= step_limit, ]
  step_result <- step_result[-remove_indices, ]
  return(step_result)
}

First point: this line

step_result$form <- paste(step_result[c("C", "H", "O", "N", "S", "P", "E", "S34", "N15", "D", "Cl", "Fl", "Cl37", "M", "NH4", "POE", "NOE")], sep = "_")

needs to be replaced by this:

step_result$form <- paste(step_result$C, step_result$H, step_result$O, step_result$N, step_result$S, step_result$P, step_result$E, step_result$S34, step_result$N15, step_result$D, step_result$Cl, step_result$Fl, step_result$Cl37, step_result$M, step_result$NH4, step_result$POE, step_result$NOE, sep = "_")
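The "replacement has 17 rows, data has 1050" message is consistent with how paste() treats a data frame passed as a single argument: each column is collapsed to one string, so the result has one element per column (17 here) rather than one per row. A minimal sketch on a toy data frame (the names `toy`, `cols`, `wrong` and `right` are made up for illustration) shows the effect and a shorter row-wise alternative using do.call():

```r
# Toy stand-in for step_result: 3 rows, 2 element columns.
toy <- data.frame(C = 1:3, H = 4:6)
cols <- c("C", "H")

# Passing the data frame as ONE argument: paste() coerces each column
# to a single string, so the result has one element per column.
wrong <- paste(toy[cols], sep = "_")
length(wrong)  # 2, not 3; assigning this to toy$form would fail

# Passing each column as its OWN argument pastes row by row.
# do.call() spreads the columns out as separate arguments to paste().
right <- do.call(paste, c(toy[cols], sep = "_"))
right  # "1_4" "2_5" "3_6"

toy$form <- right
```

The explicit `paste(step_result$C, step_result$H, ...)` form in the fix does the same thing as the do.call() variant; which one to use is a matter of taste.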

Additionally, we need to pass H instead of H2 in Step3:

Step3 <- processKnown(Rest, RecalList[c(1:21, 27, 28)], "KMD_H2", "z_H2", "H2_num", "H", step_H2, c(10, 31))
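A plausible reason for this change, stated here as an assumption: processKnown() uses `type` as a column name in `step_result[[type]]`, and the column listings further down contain an "H" column but no "H2" column. Indexing a data frame by a name that does not exist returns NULL, and the subsequent assignment then fails. A small sketch with a made-up data frame `toy`:

```r
# Toy stand-in: an "H" count column and the step counter, but no "H2" column.
toy <- data.frame(H = c(4L, 6L, 8L), H2_num = c(1L, 0L, -1L))

# A missing column yields NULL rather than an error.
is.null(toy[["H2"]])  # TRUE

# NULL + vector gives a zero-length vector, so assigning it back fails.
bad <- tryCatch({
  toy[["H2"]] <- toy[["H2"]] + toy[["H2_num"]]
  "no error"
}, error = function(e) conditionMessage(e))
bad

# With the existing "H" column the update works as intended.
toy[["H"]] <- toy[["H"]] + toy[["H2_num"]]
toy$H  # 5 6 7
```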

And then, the problem is that the column names of Step2 and Step3 don't match:

> colnames(Step2)
 [1] "KMD_O"       "z_O"         "Abundance.x" "Exp_mass"    "NM"          "KM_O"        "KM_H2"       "KMD_H2"     
 [9] "z_H2"        "Abundance.y" "base_mass"   "C"           "H"           "O"           "N"           "S"          
[17] "P"           "E"           "S34"         "N15"         "D"           "Cl"          "Fl"          "Cl37"       
[25] "M"           "NH4"         "POE"         "NOE"         "Z"           "C13_mass"    "O_num"       "Type"       
[33] "form"       
> colnames(Step3)
 [1] "KMD_H2"      "z_H2"        "Abundance.x" "Exp_mass"    "NM"          "KM_O"        "KMD_O"       "z_O"        
 [9] "KM_H2"       "Abundance.y" "base_mass"   "C"           "H"           "O"           "N"           "S"          
[17] "P"           "E"           "S34"         "N15"         "D"           "Cl"          "Fl"          "Cl37"       
[25] "M"           "NH4"         "POE"         "NOE"         "Z"           "C13_mass"    "H2_num"      "Type"       
[33] "form"

So we either need to pass the columns differently to the function, or set the column names with colnames() inside processKnown().
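One way to do this can be sketched as follows. This is only an illustration under stated assumptions, not necessarily what the actual fix does: the helper `harmonise_step`, the shared column name `step_num`, and the toy data frames are all hypothetical. The idea is to rename the step-specific counter column (O_num, H2_num) to a common name and put the columns in a fixed order before combining the steps:

```r
# Toy stand-ins for Step2 and Step3 with the same kind of mismatch:
# different column order and a step-specific counter column.
step2 <- data.frame(KMD_O = 0.10, z_O = 2, Exp_mass = 200.5,
                    KMD_H2 = 0.30, z_H2 = 1, O_num = 1, Type = "O")
step3 <- data.frame(KMD_H2 = 0.20, z_H2 = 3, Exp_mass = 300.7,
                    KMD_O = 0.40, z_O = 0, H2_num = -1, Type = "H")

# Hypothetical helper: rename the counter column to a shared name
# and return the columns in one agreed order.
harmonise_step <- function(step, num_col, col_order) {
  names(step)[names(step) == num_col] <- "step_num"
  step[col_order]
}

col_order <- c("Exp_mass", "KMD_O", "z_O", "KMD_H2", "z_H2", "step_num", "Type")

# With identical names and order, the steps can be stacked safely.
combined <- rbind(
  harmonise_step(step2, "O_num", col_order),
  harmonise_step(step3, "H2_num", col_order)
)
combined
```

Doing the renaming inside processKnown() itself would keep the callers unchanged, at the cost of losing the step-specific column name in the output.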

KristinaGomoryova commented 2 months ago

This is now resolved in https://github.com/RECETOX/MFAssignR/pull/46

Both the RECETOX and skschum versions now produce the same output.