rickhelmus / patRoon

Workflow solutions for mass-spectrometry based non-target analysis.
https://rickhelmus.github.io/patRoon/
GNU General Public License v3.0

Errors in DA formula calculation and during loading MSP libraries #72

Closed MK3491 closed 1 year ago

MK3491 commented 1 year ago

Hi, thanks for creating such a great package.
With the recent version 2.2, I have two minor issues, and I am unsure whether they are due to errors in my local patRoon setup or in the code.

1) When calculating formulas using Bruker DataAnalysis, I get:

Error in .jcall(molecularFormula, "V", "setCharge", D) : 
  RcallMethod: invalid object parameter

Obviously, no formulas get calculated. However, I can see that DA has started and that the correct MS data file has been loaded.

2) When loading any library in the MSP format, the records seemingly load fine, except that at the end I get:

Error: Index out of bounds: [index='DB_ID'].

Unfortunately, that happens with every MSP file I have tried so far (about 20 of them, ranging in size from 25 to 554041 spectra).

Please let me know if I can be of any help with resolving these issues. I would love to see the DA formula calculation working again.

Best regards, MK3491

rickhelmus commented 1 year ago

Hello,

Thanks for bringing this up, I think there aren't many people that use the DataAnalysis algorithms via patRoon, so it's good to get some feedback.

I think the first issue is related to the mass determination of candidate formulae. I wonder if DataAnalysis is returning a format that is incompatible with the related rcdk function (get.formula()). It would be helpful if you could somehow narrow the problem down, e.g. by calculating only a small subset of your data. Alternatively, would you be able to share some data where you do see the error?

For the 2nd issue: the current MSP loader was mainly optimized and tested for MassBank files (.eu/MoNA). Are you getting the file from somewhere else? If you can let me know where you got the file from (or share it), then I can see where things go wrong.

Thanks, Rick

MK3491 commented 1 year ago

Hi, I am not sure how to proceed with issue 1 because I can no longer reproduce it: the workflow now breaks during MS peak list generation with a different error message:

"""
mslists <- generateMSPeakLists(fGroups_filtered,
                               "brukerfmf",
                               avgFGroupParams = avgMSListParams)
Error in if (DA[["Analyses"]][[i]][["Path"]] == analysis) return(i) : argument is of length zero
"""

(which traces back to getFIndex())
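For readers unfamiliar with this R error: `if ()` fails with "argument is of length zero" when its condition is empty, for example because one side of the comparison is NULL. The sketch below reproduces that mechanism with plain lists standing in for the DataAnalysis analyses collection and shows a guarded lookup. The function name `findAnalysisIndex` and the example paths are hypothetical; this is not patRoon's getFIndex() and makes no claim about the actual cause in this thread.

```r
# Hypothetical stand-in for the DataAnalysis analyses collection: plain lists
# instead of COM objects. 'findAnalysisIndex' is illustrative, not patRoon code.
findAnalysisIndex <- function(analyses, analysis) {
    for (i in seq_along(analyses)) {
        path <- analyses[[i]][["Path"]]
        # NULL == "x" yields logical(0), which if() rejects, so check length first
        if (length(path) == 1 && !is.na(path) && path == analysis)
            return(i)
    }
    NA_integer_  # no matching analysis found
}

# The first entry has no "Path" element, which would break an unguarded if()
analyses <- list(list(Name = "a"), list(Path = "C:/data/b.d"))
idx <- findAnalysisIndex(analyses, "C:/data/b.d")
missing <- findAnalysisIndex(analyses, "C:/data/other.d")
```

Returning NA when nothing matches lets the caller report which analysis could not be found instead of stopping inside the loop.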

Anyway, I think there must be something wrong with these data files: they were processed a while ago with FindMolecularFeatures and filtered by hand (removal of features without MS2). The resulting mess was now loaded into patRoon and processed without rerunning FMF. In any case, I don't think there is any point in investigating issue 1 any further. My apologies for wasting your time.

As for issue 2, I included three small MSP files. They originated from the MS-DIAL web page, the GNPS public libraries, and the Golm Metabolome Database. The first two give me the same error. The Golm file uses a different way of storing [m/z, intensity] pairs and produces another error message.

Best regards, MK3491

rickhelmus commented 1 year ago

Hello,

Issue 1: unfortunately, this sounds like an issue I have been hitting recently on a system with Windows 11. I'm not really sure where things go wrong: either in DataAnalysis itself or in the R package interfacing with it (RDCOMClient). For now, I managed to get things to work either by rebooting (ugh) or by using base R directly (i.e. not via RStudio)...

Issue 2: I missed any attachments, but had a look at an MSP from MS-DIAL. One problem seems to be that there are different 'flavors' of MSP. In this case, the file seems to lack 'DB#' fields, which are used by patRoon to distinguish spectra records. I guess I could work around this by automatically generating unique identifiers. I'm a bit packed now with other things, but I'll try to have a look at this later. I didn't check the others, but if the Golm file uses a different way of storing spectral data, that would also need specific support in patRoon.
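The idea of generating identifiers for records without a 'DB#' field could look roughly like the sketch below. It is only an illustration in base R, not the loader used by patRoon: the function name `readMSPRecords` and the "GEN_" prefix are made up here, and it assumes records are separated by blank lines, header fields look like "Key: value" with a key starting with a letter, and peak lines hold space-separated m/z and intensity values.

```r
# Minimal sketch: split MSP text into records and make sure each has an ID.
# 'readMSPRecords' and the "GEN_" prefix are hypothetical, not patRoon API.
readMSPRecords <- function(lines, idField = "DB#") {
    lines <- trimws(lines)
    # Assumption: records are separated by one or more blank lines
    isBlank <- !nzchar(lines)
    recIndex <- cumsum(isBlank)[!isBlank]
    blocks <- unname(split(lines[!isBlank], recIndex))

    fieldPattern <- "^[A-Za-z][^:]*:"
    records <- lapply(blocks, function(block) {
        # Header fields start with a letter and contain a colon; all other
        # lines (e.g. "41 100") are kept as raw peak lines
        isField <- grepl(fieldPattern, block)
        keys <- sub("^([A-Za-z][^:]*):.*$", "\\1", block[isField])
        vals <- trimws(sub(fieldPattern, "", block[isField]))
        list(fields = setNames(as.list(vals), keys), peaks = block[!isField])
    })

    # Fall back to a generated identifier when the ID field is absent
    for (i in seq_along(records)) {
        if (is.null(records[[i]]$fields[[idField]]))
            records[[i]]$fields[[idField]] <- paste0("GEN_", i)
    }
    records
}

# Two records: the first carries a DB# field, the second does not
msp <- c("NAME: A", "DB#: 7", "Num Peaks: 1", "41 100", "",
         "NAME: B", "Num Peaks: 1", "50 10")
recs <- readMSPRecords(msp)
ids <- sapply(recs, function(r) r$fields[["DB#"]])
```

A real loader would also need to make sure generated identifiers cannot collide with identifiers already present in the file, and to handle flavors that store peaks differently.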

I was hoping everyone would stick with MassBank ;-)

rickhelmus commented 1 year ago

I just pushed some changes that should help with this issue. I tested several MSP files from MS-DIAL, GNPS and Golm, and they seem to load fine now. Any more feedback would be appreciated though :-) For now I'm closing the issue, but feel free to re-open if needed!