rformassspectrometry / MsBackendMsp

Mass Spectrometry Data Backend for MSP Files
https://rformassspectrometry.github.io/MsBackendMsp/

Error in if (is.unsorted(ms[, 1L])) ms <- ms[order(ms[, 1L]), ]: missing value where TRUE/FALSE needed #13

Open kozo2 opened 1 year ago

kozo2 commented 1 year ago

I tried to read my own msp file as follows and got the following error.

> library(MsBackendMsp)
> sp <- Spectra("./myLibrary.msp", source = MsBackendMsp())
Start data import from 1 files ... Error: BiocParallel errors
  1 remote errors, element index: 1
  0 unevaluated and other errors
  first remote error:
Error in if (is.unsorted(ms[, 1L])) ms <- ms[order(ms[, 1L]), ]: missing value where TRUE/FALSE needed
In addition: Warning message:
In serialize(data, node$con) :
  'package:stats' may not be available when loading
> 

Let me know if there is anything I should check.

jorainer commented 1 year ago

Sorry, somehow I overlooked this issue. Could you provide a (minimal) example MSP file for which this fails? I've observed similar errors for MSP files that provide peak annotations in addition to m/z and intensity values.

CLUES-Emory commented 7 months ago

Hi, I also came across this error when reading .MSP files I had created. The cause was that one of the fields I had added lacked a colon separator, e.g. my file had the field "INSTRUMENT Thermo Exploris240". Once I changed the field to "INSTRUMENT: Thermo Exploris240" I was able to read the file with no issues.
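A minimal base R sketch of how such lines could be spotted before import. The helper `find_fields_without_colon()` is hypothetical (it is not part of MsBackendMsp), and it assumes that peak lines start with a number while every other non-empty line is a metadata field:

```r
# Hypothetical helper (not part of MsBackendMsp): return the indices of
# metadata lines that lack the colon of the "name: value" format.
find_fields_without_colon <- function(lines) {
    lines <- trimws(lines)
    # Assumption: peak lines start with a number (the m/z value).
    is_peak <- grepl("^[0-9.+-]", lines)
    is_blank <- lines == ""
    has_colon <- grepl(":", lines, fixed = TRUE)
    which(!is_peak & !is_blank & !has_colon)
}

# Small in-memory example mimicking the problem described above.
msp <- c(
    "NAME: C00001",
    "INSTRUMENT Thermo Exploris240",
    "Num Peaks: 2",
    "59.5896\t1.09",
    "65.8189\t1.12",
    ""
)
bad <- find_fields_without_colon(msp)
msp[bad]
```

For a real file, the lines could be obtained with `readLines("./myLibrary.msp")` and passed to the same function.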

sneumann commented 7 months ago

Hi @CLUES-Emory and @kozo2, thanks for reporting! Could you send a (minified) broken example MSP file? At a minimum, MsBackendMsp should throw a more informative error message. Also, can you report which software (and software version) created the broken MSP files? Yours, Steffen

kozo2 commented 7 months ago

I'm sorry, I forgot that I created this issue and ended up losing the msp 🙇 I would appreciate it if @CLUES-Emory could share the data with us 🙏

CLUES-Emory commented 7 months ago

Hi @sneumann and @kozo2, see attached for a working and a non-working MSP file (I had to change the file extension to .txt so GitHub would allow the upload, but you can change it back to .msp). The error was due to the missing colon after INSTRUMENT. Once I corrected this, the file loaded with no issues. However, there are still a few quirks I've noticed. For example, MsBackendMsp always seems to set the MS level to 2, even though the MSP file specifies msLevel 1 (this is GC data).

The MSP file itself was created with my own code, pasted below for reference. Thanks!

library(data.table)  # provides fwrite()

#Write msp file
cat(
    paste("NAME", clust_ii, sep = ": "),
    "msLevel: 1",
    paste("RETENTIONTIME", round(as.numeric(rc_cluster_ft[ii, 2]), 2), sep = ": "),
    paste("RETENTIONINDEX", round(as.numeric(rc_cluster_ft[ii, 2]), 2), sep = ": "),
    "INSTRUMENTTYPE: GC-HRMS",
    "INSTRUMENT: Thermo Exploris240",
    "IONMODE: Positive",
    "IONIZATION: EI",
    "Spectrum_type: in-source",
    paste("Notes: Date processed", Sys.Date(), sep = " "),
    paste("COMMENT: ",
          "Study ID: ", study_id, "; ",
          "Normalized Spectral Entropy: ", round(ms_entropy, 3), sep = ""),
    paste("Num Peaks: ", nrow(ms_spectra), sep = ""),
    sep = "\n", file = msp_output, append = TRUE)

#Add spectra
fwrite(x = ms_spectra,
       file = msp_output,
       sep = "\t",
       col.names = FALSE,
       append = TRUE)

#Add empty line after spectra
cat("",
    sep = "\n", file = msp_output, append = TRUE)
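
One way to avoid a missing separator when writing records like this is to generate every metadata line from a named list, so the ": " is always inserted by the code. The sketch below uses base R only. `format_msp_record()` is a hypothetical helper, and the field names and peak values are illustrative, taken from this thread:

```r
# Hypothetical helper: build the lines of one MSP record from a named list of
# metadata fields and a two-column peak matrix (m/z, intensity).
format_msp_record <- function(fields, peaks) {
    stopifnot(!is.null(names(fields)), all(nzchar(names(fields))))
    # Every metadata line is built as "name: value", so the colon cannot be lost.
    meta <- paste0(names(fields), ": ",
                   vapply(fields, as.character, character(1)))
    peak_lines <- paste(peaks[, 1], peaks[, 2], sep = "\t")
    # The trailing "" becomes the empty line that separates records.
    c(meta, paste0("Num Peaks: ", nrow(peaks)), peak_lines, "")
}

peaks <- cbind(mz = c(59.5896, 65.8189), intensity = c(1.09, 1.12))
record <- format_msp_record(
    list(NAME = "C00001", msLevel = 1L, INSTRUMENT = "Thermo Exploris240"),
    peaks
)

# Append the record to a file; a temporary file is used here for illustration.
out <- tempfile(fileext = ".msp")
con <- file(out, open = "a")
writeLines(record, con)
close(con)
```

`writeLines()` terminates every element with a newline, so the final empty string is written as the blank line between records.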

Bad_MSP.txt Good_MSP.txt

jorainer commented 7 months ago

Thanks! I'll have a look into this

jorainer commented 7 months ago

Note: I've added additional checks and more meaningful error messages in PR #15. That pull request also correctly imports the MS level (if provided in the MSP file).

jorainer commented 7 months ago

@sneumann, it would be nice if you could have a quick look at (and review) the PR :)

jorainer commented 7 months ago

Some feedback on your test files as well, @CLUES-Emory:

With the new version we can read the good MSP (also importing the MS level correctly):

> a <- readMsp("Good_MSP.txt")
> a
DataFrame with 2 rows and 15 columns
         name   msLevel     rtime RETENTIONINDEX INSTRUMENTTYPE
  <character> <integer> <numeric>    <character>    <character>
1      C00001         1   3574.79        3574.79        GC-HRMS
2      C00002         1   3397.41        3397.41        GC-HRMS
          instrument  polarity  IONIZATION Spectrum_type                  Notes
         <character> <integer> <character>   <character>            <character>
1 Thermo Exploris240         1          EI     in-source Date processed 2024-..
2 Thermo Exploris240         1          EI     in-source Date processed 2024-..
                 COMMENT   Num.Peaks                             mz
             <character> <character>                  <NumericList>
1 Study ID: HRE0041; N..          59  59.5896, 65.8189,107.0492,...
2 Study ID: HRE0041; N..          12    189.164,191.143,219.094,...
           intensity   dataOrigin
       <NumericList>  <character>
1 1.09,1.12,8.48,... Good_MSP.txt
2 2.54,3.48,1.03,... Good_MSP.txt

With the bad one, you now get a hopefully more informative error message:

> a <- readMsp("Bad_MSP.txt")
Error: MSP format error. Make sure that data for multiple spectra are separated by an empty new line and that general spectrum metadata/information is provided in 'name: value' format (i.e. name and value of the metadata field separated by a ":").
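
The other requirement named in this message, an empty line between spectra, could be checked roughly as follows. This is only a base R sketch under the assumption that every spectrum has exactly one NAME field. `count_names_per_record()` is a hypothetical helper and not part of MsBackendMsp:

```r
# Hypothetical helper: split MSP lines into blank-line separated records and
# count the NAME fields in each record.
count_names_per_record <- function(lines) {
    lines <- trimws(lines)
    # The record index increases by one at every blank line.
    record_id <- cumsum(lines == "")
    keep <- lines != ""
    is_name <- grepl("^name:", lines, ignore.case = TRUE)
    unname(vapply(split(is_name[keep], record_id[keep]), sum, integer(1)))
}

# Two spectra with the separating empty line missing.
unseparated <- c("NAME: A", "Num Peaks: 1", "100\t1",
                 "NAME: B", "Num Peaks: 1", "200\t1", "")
# The same two spectra, correctly separated.
separated <- c("NAME: A", "Num Peaks: 1", "100\t1", "",
               "NAME: B", "Num Peaks: 1", "200\t1", "")

n_unseparated <- count_names_per_record(unseparated)
n_separated <- count_names_per_record(separated)

# Any count other than 1 hints at a missing separator (or a missing NAME).
any(n_unseparated != 1L)
any(n_separated != 1L)
```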