Open kozo2 opened 1 year ago
Sorry, somehow I overlooked this issue. Could you provide a (minimal) example MSP file for which this is not working? I've observed similar things for MSP files that provide peak annotations in addition to m/z and intensity values.
Hi, also came across this error reading .MSP files I had created. The cause of the error was one of the fields I added not having a colon separator, e.g. my file had the field "INSTRUMENT Thermo Exploris240". Once I updated the field to "INSTRUMENT: Thermo Exploris240" I was able to read the file with no issues.
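For anyone hitting the same error, here is a minimal base R sketch for locating header lines that lack the colon separator before trying to import the file. `find_missing_colon()` is a hypothetical helper, not part of MsBackendMsp, and it assumes that peak lines start with a number and that metadata field names never do.

```r
# Hypothetical helper (not part of MsBackendMsp): return the indices of
# metadata lines that have no ":" between field name and value.
find_missing_colon <- function(lines) {
  lines <- trimws(lines)
  # Assumption: peak lines start with a number (the m/z value).
  is_peak <- grepl("^[0-9.+-]", lines)
  # Empty lines separate spectra and are not metadata.
  is_blank <- lines == ""
  which(!is_peak & !is_blank & !grepl(":", lines, fixed = TRUE))
}

# Small in-memory example; for a real file use readLines("your_file.msp").
msp <- c("NAME: C00001",
         "INSTRUMENT Thermo Exploris240",
         "Num Peaks: 1",
         "59.5896\t1.09",
         "")
bad <- find_missing_colon(msp)
msp[bad]
```

Here `bad` points at the second line, the `INSTRUMENT` field without a colon.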
Hi @CLUES-Emory and @kozo2, thanks for reporting! Could you send a (minified) broken example MSP? At a minimum, MsBackendMsp should throw a more informative error message. Also, can you report which software (and software version) created the broken MSP files? Yours, Steffen
I'm sorry, I forgot that I created this issue and ended up losing the msp 🙇 I would appreciate it if @CLUES-Emory could share the data with us 🙏
Hi @sneumann and @kozo2, See attached for a working and a non-working MSP file (I had to change the file extension to .txt so GitHub would allow the upload, but you can change it back to .msp). The error was due to the missing colon after INSTRUMENT. Once I corrected this, the file loaded with no issues. However, there are still a few quirks I've noticed. For example, MsBackendMsp always seems to set the MS level to 2, even though the msLevel field in the MSP file is 1 (this is GC data).
The MSP file itself was made using code I created, copying and pasting below for reference. Thanks!
library(data.table)  # provides fwrite()

# Write the MSP header: one "name: value" field per line
cat(
  paste("NAME", clust_ii, sep = ": "),
  "msLevel: 1",
  paste("RETENTIONTIME", round(as.numeric(rc_cluster_ft[ii, 2]), 2), sep = ": "),
  paste("RETENTIONINDEX", round(as.numeric(rc_cluster_ft[ii, 2]), 2), sep = ": "),
  "INSTRUMENTTYPE: GC-HRMS",
  "INSTRUMENT: Thermo Exploris240",
  "IONMODE: Positive",
  "IONIZATION: EI",
  "Spectrum_type: in-source",
  paste("Notes: Date processed", Sys.Date(), sep = " "),
  paste("COMMENT: ",
        "Study ID: ", study_id, "; ",
        "Normalized Spectral Entropy: ", round(ms_entropy, 3), sep = ""),
  paste("Num Peaks: ", nrow(ms_spectra), sep = ""),
  sep = "\n", file = msp_output, append = TRUE
)

# Add spectra (tab-separated, no column names)
fwrite(x = ms_spectra,
       file = msp_output,
       sep = "\t",
       col.names = FALSE,
       append = TRUE)

# Add empty line after spectra
cat("", sep = "\n", file = msp_output, append = TRUE)
Thanks! I'll have a look into this
Note: I've added additional checks and more meaningful error messages in PR #15. That pull request also correctly imports the MS level (if provided in the MSP file).
@sneumann , would be nice if you could give a quick look (+review) on the PR :)
Feedback also on your test files @CLUES-Emory :
With the new version we can read the good MSP (also importing the MS level correctly):
> a <- readMsp("Good_MSP.txt")
> a
DataFrame with 2 rows and 15 columns
         name   msLevel     rtime RETENTIONINDEX INSTRUMENTTYPE
  <character> <integer> <numeric>    <character>    <character>
1      C00001         1   3574.79        3574.79        GC-HRMS
2      C00002         1   3397.41        3397.41        GC-HRMS
          instrument  polarity  IONIZATION Spectrum_type                  Notes
         <character> <integer> <character>   <character>            <character>
1 Thermo Exploris240         1          EI     in-source Date processed 2024-..
2 Thermo Exploris240         1          EI     in-source Date processed 2024-..
                 COMMENT   Num.Peaks                            mz
             <character> <character>                 <NumericList>
1 Study ID: HRE0041; N..          59 59.5896, 65.8189,107.0492,...
2 Study ID: HRE0041; N..          12   189.164,191.143,219.094,...
           intensity   dataOrigin
       <NumericList>  <character>
1 1.09,1.12,8.48,... Good_MSP.txt
2 2.54,3.48,1.03,... Good_MSP.txt
With the bad one, you now get a (hopefully) more informative error message:
> a <- readMsp("Bad_MSP.txt")
Error: MSP format error. Make sure that data for multiple spectra are separated by an empty new line and that general spectrum metadata/information is provided in 'name: value' format (i.e. name and value of the metadata field separated by a ":").
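To illustrate the two conventions named in that error message (an empty line between spectra, and 'name: value' metadata), here is a rough base R sketch. It is not the parser used by MsBackendMsp; `split_msp_records()` and `parse_msp_header()` are hypothetical names, and the example data are made up to resemble the test files.

```r
# Hypothetical helper: group lines into spectra, using empty lines as separators.
split_msp_records <- function(lines) {
  blank <- trimws(lines) == ""
  # cumsum(blank) increases at every empty line, giving one group id per spectrum.
  unname(split(lines[!blank], cumsum(blank)[!blank]))
}

# Hypothetical helper: extract 'name: value' fields from one spectrum's lines.
parse_msp_header <- function(lines) {
  header <- lines[grepl(":", lines, fixed = TRUE)]
  # Split at the first ":" only, so values may themselves contain colons.
  pos <- regexpr(":", header, fixed = TRUE)
  values <- trimws(substring(header, pos + 1L))
  names(values) <- trimws(substring(header, 1L, pos - 1L))
  values
}

msp <- c("NAME: C00001", "msLevel: 1", "COMMENT: Study ID: HRE0041",
         "Num Peaks: 1", "59.5896\t1.09", "",
         "NAME: C00002", "msLevel: 1", "Num Peaks: 1", "189.164\t2.54", "")
records <- split_msp_records(msp)
headers <- lapply(records, parse_msp_header)
headers[[1]]
```

A header line without a colon (such as "INSTRUMENT Thermo Exploris240") would be silently skipped by this sketch, which is exactly the kind of line the new check in the package reports as an error instead.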
I tried to read my own msp file as follows and got the following error.
Let me know if there is anything I should check.