lgatto / MSnbase

Base Classes and Functions for Mass Spectrometry and Proteomics
http://lgatto.github.io/MSnbase/

write Spectra object to mgf #508

Closed ricoderks closed 3 years ago

ricoderks commented 4 years ago

Hi,

I would like to write all spectra from a Spectra object to an mgf file. As far as I can see, this is currently not possible without looping/applying over all spectra, but then COM= is written in front of every spectrum. I found a way to adjust the writeMgfData function to handle this. Is this something you are interested in?

I was not able to build the package, so I did not run any automated tests, only some manual ones.

Cheers, Rico

lgatto commented 4 years ago

I assume you are referring to a Spectra object from the MSnbase package. The easiest way would be to convert it to an MSnExp with as(, "MSnExp") and then use the writeMgfData() method.

ricoderks commented 4 years ago

Aaah, perfect! Didn't know this was possible!! I forgot to mention that the Spectra object is from xcms. I'm trying to analyse SWATH data with xcms.

I just tried it, but I get the error:

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘msLevel’ for signature ‘"NULL"’

Cheers, Rico

lgatto commented 4 years ago

xcms doesn't define any Spectra as far as I know, but uses the classes from MSnbase. Looking back, writeMgfData() should work with MSnbase Spectra objects:

> showMethods("writeMgfData")
Function: writeMgfData (package MSnbase)
object="MSnExp"
object="Spectra"
object="Spectrum"

(There's even an example in ?Spectra.) What do you get when you ask for the class of that object?

> class(spl)
[1] "Spectra"
attr(,"package")
[1] "MSnbase"

Could you show what you get when displaying it in the console?

Maybe @jorainer knows if there's something in xcms that alters the instances.

ricoderks commented 4 years ago

The reconstructed MSMS spectra are created with reconstructChromPeakSpectra, and I store them in swath_spectra.

swath_spectrum contains only a few of the reconstructed MSMS spectra from swath_spectra:

> class(swath_spectrum)
[1] "Spectra"
attr(,"package")
[1] "MSnbase"
> swath_spectrum
Spectra with 7 spectra and 3 metadata column(s):
      msLevel     rtime peaksCount |                    ms2_peak_id                                              ms2_peak_cor     peak_id
    <integer> <numeric>  <integer> |                <CharacterList>                                             <NumericList> <character>
  1         2        NA          2 |              CP093961,CP093945                       0.959615235223452,0.940369116669687     CP03663
  1         2        NA          6 | CP095637,CP095613,CP095610,...                    0.95030569539458,0.9580272411544,1,...     CP09577
  1         2        NA          0 |                                                                                              CP14994
  1         2        NA          4 | CP098366,CP098372,CP098364,... 0.907055438339398,0.944695411074107,0.926726019673679,...     CP18390
  1         2        NA          2 |              CP099626,CP099629                       0.931409449088518,0.933731054268436     CP22749
  1         2        NA          6 | CP100947,CP100938,CP100932,... 0.965010642319786,0.972252507222655,0.934658151247491,...     CP27154
  1         2        NA          6 | CP101966,CP101965,CP101970,... 0.947087320548922,0.943259179989555,0.915003382163285,...     CP30683
lgatto commented 4 years ago

Weird - what's the output of traceback() after the error? Could you share the object?

ricoderks commented 4 years ago
> test <- as(swath_spectrum, "MSnExp")
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘msLevel’ for signature ‘"NULL"’
> traceback()
25: stop(gettextf("unable to find an inherited method for function %s for signature %s", 
        sQuote(fdef@generic), sQuote(cnames)), domain = NA)
24: (function (classes, fdef, mtable) 
    {
        methods <- .findInheritedMethods(classes, fdef, mtable)
        if (length(methods) == 1L) 
            return(methods[[1L]])
        else if (length(methods) == 0L) {
            cnames <- paste0("\"", vapply(classes, as.character, 
                ""), "\"", collapse = ", ")
            stop(gettextf("unable to find an inherited method for function %s for signature %s", 
                sQuote(fdef@generic), sQuote(cnames)), domain = NA)
        }
        else stop("Internal error in finding inherited methods; didn't return a unique method", 
            domain = NA)
    })(list("NULL"), new("standardGeneric", .Data = function (object, 
        ...) 
    standardGeneric("msLevel"), generic = "msLevel", package = "ProtGenerics", 
        group = list(), valueClass = character(0), signature = "object", 
        default = NULL, skeleton = (function (object, ...) 
        stop("invalid call in method dispatch to 'msLevel' (no default method)", 
            domain = NA))(object, ...)), <environment>)
23: FUN(X[[i]], ...)
22: lapply(X = X, FUN = FUN, ...)
21: sapply(spectra(object), msLevel)
20: sapply(spectra(object), msLevel)
19: .local(object, ...)
18: msLevel(object)
17: msLevel(object)
16: validityMethod(as(object, superClass))
15: isTRUE(x)
14: anyStrings(validityMethod(as(object, superClass)))
13: validObject(.Object)
12: .nextMethod(.Object, ...)
11: eval(call, callEnv)
10: eval(call, callEnv)
9: callNextMethod(.Object, ...)
8: .local(.Object, ...)
7: .nextMethod(.Object = .Object, ... = ...)
6: callNextMethod()
5: initialize(value, ...)
4: initialize(value, ...)
3: new("MSnExp", assayData = assaydata, phenoData = new("AnnotatedDataFrame", 
       pd), featureData = fd, processingData = process)
2: asMethod(object)
1: as(swath_spectrum, "MSnExp")
lgatto commented 4 years ago

I get the following error:

Error: invalid version specification ‘c(0, 4, 0)’, ‘c(0, 2, 0)’

I think you should be able to attach files to an issue, otherwise feel free to email me the serialised (rda or rds) object.

ricoderks commented 4 years ago

You can find the object here.

lgatto commented 4 years ago

The error comes from some of your extra mcols that can't be converted to simple vectors (they are Lists). The easiest way is to remove them:

> load("~/Downloads/swath_spectrum.RData")
> swath_spectrum
Spectra with 7 spectra and 3 metadata column(s):
      msLevel     rtime peaksCount |                    ms2_peak_id
    <integer> <numeric>  <integer> |                <CharacterList>
  1         2        NA          2 |              CP093961,CP093945
  1         2        NA          6 | CP095637,CP095613,CP095610,...
  1         2        NA          0 |                               
  1         2        NA          4 | CP098366,CP098372,CP098364,...
  1         2        NA          2 |              CP099626,CP099629
  1         2        NA          6 | CP100947,CP100938,CP100932,...
  1         2        NA          6 | CP101966,CP101965,CP101970,...
                      ms2_peak_cor     peak_id
                     <NumericList> <character>
  1              0.959615,0.940369     CP03663
  1 0.950306,0.958027,1.000000,...     CP09577
  1                                    CP14994
  1 0.907055,0.944695,0.926726,...     CP18390
  1              0.931409,0.933731     CP22749
  1 0.965011,0.972253,0.934658,...     CP27154
  1 0.947087,0.943259,0.915003,...     CP30683
> writeMgfData(swath_spectrum, "test.mgf")
Error in as.vector(x, mode) : 
  coercing an AtomicList object to an atomic vector is supported only for
  objects with top-level elements of length <= 1
> mcols(swath_spectrum) <- mcols(swath_spectrum)[, 3, drop = FALSE] ## keep only peak_id
> writeMgfData(swath_spectrum, "test.mgf")

Alternatively, if you need these List-type fields in your mgf, you can convert them with

> load("~/Downloads/swath_spectrum.RData")
> mcols(swath_spectrum)[[1]] <- sapply((mcols(swath_spectrum)[[1]]), paste, collapse = ", ")
> mcols(swath_spectrum)[[2]] <- sapply((mcols(swath_spectrum)[[2]]), paste, collapse = ", ")
> swath_spectrum
Spectra with 7 spectra and 3 metadata column(s):
      msLevel     rtime peaksCount |
    <integer> <numeric>  <integer> |
  1         2        NA          2 |
  1         2        NA          6 |
  1         2        NA          0 |
  1         2        NA          4 |
  1         2        NA          2 |
  1         2        NA          6 |
  1         2        NA          6 |
                                                   ms2_peak_id
                                                   <character>
  1                                         CP093961, CP093945
  1 CP095637, CP095613, CP095610, CP095611, CP095612, CP095608
  1                                                           
  1                     CP098366, CP098372, CP098364, CP098365
  1                                         CP099626, CP099629
  1 CP100947, CP100938, CP100932, CP100939, CP100925, CP100941
  1 CP101966, CP101965, CP101970, CP101971, CP101968, CP101962
                                                                                                        ms2_peak_cor
                                                                                                         <character>
  1                                                                             0.959615235223452, 0.940369116669687
  1                    0.95030569539458, 0.9580272411544, 1, 0.961132676502371, 0.918182738619123, 0.919250520096857
  1                                                                                                                 
  1                                       0.907055438339398, 0.944695411074107, 0.926726019673679, 0.918879159424874
  1                                                                             0.931409449088518, 0.933731054268436
  1 0.965010642319786, 0.972252507222655, 0.934658151247491, 0.973961915139738, 0.998127132120032, 0.974085315592034
  1 0.947087320548922, 0.943259179989555, 0.915003382163285, 0.974038251055826, 0.946201740058735, 0.906485244997194
        peak_id
    <character>
  1     CP03663
  1     CP09577
  1     CP14994
  1     CP18390
  1     CP22749
  1     CP27154
  1     CP30683

> writeMgfData(swath_spectrum, "test2.mgf")

This is something that could be fixed in writeMgfData,Spectra, but I don't think it will be possible right away. Pinging @jorainer to let him know.

ricoderks commented 4 years ago

Thanks for your quick reply and help! I was playing around in the code of readWriteMgfData.R and I added this:

setMethod("writeMgfData",
          signature = signature("Spectra"),
          function(object,
                   con = "spectrum.mgf",
                   COM = NULL,
                   TITLE = NULL) {
            writeMgfDataFile(as.list(object), con = con, COM = COM, TITLE = TITLE,
                             verbose = isMSnbaseVerbose())
          })

This made it work, but I'm not sure if this is a nice solution.

lgatto commented 4 years ago

Yes, what you propose ignores the extra fields (including those that generated the error), which is a good solution when these aren't needed.

ricoderks commented 4 years ago

I don't need them. Shall I send you a PR?

lgatto commented 4 years ago

No, thank you, because they might be needed, so we would want this to be optional. Ideally, we would also be able to check which columns can be converted and possibly paste those that can't, as I did above.
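
A minimal sketch of that idea, not part of MSnbase: the helper name flatten_list_columns is hypothetical, and it assumes that "can't be converted" means a column is either a base R list or an S4Vectors List subclass (such as CharacterList or NumericList). It is shown on a toy data.frame so that it runs without any Bioconductor packages.

```r
# Hypothetical helper (not part of MSnbase): collapse every list-type column
# of a data.frame-like table into one string per row; other columns are untouched.
flatten_list_columns <- function(tbl, sep = ", ") {
  for (nm in names(tbl)) {
    col <- tbl[[nm]]
    # Base R lists, plus S4Vectors "List" subclasses such as CharacterList
    # (is() simply returns FALSE if that class is not defined in the session).
    if (is.list(col) || methods::is(col, "List")) {
      tbl[[nm]] <- vapply(as.list(col),
                          function(x) paste(x, collapse = sep),
                          character(1))
    }
  }
  tbl
}

# Toy table mimicking two of the metadata columns shown above.
md <- data.frame(peak_id = c("CP03663", "CP14994"), stringsAsFactors = FALSE)
md$ms2_peak_id <- list(c("CP093961", "CP093945"), character(0))

flat <- flatten_list_columns(md)
flat$ms2_peak_id
```

On a real Spectra object, something like mcols(swath_spectrum) <- flatten_list_columns(mcols(swath_spectrum)) before calling writeMgfData() may work in the same way, but that has not been tested here.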

ricoderks commented 4 years ago

Ok, I'll use your solution to make the mgf files! Thanks again for the help and the great work on the packages!!

lgatto commented 4 years ago

You are welcome, and thank you for your interest.

stanstrup commented 3 years ago

This does not seem to be possible with the new Spectra objects from the Spectra package.

> as(spectras, "MSnExp")

Error in as(spectras, "MSnExp") : 
  no method or default for coercing "Spectra" to "MSnExp"