Closed michaelwitting closed 4 years ago
@michaelwitting thanks for reporting this bug!
MsBackendMgf
inherits this method from MsBackend
. A MWE:
d <- DataFrame(msLevel = 1L, scanIndex = 1L)
d$mz <- list(c(1:3))
d$intensity <- list(c(4:6))
r <- Spectra(d)
mz(dropNaSpectraVariables(r))
spectraVariables,Spectra
removes mz
and intensity
:
spectraVariables(r@backend)
# [1] "msLevel" "rtime"
# [3] "acquisitionNum" "scanIndex"
# [5] "mz" "intensity"
# [7] "dataStorage" "dataOrigin"
# [9] "centroided" "smoothed"
# [11] "polarity" "precScanNum"
# [13] "precursorMz" "precursorIntensity"
# [15] "precursorCharge" "collisionEnergy"
# [17] "isolationWindowLowerMz" "isolationWindowTargetMz"
# [19] "isolationWindowUpperMz"
spectraVariables(r)
# [1] "msLevel" "rtime"
# [3] "acquisitionNum" "scanIndex"
# [5] "dataStorage" "dataOrigin"
# [7] "centroided" "smoothed"
# [9] "polarity" "precScanNum"
# [11] "precursorMz" "precursorIntensity"
# [13] "precursorCharge" "collisionEnergy"
# [15] "isolationWindowLowerMz" "isolationWindowTargetMz"
# [17] "isolationWindowUpperMz"
So many methods that are relying on spectraVariables
could fail/be affected (spectraData
, dropNaSpectraVariables
, ...).
@jorainer Is there any reason why spectraVariables,MsBackendDataFrame
is different from spectraVariables,Spectra
?
Is there any reason why spectraVariables,MsBackendDataFrame is different from spectraVariables,Spectra?
spectraVariables,Spectra
returns all spectra variables except "mz"
and "intensity"
while the same for MsBackend
should return all of them. The reason: users will use the Spectra
object and might/will call spectraData,Spectra
to get multiple columns at once - but we might not want to return them a very heavy DataFrame
that contains also the m/z and intensity values. It would be still possible, by default they are however not returned. The spectraData,MsBackend
on the other hand should (by default) return all spectra variables. This was a design decision @lgatto and I made some time ago.
Possible solutions:
1) leave it as it is.
2) spectraVariables,Spectra
and spectraVariables,MsBackend
return all variables except m/z and intensity and hence spectraData
on Spectra
and MsBackend
would do the same. We would then have to ensure that methods like setBackend
do in fact get spectra data including m/z and intensity values from the backend (and I don't know in what other places we have to fix this then too).
Open to any alternative suggestion to make that cleaner/clearer.
The MRE above looks like a bug to me. From the documentation:
• ‘dropNaSpectraVariables’: removes spectra variables (i.e. columns in the object's ‘spectraData’ that contain only missing values (‘NA’). Note that while columns with only ‘NA’s are removed, a ‘spectraData’ call after ‘dropNaSpectraVariables’ might still show columns containing ‘NA’ values for _core_ spectra variables.
According to this, non-core columns that contain only NA values are dropped. mz
(irrespective of its core status, that might or might not apply depending on whether we consider the Spectra
object itself or its backend) isn't NA only and shouldn't be dropped.
What am I missing?
Yes, that the function drops mz
or intensity
is definitely a bug I will fix.
The other issue is that spectraVariables,Spectra
and spectraVariables,MsBackend
return different variables, which is indeed confusing. I will also look into that and try to fix.
The other issue is that
spectraVariables,Spectra
andspectraVariables,MsBackend
return different variables, which is indeed confusing. I will also look into that and try to fix.
It might be confusing from the outside, but I think it makes sense knowing that these object handle different parts of the data. I don't think this is a problem at all - it actually highlights the difference, which is an important design choice.
OK, but then I will highlight this even more in the documentation of the MsBackend
.
It's fixed in 0.99.7 (also pushed to BioC, so should be available there also within the next few days). Thanks @michaelwitting for reporting!
Thanks!
Thanks!
Dear RforMassSpectrometry-Team,
when reading MS2 spectra from .mgf or Massbank files the
dropNaSpectraVariables
removes values inmz
andintensity
. Any ideas?Best regards,
Michael
Example file attached
prx-5_1_1-A,4_01_35913.zip