Closed Jokendo-collab closed 2 years ago
Hi Javan! Firstly, I suggest you use either `Spectra` or `MSnbase`; mixing the two might be tricky, as not all functions work the same.
Could you please describe briefly what exactly you want to do?
Hi @jorainer,
I want to separate the identified and unidentified spectra. To give you a little background: we know that only ~20% of the MS/MS spectra get identified when we run sequence search engines such as MaxQuant. So I basically want to extract the unidentified spectra using the scan numbers from the evidence file (MaxQuant output) and the scan numbers from the raw files. From one raw file, I need one set of spectra with the scan numbers contained in the evidence file, and another set with the scan numbers that are not in the evidence file but are present in the original raw file. In other words, I want to split a single raw file based on the above information. I hope this makes sense?
I see. So that should be fairly simple with `Spectra`: assuming `sps` is a `Spectra` object that you read from a (single!) mzML file and `max_quant_ids` is an `integer` with the scan numbers from the MaxQuant output (i.e. the scan numbers from the *evidence file*).
## Optionally filter to MS2 spectra only - don't know if that's needed/required in your case
sps <- filterMsLevel(sps, 2L)
sps_ident <- sps[sps$scanIndex %in% max_quant_ids]
sps_noident <- sps[!sps$scanIndex %in% max_quant_ids]
`sps_ident` will be all the already identified MS2 spectra and `sps_noident` all the not identified MS2 spectra (you lost all MS1 spectra with the `filterMsLevel` step above).
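A quick sanity check on such a split can be sketched in base R. The vector `scan_index` below is a hypothetical stand-in for `sps$scanIndex`, and the values in `max_quant_ids` are made up for illustration; the point is only that the two subsets should add up to the full data set, and that scan numbers without a match are worth inspecting.

```r
## Toy stand-in for sps$scanIndex: five MS2 scans (hypothetical values)
scan_index <- 1:5
## Hypothetical scan numbers from the evidence file; 9 is not in this file
max_quant_ids <- c(2L, 4L, 9L)

## Same logical index that is used to subset the Spectra object
is_ident <- scan_index %in% max_quant_ids
n_ident <- sum(is_ident)
n_noident <- sum(!is_ident)

## Every spectrum must land in exactly one of the two subsets
stopifnot(n_ident + n_noident == length(scan_index))

## Scan numbers reported by MaxQuant that match no spectrum in the file
unmatched <- setdiff(max_quant_ids, scan_index)
```

If `unmatched` is not empty for real data, that may hint that the scan numbers refer to a different file or to a different numbering scheme.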
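You could then also export the spectra to an mzML file (note that `export` on a `Spectra` object needs the `backend` parameter to define the output format):
export(sps_noident, backend = MsBackendMzR(), file = "not-identified.mzML")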
The tricky thing will be to understand what the scan numbers from MaxQuant actually are, if they are the index of the spectrum in the mzML file or a number extracted from the spectrum ID. To explain:
- `scanIndex` is the index of the scan (spectrum) in the original mzML file. It will be a number from 1 to the total number of spectra.
- `acquisitionNum` is an `integer` ID extracted (by the proteowizard code within `mzR`) from the spectrum ID in the mzML file. This can be the same number as `scanIndex` but does not have to be. If the mzML file was, e.g., filtered by spectra beforehand, the numbers will be different.

Best would be to compare `sps$scanIndex` with `sps$acquisitionNum`: if they are the same there should be no problem. If they are different you should ensure that you pick the right one (based on what MaxQuant returns as a scan number).
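The comparison described above could look roughly like the following base R sketch. The data frame `spd` is a hypothetical stand-in for `spectraData(sps, c("scanIndex", "acquisitionNum"))`, and all numbers are invented. Counting how many MaxQuant scan numbers occur in each column is only a heuristic to help decide which column to use, not proof of what MaxQuant reports.

```r
## Toy stand-in for spectraData(sps, c("scanIndex", "acquisitionNum"));
## the values are hypothetical and mimic a file that was filtered before
spd <- data.frame(scanIndex = 1:4,
                  acquisitionNum = c(101L, 102L, 105L, 106L))
## Hypothetical scan numbers from the MaxQuant evidence file
max_quant_ids <- c(102L, 106L)

## Are both numbering schemes identical for this file?
same_numbering <- all(spd$scanIndex == spd$acquisitionNum)

## Fraction of MaxQuant scan numbers found in each column
hits <- c(scanIndex = mean(max_quant_ids %in% spd$scanIndex),
          acquisitionNum = mean(max_quant_ids %in% spd$acquisitionNum))

## Column that explains most of the MaxQuant scan numbers
best_column <- names(hits)[which.max(hits)]
```

In this toy case the two columns differ and only `acquisitionNum` contains the MaxQuant scan numbers, so that column would be the one to subset on.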
@jorainer this worked well for me. Thanks for the detailed response.
@jorainer I would like to calculate pairwise similarity between spectra and visualize. I used the following code but it gives an error:
fls = dir(".",pattern = "mzML$",full.names = TRUE)
sps_all = Spectra(fls,backend = MsBackendMzR())
cormat <- compareSpectra(sps, ppm = 20, FUN = ndotproduct)
hm <- pheatmap(cormat, cutree_rows = 3)
When I run the `compareSpectra` function I get the following error:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘compareSpectra’ for signature ‘"Spectra", "missing"’
Could you guide me on this? I would like to get a correlation plot like the one shown here
I guess you get the error because the `sps` object/variable is not defined. You load all spectra with `sps_all <- ...` but then you call `compareSpectra(sps, ...`, so there is some code missing where you filter/reduce your data set from `sps_all` to `sps` (a possibility could be to focus on only MS2 spectra using `sps <- filterMsLevel(sps_all, 2)`, or, even better, to filter based on the precursor m/z you're interested in).
Maybe also have a look at the Spectra tutorials for more/other examples.
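As a minimal sketch of the missing step, the example below builds a tiny `Spectra` object from made-up peak data (all m/z and intensity values are hypothetical) instead of reading mzML files, reduces it to MS2 spectra, and only then calls `compareSpectra`. It assumes the `Spectra` and `MsCoreUtils` packages are installed; `ndotproduct` is referenced through `MsCoreUtils::` in case that package is not attached.

```r
library(Spectra)

## Hypothetical toy data: one MS1 and three MS2 spectra
spd <- DataFrame(msLevel = c(1L, 2L, 2L, 2L),
                 precursorMz = c(NA, 200.1, 200.1, 350.2))
spd$mz <- list(c(100.1, 150.2),
               c(56.05, 84.08, 112.1),
               c(56.05, 84.08, 112.1),
               c(70.03, 98.06))
spd$intensity <- list(c(10, 20),
                      c(100, 50, 25),
                      c(90, 55, 20),
                      c(30, 60))
sps_all <- Spectra(spd)

## The step that was missing: derive `sps` from `sps_all`
sps <- filterMsLevel(sps_all, 2L)

## Pairwise similarity between all MS2 spectra (3 x 3 matrix here)
cormat <- compareSpectra(sps, ppm = 20, FUN = MsCoreUtils::ndotproduct)

## The matrix could then be passed to a heatmap function,
## e.g. pheatmap::pheatmap(cormat)
```

With 72 files the number of MS2 spectra can be large, and the similarity matrix grows quadratically with it, so restricting `sps` further (for example by precursor m/z) before calling `compareSpectra` is probably advisable.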
I have 72 mzML files and I have done the following:
#Reading the spectral data
files <- list.files(".",pattern = '*.mzML', full.names = T)
sps <- Spectra(files, backend = MsBackendMzR())
sps
df = spectraData(sps, columns = c("msLevel", "precScanNum", "scanIndex")) #extract the variables of interest
write.table(df, "scanNumbers.txt",sep = '\t')
#write the scan number dataframe
ddf = read.table("scanNumbers.txt",header = T,sep = '\t')
#using subset function to extract the scannumbers associated with MS1 and MS2
ms2 <- subset(ddf, msLevel == 2)
write.table(ms2, "msmsScanumbers.txt",sep = '\t')
ms1 <- subset(ddf, msLevel == 1)
write.table(ms1, "msScanumbers.txt",sep = '\t')
ms1 = read.table("msScanumbers.txt", header=T, sep='\t')
#create MSnbase object for filtering
msnexp <- readMSData(files)
msnexp
#filtering filterPrecursorScan
msms = filterPrecursorScan(object="msnexp", acquisitionNum = ms1$scannumber)
I now want to export the individual spectra to mzML files, but following the tutorial I have not been able to do that. Could you help in this regard?