medbioinf / pia

:books: :microscope: PIA - Protein Inference Algorithms
https://github.com/medbioinf/pia

Data processing with MSI dataset #165

Closed xiweifan closed 10 months ago

xiweifan commented 2 years ago

Hi Julian,

I am wondering whether PIA can handle imaging MS data such as imzML. This format is used by the newer Bruker MALDI-TOF instruments, and there seems to be no way to convert an imzML file into another format that PIA can use.

Reference: https://ms-imaging.org/wp/

Thanks, Xiwei

julianu commented 2 years ago

Hej Xiwei,

The actual source of the identification data does not really matter for PIA, but if you cannot convert it into mzIdentML, could you tell me which format is usually used?

Anyway, to be honest, I am not aware of an imaging workflow that would need PIA processing. If you perform a separate LC-MS/MS run for identification, PIA should obviously work. But are there also ways to identify the proteins in one or all spatial "pixels" of the IMS experiment?

Best, Julian

xiweifan commented 2 years ago

Hi Julian,

Thank you for your timely reply! The usual data format for MSI is imzML. An imzML dataset consists of two files: .imzML (the XML metadata) and .ibd (the binary spectral data). This is the reference paper introducing the file format: https://link.springer.com/protocol/10.1007%2F978-1-60761-987-1_12.

And actually, there is huge potential for using PIA on MSI data, beyond generating images with ambiguous meaning. For instance, the spectra generated for a region of interest could be used with PIA, which would tell us about differential protein expression in certain areas. For every given area, Bruker SCiLS Lab generates an overall spectrum. However, the only export options in that software are imzML and CSV. The typical exported CSV file only contains the columns "m/z;intensities;variances;skyline".

Thanks, Xiwei
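For illustration, a minimal Python sketch of reading such a CSV export into a peak list. Only the four column names come from the comment above; the header row, the `#` comment lines, the example values, and the function name `read_overview_spectrum` are assumptions made for this sketch, not the documented SCiLS Lab layout.

```python
import csv
import io


def read_overview_spectrum(text):
    """Parse a semicolon-delimited export into (m/z, intensity) pairs.

    Assumption: the file has a header row with the columns
    "m/z;intensities;variances;skyline", and lines starting with '#'
    are comments. A real export may differ.
    """
    lines = [
        line
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter=";")
    # Keep only the two columns a peak list needs.
    return [(float(row["m/z"]), float(row["intensities"])) for row in reader]


# Hypothetical example content, not real instrument data.
example = """# hypothetical export
m/z;intensities;variances;skyline
1000.5;12.0;0.4;15.0
1001.5;3.5;0.1;4.0
"""

peaks = read_overview_spectrum(example)
print(peaks)
```

The result is a plain list of peaks, that is, spectrum data only.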

julianu commented 2 years ago

Hej Xiwei,

Yes, I know the imzML format, but these are all spectrum-only formats, if I am not mistaken. PIA, on the other hand, needs peptide (or rather spectrum) identifications to perform the inference. These are the results of a peptide search engine and are usually stored in mzIdentML or a vendor format (Mascot dat, ProteomeDiscoverer MSF, etc.). Do you have any of these for imaging data?

Best, Julian
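The distinction between spectrum-only and identification files can be checked from the XML root element. The following is a rough Python sketch that is not part of PIA: it relies on mzIdentML files having the root element `MzIdentML` and imzML files reusing the mzML root element `mzML`. The function names and return labels are made up for this example.

```python
import io
import xml.etree.ElementTree as ET


def root_local_name(source):
    """Return the root element name of an XML file without its namespace."""
    for _event, elem in ET.iterparse(source, events=("start",)):
        # The first "start" event belongs to the root element.
        return elem.tag.rsplit("}", 1)[-1]
    return None


def classify(source):
    """Roughly tell identification files apart from spectrum-only files."""
    name = root_local_name(source)
    if name == "MzIdentML":
        return "identifications"
    if name in ("mzML", "indexedmzML"):
        return "spectra only"
    return "unknown"


# Tiny stand-in documents; real files are far larger.
mzid = b'<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1"></MzIdentML>'
imzml = b'<mzML xmlns="http://psi.hupo.org/ms/mzml"></mzML>'

print(classify(io.BytesIO(mzid)))
print(classify(io.BytesIO(imzml)))
```

`classify` also accepts a file path, since `ET.iterparse` takes either a filename or a file object.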

xiweifan commented 2 years ago

Hi Julian,

I have no idea, but a peak list with intensities is definitely available. I assume that is enough to obtain PSMs using a decoy database?

Thanks, Xiwei

julianu commented 2 years ago

Hej Xiwei,

If these spectra are MS/MS spectra (on the UTX you needed the "LIFT" for this), you could identify them. But as you have no prior separation of peptides or proteins, you will get a mixture of all peptide fragments in one "pixel". You might be able to identify these in a similar way to DIA data, but as you have no chromatography at all, I guess it could get quite tricky. As I said, though, this is just my current knowledge, and I am not very familiar with IMS data. If you do get a file containing PSMs from MSI data, PIA should be able (or at least adaptable) to handle it.

Best, Julian

julianu commented 10 months ago

(closed due to inactivity)