Potential useful information to review and add to the readthedocs page - from learning unit 4

Quantitative Behavior of Mass Spectrometers

To apply the theory from the previous chapters of this unit to quantitative analyses with mass spectrometers, we first need to consider the limitations that arise from the physical processes behind the method. The parts crucial for quantification are separation, ionization and detection. After eluting from the column, the analyte is ionized, so that the number of ions of this analyte should be proportional to its concentration in the input sample. At the detector, a signal (ion current) is measured that is proportional to the number of ions arriving at it. However, this process has some limitations that make quantification a hard task:

Saturation: The detector has an upper limit for the ion current it can report. Too many ions hitting the detector at the same time result in a saturated signal. This limits the linear/dynamic range of the method.

Ionization efficiency: Some species are ionized more easily than others. This means that response factors differ among species, and signal intensities of different species cannot be compared absolutely without correction. The ionization efficiency of a molecule in ESI, for example, depends on factors such as the (non-)polarity and surface activity of the molecule.

Noise: The matrix competes with the analyte for ionization. We cannot be sure that the signal we are measuring comes from the analyte.

Another problem is that the signal for an analyte is "split" and recorded at different retention times and different m/z values:

Elution profiles: Not all of the analyte comes out of the column at the same time. Due to peak broadening, we record bell-shaped elution profiles instead of sharp peaks.

Isotope profiles: Because of different isotopic compositions, the analyte actually occurs as several peaks at different m/z values (so-called isotope profiles or ladders).

Charge states: During ionization, some analyte molecules may acquire a different charge than others. Their isotope ladders then occur at different positions on the m/z axis (roughly the mass divided by the respective charge). This means that we have to sum the signals from different regions of an LC-MS map.
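The summation over elution profile, isotope peaks and charge states can be sketched as follows. This is only a minimal illustration, assuming the peaks of one analyte have already been assigned; the tuple layout, the example values and the function names are hypothetical and not taken from the learning unit.

```python
from collections import defaultdict

# Hypothetical, already-assigned peaks of ONE analyte:
# (retention time in s, m/z, charge, intensity)
peaks = [
    (100.0, 500.77, 2, 1000.0),  # charge 2, monoisotopic peak
    (100.0, 501.27, 2, 600.0),   # charge 2, first isotope peak
    (101.0, 500.77, 2, 2000.0),  # same peaks one scan later
    (101.0, 501.27, 2, 1200.0),
    (100.0, 334.18, 3, 400.0),   # charge 3, monoisotopic peak
    (101.0, 334.18, 3, 800.0),
]


def total_intensity(peaks):
    """Sum the signal over all scans, isotope peaks and charge states."""
    return sum(intensity for _, _, _, intensity in peaks)


def intensity_per_charge(peaks):
    """Sum the signal separately for every charge state."""
    per_charge = defaultdict(float)
    for _, _, charge, intensity in peaks:
        per_charge[charge] += intensity
    return dict(per_charge)
```

Keeping the per-charge sums separate can be useful because, as discussed for deisotoping, combining charge states for absolute quantification needs some care.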

Deisotoping

Along the m/z axis, there is a similar problem as on the last page. Due to charge states and isotopes in the composition of the analytes, we will not observe a single peak in the MS spectrum. However, we can infer the charge, the mass, the average isotope composition and eventually the isotope profile step by step in the following way:

The distance between neighbouring (non-noise) peaks corresponds to one atom substituted with its heavier isotope. This increases the mass by one neutron, which appears as a distance of roughly 1/z on the m/z axis. So if we observe distances of around 0.33 Thomson (Th) between neighbouring isotopic peaks, the charge of the measured ion was most likely 3.

Once the charge is determined, the mass can easily be calculated by multiplying the observed m/z values by the charge (for protonated ions, the mass of the charge-carrying protons has to be subtracted as well).

When the mass is known, we can calculate the average isotope composition of an average amino acid with this mass and model the (still discretized) isotope pattern using a binomial distribution.

If we now convolve these peaks with Gaussians (whose standard deviations depend on the resolution of the instrument), we obtain the expected continuous isotope pattern and can select the relevant peaks that have to be summed to get the overall intensity for this particular species.

After finding an isotope profile, one has to keep in mind that other isotope profiles at different m/z values might correspond to the same species in a different charge state. However, because intensities depend on the charge of an ion on some devices, one has to be careful when combining them for absolute quantification.

The last pages have shown that we have to look in both the retention time and the m/z dimension for traces of intensity of a particular species. These two-dimensional patterns are called features. The use of features, and a more detailed explanation of how to find and evaluate them, will be given in learning units 5A and 5B.
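The first three steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the algorithm used by any particular tool: ions are assumed to be protonated, only 13C is considered as the heavy isotope (natural abundance of about 1.07 %), the number of carbon atoms is passed in by hand instead of being derived from an average amino acid, and the Gaussian convolution is left out. All function names are hypothetical.

```python
from math import comb

PROTON = 1.007276      # mass of a proton in Da
ISOTOPE_GAP = 1.00335  # approximate 13C - 12C mass difference in Da
P_13C = 0.0107         # approximate natural abundance of 13C


def infer_charge(mz_peaks):
    """Estimate the charge from the mean spacing of neighbouring isotope peaks."""
    gaps = [b - a for a, b in zip(mz_peaks, mz_peaks[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return round(ISOTOPE_GAP / mean_gap)


def neutral_mass(mz, charge):
    """Neutral mass of a protonated ion [M + zH]z+ observed at m/z."""
    return charge * (mz - PROTON)


def binomial_pattern(n_atoms, p_heavy, n_peaks):
    """Relative heights of the first n_peaks isotope peaks, assuming each of
    n_atoms atoms independently is the heavy isotope with probability p_heavy."""
    return [
        comb(n_atoms, k) * p_heavy**k * (1 - p_heavy) ** (n_atoms - k)
        for k in range(n_peaks)
    ]


# Peaks spaced by about 0.33 Th point to a charge of 3.
observed = [334.182, 334.516, 334.851]
charge = infer_charge(observed)
mass = neutral_mass(observed[0], charge)
pattern = binomial_pattern(n_atoms=50, p_heavy=P_13C, n_peaks=4)
```

A fuller model would combine the contributions of all elements and then broaden each discrete peak with a Gaussian whose width reflects the instrument resolution.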

Quantitative Data – MS1 Spectra

The simplest way of using MS1 spectra for quantitative proteomics is to load a peptide sample onto an LC column coupled to an MS instrument. For simplicity, assume that every sample (a patient, a given experimental condition, a time point in a time series, etc.) is run as only one LC-MS experiment, so no pre-analysis separation is performed. In MS1 spectra, different ionized species in the same spectrum result in different peaks. The peak of a peptide is usually found in several consecutive MS scans, depending on how long it takes the analyte to elute from the column (corresponding to the width of its chromatographic peak).

Comparing the intensities of different analytes in the same spectrum is not possible because they have different response factors. Peptides/metabolites that differ only by a stable isotope label will have identical response factors, so their intensities can be compared within the same spectrum. This is the basis for isotopic labeling.

Quantitative Data – LC-MS Maps

Numerous spectra are acquired at rates of up to dozens per second over the course of an LC-MS run. Ideally, it should be easy to identify corresponding spectra from different (sub)samples based on their retention time. However, the exact retention time of an analyte (and thus the occurrence of its chromatographic peak) may shift from run to run. An analyte can also occur in several spectra from the same sub-sample, as outlined above. The relative simplicity of comparing only one spectrum from each sample is therefore lost in this approach. A useful way to visualize a quantitative LC-MS experiment is to stack the spectra, yielding maps.
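Stacking spectra into a map can be illustrated with plain Python lists. The sketch below assumes that each spectrum is given as a retention time plus a list of (m/z, intensity) pairs and that a fixed m/z bin width is acceptable; the function name and the data layout are hypothetical.

```python
def build_map(spectra, mz_min, mz_max, bin_width):
    """Stack spectra into a 2D intensity grid.

    Rows are scans in retention time order, columns are m/z bins.
    Returns the sorted retention times and the grid.
    """
    n_bins = int((mz_max - mz_min) / bin_width) + 1
    rts, grid = [], []
    for rt, peaks in sorted(spectra, key=lambda spectrum: spectrum[0]):
        row = [0.0] * n_bins
        for mz, intensity in peaks:
            if mz_min <= mz <= mz_max:
                # Intensities falling into the same bin are added up.
                row[int((mz - mz_min) / bin_width)] += intensity
        rts.append(rt)
        grid.append(row)
    return rts, grid


# Two hypothetical scans, deliberately given out of retention time order.
spectra = [
    (2.0, [(400.2, 10.0), (401.5, 5.0)]),
    (1.0, [(400.7, 3.0), (400.9, 4.0)]),
]
rts, grid = build_map(spectra, mz_min=400.0, mz_max=402.0, bin_width=1.0)
```

The resulting grid can be passed to any heat map plotting routine, with retention time on one axis and m/z on the other.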

DDA

To produce tandem mass spectra, two common modes are established: data-dependent acquisition (DDA) and data-independent acquisition (DIA). In DDA, several ion species (i.e. m/z values) are picked from one survey scan (most commonly the most abundant ones) and analyzed further. For some time after the survey scan, these ions are selectively collected, each is passed to the collision chamber, and the product ions are analyzed.

With DDA, a broad coverage on the MS2 level can be accomplished, though the variability may be high. The number of ion species picked also gives the setting its commonly used name, generalized with the number n: Top-n acquisition. The higher n, the more fragmentations have to be conducted and the more time is consumed, during which newly eluting analytes might be missed. The lower n, the more low-abundance ions might remain unfragmented.
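The Top-n selection itself is easy to sketch. The example below assumes a survey scan given as a list of (m/z, intensity) pairs; the function name and the values are hypothetical, and real instruments apply further rules that are not modelled here.

```python
import heapq


def pick_top_n(survey_peaks, n):
    """Return the n most intense (m/z, intensity) peaks of a survey scan."""
    return heapq.nlargest(n, survey_peaks, key=lambda peak: peak[1])


survey = [(400.1, 50.0), (512.3, 900.0), (622.8, 300.0), (701.4, 20.0)]

selected = pick_top_n(survey, n=2)
# Everything that was not selected stays unfragmented in this cycle.
missed = [peak for peak in survey if peak not in selected]
```

With n=2, the two low-abundance ions end up in `missed`, which illustrates the trade-off described above.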

DIA

In DIA, tandem spectra are collected from the complete mass range of the analytes. In practice, the system is set to sequentially isolate and fragment consecutive mass windows of a certain width (say 10 Th). The overlap of fragment spectra and the unknown precursor mass of the fragments pose a nontrivial challenge to data analysis for identification.
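A sequential windowing scheme of this kind can be sketched as follows. The mass range of 400 to 450 Th is an arbitrary assumption for illustration, the windows are taken to be non-overlapping, and both function names are hypothetical.

```python
def dia_windows(mz_start, mz_end, width):
    """Sequential, non-overlapping isolation windows covering [mz_start, mz_end)."""
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        lower = upper
    return windows


def window_index(windows, precursor_mz):
    """Index of the window in which a precursor would be co-isolated, or None."""
    for i, (lower, upper) in enumerate(windows):
        if lower <= precursor_mz < upper:
            return i
    return None


windows = dia_windows(400.0, 450.0, 10.0)
```

All precursors falling into the same window are fragmented together, which is why the resulting fragment spectra overlap.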

Suppose we have a set of n samples, each containing a set of molecular components. The majority of the components are the same in each sample, and in our context the components are mainly peptides, but in some cases they are proteins. In quantitative proteomics, the task is to explore how the abundance of corresponding peptides varies from sample to sample. Due to the way the instruments detect the ions, relative quantification is the easiest form of quantification, meaning that, instead of the absolute concentration, we measure fold changes in the abundance of molecules between samples. The relative measurements are performed either within a sample or across samples, and both labeled and label-free methods exist.
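As a small worked example of relative quantification, the fold change between two samples can be computed per peptide. The peptide names and abundance values below are made up for illustration, and the function names are hypothetical.

```python
from math import log2


def fold_change(abundance_a, abundance_b):
    """Fold change of sample B relative to sample A."""
    return abundance_b / abundance_a


def log2_fold_change(abundance_a, abundance_b):
    """Symmetric version: +1 means doubled, -1 means halved."""
    return log2(abundance_b / abundance_a)


# Hypothetical abundances of corresponding peptides in two samples.
sample_a = {"peptide_1": 100.0, "peptide_2": 400.0}
sample_b = {"peptide_1": 400.0, "peptide_2": 100.0}

changes = {
    peptide: log2_fold_change(sample_a[peptide], sample_b[peptide])
    for peptide in sample_a
}
```

The logarithmic form is often preferred because up- and down-regulation of the same magnitude get the same absolute value.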

We will divide the discussion into label-free and label-based peptide methods. In this context, a label is simply something attached to the peptides of a sample to enable the distinction of this sample from a differently labeled or unlabeled sample. The labeling technologies can be naturally grouped into in vivo labeling and in vitro labeling. Isobaric labeling strategies such as iTRAQ and TMT will be introduced in detail in LU5C.

Label-free quantification is a method that aims to determine the relative amount of proteins in two or more biological samples. It may be based on precursor signal intensity or on spectral counting. The first method is useful when applied to high-precision mass spectra. In contrast, spectral counting simply counts the number of spectra identified for a given peptide in different biological samples and then integrates the results for all measured peptides of the protein(s) that are quantified. The computational framework includes detecting peptides, matching the corresponding peptides across multiple maps, and selecting discriminatory peptides.

MS1 or MS2?

Using label-free MS methods (i.e., without MS/MS) to quantify peptide samples will not give any indication of the identity of the components under analysis, but has a greater potential for discovering low-abundance molecules than MS/MS-based methods, because the instrument does not need to spend time in MS/MS mode. If interesting candidates are discovered, these can be selected for subsequent identification by MS/MS if suitable spectrometers are used.
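Spectral counting can be sketched with a simple counter. The example assumes that every identified MS/MS spectrum is available as a (sample, protein, peptide) record; the record layout, the names and the function are hypothetical.

```python
from collections import Counter

# One record per identified MS/MS spectrum (hypothetical identifications).
identified_spectra = [
    ("sample_A", "protein_1", "PEPTIDEA"),
    ("sample_A", "protein_1", "PEPTIDEA"),
    ("sample_A", "protein_1", "PEPTIDEB"),
    ("sample_B", "protein_1", "PEPTIDEA"),
    ("sample_B", "protein_2", "PEPTIDEC"),
]


def spectral_counts(records):
    """Count identified spectra per (sample, protein), summed over all peptides."""
    counts = Counter()
    for sample, protein, _peptide in records:
        counts[(sample, protein)] += 1
    return counts


counts = spectral_counts(identified_spectra)
```

Here `protein_1` is supported by three spectra in `sample_A` but only one in `sample_B`, which would be read as a higher relative abundance in `sample_A`.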

Labeling techniques

The idea of labeling techniques is to introduce a label in one sample and a different label (or no label) in another. Mixing the labeled samples allows relative quantification between two (or more) samples. Many labeling techniques exploit stable isotope labeling. Different isotopes of the same element behave chemically almost identically (the following isotope pairs are often used: 1H/2H, 12C/13C, 14N/15N, 16O/18O). Their masses differ, however, so the MS can distinguish them.

Advantages

Both samples are treated identically, so systematic errors affect them in the same way.
The data can easily be annotated manually (e.g., by looking for pairs of peaks).

Disadvantages

Labels can be expensive, difficult or unreliable to introduce.
Labeling in vivo is not always possible, and not all techniques support in vitro labeling.

Chemical labeling and metabolic labeling

1. Chemical labeling means that peptides are modified chemically after extraction. The label is usually attached covalently at specific functional groups (e.g., N-terminus, specific side chains). It does not involve a perturbation of the in vivo system. Labeling occurs late (during sample preparation) and thus does not account for variance introduced in the early steps. Examples: iTRAQ, TMT.

2. Metabolic labeling means that stable isotope labels are integrated by ‘feeding’ the organism with labeled metabolites, e.g., amino acids, nitrogen sources, glucose. Full incorporation of the label can take a while. It requires a perturbation of the in vivo system, depending on the size. It is quite expensive. Labeling occurs early in the study, which results in higher reproducibility.
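Looking for pairs of peaks, as mentioned under the advantages, amounts to searching for m/z values separated by the label mass shift divided by the charge. The sketch below assumes a known charge and a single label per peptide; the shift of 6.0201 Da corresponds to six 12C atoms replaced by 13C and is used purely as an illustration, and the function name is hypothetical.

```python
def find_label_pairs(mz_values, mass_shift, charge, tolerance=0.01):
    """Return (light, heavy) m/z pairs separated by mass_shift / charge."""
    expected = mass_shift / charge
    mzs = sorted(mz_values)
    pairs = []
    for i, light in enumerate(mzs):
        for heavy in mzs[i + 1:]:
            if abs((heavy - light) - expected) <= tolerance:
                pairs.append((light, heavy))
    return pairs


# Hypothetical doubly charged peaks; only the first two form a labeled pair.
peaks = [500.27, 503.28, 610.40]
pairs = find_label_pairs(peaks, mass_shift=6.0201, charge=2)
```

The intensity ratio of the two peaks in each pair would then give the relative abundance between the two samples.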

Applications

Quantitative proteome analysis, the global analysis of protein expression, is increasingly being used as a method to study steady-state and perturbation-induced changes in protein profiles. It helps to better understand the structure, function, and control of biological systems and processes.

Applications to systems biology

"Quantitative proteomics can be successfully used for characterizing alterations in protein abundance, finding novel protein-protein and protein-peptide interactions. Further, it can directly compare activation of entire signaling networks in response to individual stimuli and discover critical differences in their circuits that account for alterations of cell response." Aebersold, Ruedi, Beate Rist, and Steven P. Gygi. "Quantitative proteome analysis: methods and applications." Annals of the New York Academy of Sciences 919.1 (2000): 33-47.

SWATH (Sequential Windowed data independent Acquisition of the Total High-resolution Mass Spectra) acquisition was not listed in the graph above. It is a global quantitative strategy that is usually compared with the SRM/MRM approach mentioned on the previous page. The idea is to collect MS and MS/MS spectra at high resolution for every analyte in order to quantify everything in the sample. Unlike SRM, where each MS2 series is a record of one peptide across the LC run, the scan data of a SWATH acquisition are recorded independently of any particular peptide, and a complete fragment ion map of the sample is recorded in every cycle.

A special variety of DIA is SWATH (Sequential window acquisition of all theoretical mass spectra). It was first introduced for use with a Triple-TOF system. Here, the quadrupole sequentially isolates 25 Th precursor windows across a mass range of interest during the complete elution time of the coupled LC. The ions in these windows are fragmented and analysed.

iTRAQ

Isobaric tags for relative and absolute quantitation (iTRAQ) is a very commonly used isobaric labeling method in quantitative proteomics. It uses stable isotope labeled molecules that can be covalently bonded to the N-terminus and side chain amines of proteins.

Based on covalent modification of the N-terminus of peptides
Labeling is performed after digestion (also applicable to clinical samples)
Kits are available for 4 or 8 distinct labels (‘quadroplex’, ‘octoplex’)

Here is an example illustrating how the iTRAQ strategy works.

A, Isobaric tagging chemistry. The complete molecule consists of a reporter group (based on N-methylpiperazine), a mass balance group (carbonyl), and a peptide-reactive group (NHS ester). The overall mass of the reporter and balance components is kept constant using 13C, 15N, and 18O atoms (B). The number and position of enriched centers in the ring has no effect on chromatographic or MS behavior.

B, The reporter group ranges in mass from m/z 114.1 to 117.1, while the balance group ranges in mass from 28 to 31 Da, such that the combined mass remains constant (145.1 Da) for each of the four reagents. Following fragmentation of the tag amide bond, however, the balance (carbonyl) moiety is lost (neutral loss), while the charge is retained by the reporter group fragment. The numbers in parentheses indicate the number of enriched centers in each section of the molecule.

C, A mixture of four identical peptides appears as a single, unresolved precursor ion in MS (identical m/z). Following CID, the four reporter group ions appear as distinct masses (114–117 Da). All other sequence-informative fragment ions (b-, y-, etc.) remain isobaric, and their individual ion current signals (signal intensities) are additive.
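The mass bookkeeping in panel B and the use of the reporter ions for relative quantification can be checked with a short sketch. The reporter and balance masses are the ones quoted above; the reporter intensities and the function name are hypothetical.

```python
# Reporter m/z and matching balance group mass (Da) for the four reagents,
# as quoted in the figure description above.
REPORTER_MZ = [114.1, 115.1, 116.1, 117.1]
BALANCE_MASS = [31.0, 30.0, 29.0, 28.0]

# Each reporter/balance combination adds up to the same isobaric tag mass.
combined = [round(r + b, 1) for r, b in zip(REPORTER_MZ, BALANCE_MASS)]


def reporter_ratios(intensities, reference=114):
    """Relative abundance of every channel with respect to a reference channel."""
    ref = intensities[reference]
    return {channel: value / ref for channel, value in intensities.items()}


# Hypothetical reporter ion intensities read from one MS/MS spectrum.
observed = {114: 1000.0, 115: 2000.0, 116: 500.0, 117: 1000.0}
ratios = reporter_ratios(observed)
```

Because the four tags are isobaric, these ratios only become visible after fragmentation, when the reporter ions separate in the low mass region.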

Example: MS/MS spectrum of peptide TPHPALTEAK prepared by labeling four separate digests with each of the four isobaric reagents and combining the reaction mixtures in a 1:1:1:1 ratio. (i) isotopic distribution of the precursor ([M+H]+, m/z 1352.84) and the tandem spectrum. (ii) low mass region showing the signature ions used for quantitation, (iii) isotopic distribution of the b6 fragment ion, (iv) isotopic distribution of the y7 fragment ion. The peptide is labeled by isobaric tags at both the N terminus and C-terminal lysine side chain. The precursor ion and all the internal fragment ions (e.g. type b- and y-) therefore contain all four members of the tag set, but remain isobaric.

Correction and Normalization

iTRAQ reagents usually contain isotopic impurities. The intensity of each reporter ion peak therefore influences the intensities (areas) of adjacent peaks (+/- 2 nominal masses). To solve this, correction factors can be determined for each of the reporter ions by mass spectrometry of the individual reagents. The following table lists the percentage of nominal mass shifts in iTRAQ.
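One common way to apply such correction factors is to treat the observed reporter intensities as a linear mixture of the true ones and to solve the resulting system of equations. The sketch below follows that idea; the impurity fractions are HYPOTHETICAL placeholder values, not the factors from the table or from any reagent kit, and the function names are made up for this example.

```python
CHANNELS = [114, 115, 116, 117]

# HYPOTHETICAL impurity table: for each reagent, the fraction of its signal
# observed at -2, -1, 0, +1 and +2 nominal masses. Not real kit values.
SHIFTS = {
    114: [0.00, 0.01, 0.93, 0.06, 0.00],
    115: [0.00, 0.02, 0.92, 0.06, 0.00],
    116: [0.00, 0.03, 0.92, 0.05, 0.00],
    117: [0.00, 0.04, 0.92, 0.04, 0.00],
}


def mixing_matrix(shifts, channels):
    """matrix[i][j] = fraction of reagent j's signal observed in channel i."""
    n = len(channels)
    matrix = [[0.0] * n for _ in range(n)]
    for j, channel in enumerate(channels):
        for offset, fraction in zip(range(-2, 3), shifts[channel]):
            i = j + offset
            if 0 <= i < n:  # signal shifted outside 114-117 is lost
                matrix[i][j] = fraction
    return matrix


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


matrix = mixing_matrix(SHIFTS, CHANNELS)
```

Calling `solve(matrix, observed_intensities)` then returns an estimate of the impurity-corrected reporter intensities in channel order.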

iTRAQ and protein quantification

Peptide quantification does not imply protein quantification. Because a protein can occur in different isoforms, it is not trivial to translate peptide quantities into protein quantities. Peptides can only be mapped to so-called protein groups, i.e. sets of proteins that contain this peptide. For iTRAQ analysis, this means that some peptides do not yield usable information about different isoforms. Regression methods can be used to unravel some of this information. We will get back to this problem later when we discuss protein inference in lecture 9.
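The mapping of peptides to protein groups can be sketched with a dictionary keyed by the set of proteins a peptide occurs in. The peptide and isoform names are hypothetical, as are the function names; this is only meant to show why shared peptides cannot distinguish isoforms.

```python
def protein_groups(peptide_to_proteins):
    """Group peptides by the exact set of proteins they map to."""
    groups = {}
    for peptide, proteins in peptide_to_proteins.items():
        groups.setdefault(frozenset(proteins), []).append(peptide)
    return groups


def unique_peptides(peptide_to_proteins):
    """Peptides that map to exactly one protein and can distinguish isoforms."""
    return [
        peptide
        for peptide, proteins in peptide_to_proteins.items()
        if len(proteins) == 1
    ]


# Hypothetical mapping of identified peptides to two isoforms.
mapping = {
    "PEPTIDEA": ["isoform_1", "isoform_2"],
    "PEPTIDEB": ["isoform_1"],
    "PEPTIDEC": ["isoform_1", "isoform_2"],
}
groups = protein_groups(mapping)
```

In this example only `PEPTIDEB` carries information specific to `isoform_1`; the other two peptides report on the group as a whole.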

TMT

Besides iTRAQ, which was introduced on the previous page, tandem mass tags (TMT or TMTs) are another chemical labeling method using isobaric tags. The tags contain four regions: a mass reporter region (M), a cleavable linker region (F), a mass normalization region (N) and a protein reactive group (R). The chemical structures of all the tags are identical, but each contains isotopes substituted at various positions, such that the mass reporter and mass normalization regions have different molecular masses in each tag. The combined M-F-N-R regions of the tags have the same total molecular weight and structure, so that during chromatographic or electrophoretic separation and in single MS mode, molecules labeled with different tags are indistinguishable. Upon fragmentation in MS/MS mode, sequence information is obtained from fragmentation of the peptide backbone, and quantification data are simultaneously obtained from fragmentation of the tags, giving rise to mass reporter ions.

The structures of TMT tags are publicly available through the unimod database at unimod.org and hence, mass spectrometry software such as Mascot is able to account for the tag masses.

Thompson A, Schäfer J, Kuhn K et al. "Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS." Anal. Chem. 75.8 (2003): 1895–1904.

OpenMS / pyopenms-docs

Potential useful information to review and add to the readthedocs page - from learning unit 4 #367