Closed KarinSchork closed 2 months ago
This would then be based on the PSM level. For each RAW file we would get a histogram of the measured mass versus the mass (or peptide) that has been assigned to it.
Here is a simple script that calculates the ppm error for each entry:
import csv

import numpy as np

# Monoisotopic mass of hydrogen in Da (assumed value; not defined in the original snippet)
HYDROGEN_MONO_MASS = 1.007825

# input_csv: path to the tab-separated search result file, expected to be set beforehand
with open(input_csv, "r") as inf:
    inf_csv = csv.reader(inf, delimiter="\t")
    # Get header
    headers = next(inf_csv)
    exp_mass_idx = headers.index("exp_mass")  # As in Comet
    clc_mass_idx = headers.index("calc_mass")  # As in Comet
    ppms = []
    for row in inf_csv:
        clc_mass = float(row[clc_mass_idx])
        exp_mass = float(row[exp_mass_idx])
        # Account for proton masses, which may need to be removed
        # because they are not encoded in the theoretical mass
        prot_mass_delta = [None] * 5
        for i in range(0, 5):  # Try 0 to 4 proton masses and select the best-fitting one
            prot_mass_delta[i] = abs(exp_mass - (clc_mass + i * HYDROGEN_MONO_MASS))
        exp_mass = exp_mass - np.argmin(prot_mass_delta) * HYDROGEN_MONO_MASS
        ppms.append(
            (exp_mass - clc_mass) * (1000000 / clc_mass)
        )
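To make the per-entry calculation easier to check in isolation, here is a minimal sketch of the same logic as a standalone helper. The function name `ppm_error`, the parameter `n_offsets`, and the value used for `HYDROGEN_MONO_MASS` are assumptions for illustration and are not part of the script above.

```python
# Assumed value in Da; the script in this thread does not show its definition.
HYDROGEN_MONO_MASS = 1.007825


def ppm_error(exp_mass, calc_mass, n_offsets=5):
    """Return the mass error in ppm.

    Before computing the error, the best-fitting number of hydrogen
    masses (0 .. n_offsets - 1) is subtracted from exp_mass.
    """
    # Pick the offset count that brings exp_mass closest to calc_mass.
    best = min(
        range(n_offsets),
        key=lambda i: abs(exp_mass - (calc_mass + i * HYDROGEN_MONO_MASS)),
    )
    corrected = exp_mass - best * HYDROGEN_MONO_MASS
    return (corrected - calc_mass) * 1_000_000 / calc_mass


# Hypothetical masses: a 0.001 Da deviation on a 1000 Da peptide is about 1 ppm,
# with or without two extra hydrogen masses on the measured value.
no_offset = ppm_error(1000.001, 1000.0)
two_offsets = ppm_error(1000.001 + 2 * HYDROGEN_MONO_MASS, 1000.0)
print(no_offset, two_offsets)
```

Both calls should give roughly 1 ppm, because the second measured mass differs from the first only by two hydrogen masses, which the helper removes.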
This is an example histogram plot on actual data.
@SvitiMPC suggested plotting boxplots or, better, violin plots, so that the samples can be compared directly.
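A rough sketch of how the violin plot suggestion could look, assuming the ppm values have already been collected as `(raw_file, ppm)` pairs. The function names, the output path, and the example data are hypothetical. The plotting part assumes matplotlib is installed and imports it inside the function, so the grouping can be used without it.

```python
from collections import defaultdict
from statistics import median


def group_ppms_by_file(records):
    """Group (raw_file, ppm) pairs into {raw_file: [ppm, ...]}."""
    grouped = defaultdict(list)
    for raw_file, ppm in records:
        grouped[raw_file].append(ppm)
    return dict(grouped)


def plot_ppm_violins(grouped, out_path="ppm_violins.png"):
    """Draw one violin per RAW file and save the figure (needs matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    labels = sorted(grouped)
    fig, ax = plt.subplots()
    ax.violinplot([grouped[name] for name in labels], showmedians=True)
    ax.set_xticks(list(range(1, len(labels) + 1)))
    ax.set_xticklabels(labels, rotation=45, ha="right")
    ax.axhline(0.0, linestyle="--", linewidth=0.8)  # reference line at 0 ppm
    ax.set_ylabel("mass error (ppm)")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)


# Hypothetical example data, not taken from real measurements.
records = [
    ("run_A.raw", -1.2), ("run_A.raw", -0.8), ("run_A.raw", -1.0),
    ("run_B.raw", 2.9), ("run_B.raw", 3.1), ("run_B.raw", 3.0),
]
grouped = group_ppms_by_file(records)
# The per-file median is one simple way to summarise a systematic shift.
shifts = {name: median(values) for name, values in grouped.items()}
```

Calling `plot_ppm_violins(grouped)` would then write one violin per RAW file next to each other, so a shift away from 0 ppm in a single file is visible at a glance.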
Add PPM Plot to maybe see a shift.