mpc-bioinformatics / McQuaC

Transform the Quality Control workflow from Knime into a workflow in Nextflow

Add PPM Plot #18

Closed KarinSchork closed 2 months ago

KarinSchork commented 1 year ago

Add a ppm plot (mass deviation in parts per million) to check for a possible systematic mass shift.

Luxxii commented 1 year ago

This would then be based on the PSM level. For each RAW file, we would get a histogram of the deviation between the measured mass and the mass of the peptide assigned to it.

Luxxii commented 1 year ago

Here is a simple script that calculates the ppm value for each entry:

    import csv

    import numpy as np

    # Monoisotopic mass of hydrogen in Da (value was not given in the original snippet)
    HYDROGEN_MONO_MASS = 1.007825

    # input_csv: path to the tab-separated PSM table, set by the caller
    with open(input_csv, "r", newline="") as inf:
        inf_csv = csv.reader(inf, delimiter="\t")

        # Get header
        headers = next(inf_csv)
        exp_mass_idx = headers.index("exp_mass")   # As in Comet
        clc_mass_idx = headers.index("calc_mass")  # As in Comet

        ppms = []

        for row in inf_csv:
            clc_mass = float(row[clc_mass_idx])
            exp_mass = float(row[exp_mass_idx])
            # The experimental mass may still contain proton masses that are not
            # encoded in the theoretical mass. Try offsets of 0 to 4 proton
            # masses and remove the best-fitting one.
            prot_mass_delta = [
                abs(exp_mass - (clc_mass + i * HYDROGEN_MONO_MASS)) for i in range(5)
            ]
            exp_mass -= np.argmin(prot_mass_delta) * HYDROGEN_MONO_MASS

            # Relative mass error in parts per million
            ppms.append((exp_mass - clc_mass) / clc_mass * 1_000_000)

Luxxii commented 1 year ago

[Image] This is an example histogram plot on actual data.
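
For reference, the binning behind such a histogram could look roughly like the sketch below. It uses only the standard library; the function name `ppm_histogram`, the `bin_width` parameter, and the example values are hypothetical and not taken from the McQuaC code base.

```python
import math
from collections import Counter


def ppm_histogram(ppms, bin_width=1.0):
    """Count ppm values per bin. Keys are the lower bin edges (hypothetical helper)."""
    counts = Counter()
    for ppm in ppms:
        # Assign each value to the bin [lower_edge, lower_edge + bin_width)
        lower_edge = math.floor(ppm / bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


# Made-up ppm values, for illustration only
example_ppms = [-2.4, -0.3, 0.2, 0.7, 1.1, 1.9]
hist = ppm_histogram(example_ppms, bin_width=1.0)
```

If most counts fall into bins well away from 0, that would hint at the kind of mass shift this issue is about.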

KarinSchork commented 1 month ago

@SvitiMPC suggested plotting boxplots or, better, violin plots, so that the samples can be compared directly.
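
To make that comparison concrete, here is a minimal sketch of the per-sample summary a boxplot would draw (minimum, quartiles, maximum). It assumes the ppm values are already grouped by sample; `ppm_summary`, the sample names, and the numbers are hypothetical. A violin plot would show the full distribution in addition to these values.

```python
import statistics


def ppm_summary(ppms_by_sample):
    """Five-number summary of the ppm values per sample (hypothetical helper).

    Each sample needs at least two values for statistics.quantiles.
    """
    summary = {}
    for sample, ppms in ppms_by_sample.items():
        q1, median, q3 = statistics.quantiles(ppms, n=4, method="inclusive")
        summary[sample] = {
            "min": min(ppms),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(ppms),
        }
    return summary


# Made-up ppm values for two samples, for illustration only
example = {
    "sample_A": [-1.0, 0.0, 1.0, 2.0, 3.0],
    "sample_B": [4.0, 5.0, 6.0, 7.0, 8.0],
}
summary = ppm_summary(example)
```

Comparing the medians across samples would then show directly whether one RAW file is shifted relative to the others.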