Identify the right CPTAC data for model fitting

meyer-lab / mechanismEncoder

Developing patient-specific phosphoproteomic models using mechanistic autoencoders


Identify the right CPTAC data for model fitting #2

Closed aarmey closed 3 years ago

sgosline commented 4 years ago

Current plan is to use the PTRC AML patient data (http://synapse.org/ptrc) for now, though I am happy to wrangle additional data. There are also two papers that did targeted measurements of protein abundance and phosphorylation in EGFR-MAPK signaling:

1- https://pubmed.ncbi.nlm.nih.gov/27405981/
2- https://pubmed.ncbi.nlm.nih.gov/29584399/

sgosline commented 4 years ago

Update: we are going to pull all the time-course data and do an all-vs-all (protein and phosphosite) correlation for CausalPath. First step: create a script that pulls all the data from Synapse.
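To make the all-vs-all step concrete, here is a minimal sketch of a pairwise Pearson correlation over tidy records. It only illustrates the idea: the feature names, time points, and values are made up, the helpers `pivot`, `pearson`, and `all_vs_all` are hypothetical, and it says nothing about the Synapse table layout or the input format CausalPath expects.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Synthetic tidy records: (time point, feature, log ratio). All values are made up.
records = [
    ("t0", "EGFR", 0.1), ("t1", "EGFR", 0.5), ("t2", "EGFR", 0.9), ("t3", "EGFR", 1.3),
    ("t0", "MAPK1_T185", 0.2), ("t1", "MAPK1_T185", 1.0),
    ("t2", "MAPK1_T185", 1.8), ("t3", "MAPK1_T185", 2.6),
    ("t0", "AKT1_S473", 1.0), ("t1", "AKT1_S473", 0.7),
    ("t2", "AKT1_S473", 0.4), ("t3", "AKT1_S473", 0.1),
]


def pivot(rows):
    """Turn tidy rows into {feature: {time point: value}}."""
    by_feature = defaultdict(dict)
    for time_point, feature, value in rows:
        by_feature[feature][time_point] = value
    return by_feature


def pearson(x, y):
    """Pearson correlation of two equal-length lists; None if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return None
    return cov / (sd_x * sd_y)


def all_vs_all(rows, min_shared=3):
    """Correlate every pair of features over the time points they share."""
    by_feature = pivot(rows)
    result = {}
    for f, g in combinations(sorted(by_feature), 2):
        shared = sorted(set(by_feature[f]) & set(by_feature[g]))
        if len(shared) < min_shared:
            continue  # too few shared measurements to say anything
        x = [by_feature[f][t] for t in shared]
        y = [by_feature[g][t] for t in shared]
        r = pearson(x, y)
        if r is not None:
            result[(f, g)] = r
    return result


correlations = all_vs_all(records)
for pair, r in sorted(correlations.items()):
    print(pair, round(r, 3))
```

Restricting each pair to its shared time points is one way to cope with missing values, which are common in phospho data; the `min_shared` cutoff of 3 is an arbitrary choice for the sketch.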

FFroehlich commented 4 years ago

@sgosline do you have an example of what the processed data will look like? It would be helpful for setting up the link between mechanistic model simulations and data.

sgosline commented 4 years ago

I haven't pulled the phospho data yet, just the bulk, so I can let you know. I gravitate toward tidied data frames. The AML data looks like this:

FFroehlich commented 4 years ago

Thanks! I think for the autoencoder part it would be great to also have some kind of normalized absolute measure for all samples, not just the fold changes.

sgosline commented 4 years ago

Yeah, that's what these are - the 'log ratio' is the comparison of the tumor sample to the instrument control, so is kind of as absolute as you can get...

FFroehlich commented 4 years ago

Oh I see, was misinterpreting the value then. Is this similar to a bridge sample?

aarmey commented 4 years ago

Exactly. There's a common sample run across all the batches to correct for run-to-run variation.
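As a rough illustration of why a log ratio against a common sample is comparable across batches, here is a small sketch with made-up intensities. The batch names, sample names, the `bridge` key, and the helper `log_ratios` are all hypothetical; this is not the actual PTRC processing pipeline.

```python
from math import log2

# Made-up raw intensities for one protein. Batch 2 reads about 2x higher overall,
# which stands in for run-to-run variation.
batches = {
    "batch1": {"bridge": 1000.0, "patientA": 2000.0, "patientB": 500.0},
    "batch2": {"bridge": 2000.0, "patientC": 4000.0, "patientD": 1000.0},
}


def log_ratios(batch_intensities, reference="bridge"):
    """Return log2(sample / reference) per sample, using each batch's own reference."""
    out = {}
    for intensities in batch_intensities.values():
        ref = intensities[reference]
        for sample, value in intensities.items():
            if sample != reference:
                out[sample] = log2(value / ref)
    return out


ratios = log_ratios(batches)
for sample, value in sorted(ratios.items()):
    print(sample, value)
```

In this toy case patientA and patientC have different raw intensities but the same log ratio, because the batch-level scaling cancels when each sample is divided by the bridge channel from its own run.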

sgosline commented 4 years ago

I found a tool that pulls data from PDC: https://github.com/PayneLab/cptac. I will work with that to do the phospho analysis.