greenelab / core-accessory-interactome

Investigating the functional relationship between P. aeruginosa core and accessory genes.
BSD 3-Clause "New" or "Revised" License

Update threshold for compendia and reprocess #26

Closed: ajlee21 closed this 3 years ago

ajlee21 commented 3 years ago

This PR updates the threshold used to partition the Pseudomonas gene expression data into PAO1 and PA14 compendia.

The threshold is based on the distribution of median accessory gene expression: we expect PAO1 samples to have high median expression of PAO1 accessory genes and zero median expression of PA14 accessory genes. To determine the threshold, we compare the distribution of median accessory gene expression in samples that SRA labels as PAO1 against samples labeled as non-PAO1.
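To make the partitioning rule concrete, here is a minimal sketch of how samples could be labeled from their median accessory gene expression. The gene names, the toy expression values, the threshold value, and the `label_samples` helper are all hypothetical; the exact rule used in the repository may differ.

```python
import pandas as pd

# Hypothetical toy data: rows are samples, columns are accessory genes.
expression = pd.DataFrame(
    {
        "pao1_acc_1": [50.0, 0.0, 30.0],
        "pao1_acc_2": [70.0, 0.0, 0.0],
        "pa14_acc_1": [0.0, 40.0, 0.0],
        "pa14_acc_2": [0.0, 60.0, 20.0],
    },
    index=["sample_A", "sample_B", "sample_C"],
)
pao1_accessory = ["pao1_acc_1", "pao1_acc_2"]
pa14_accessory = ["pa14_acc_1", "pa14_acc_2"]
THRESHOLD = 25.0  # placeholder value, not the threshold chosen in this PR


def label_samples(expression, pao1_genes, pa14_genes, threshold):
    """Assign each sample to PAO1, PA14, or neither (assumed rule)."""
    # Per-sample median over each strain's accessory genes.
    pao1_median = expression[pao1_genes].median(axis=1)
    pa14_median = expression[pa14_genes].median(axis=1)

    labels = pd.Series("unassigned", index=expression.index)
    # High median for one strain's accessory genes and zero for the other.
    labels[(pao1_median > threshold) & (pa14_median == 0)] = "PAO1"
    labels[(pa14_median > threshold) & (pao1_median == 0)] = "PA14"
    return labels


labels = label_samples(expression, pao1_accessory, pa14_accessory, THRESHOLD)
print(labels)
```

In this toy example, a sample with intermediate expression of both gene sets stays unassigned rather than being forced into either compendium.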

The main change is in 0_decide_threshold.ipynb, which explores where to place the threshold that separates PAO1 (grey) from non-PAO1 (blue) samples.

[image: PAO1 (grey) vs non-PAO1 (blue) samples]

For separating PA14 from non-PA14 samples: [image]
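The plots support choosing the threshold by eye. As a rough numeric counterpart, the sketch below counts how many samples a candidate threshold would place on the wrong side of their SRA label. The median values, the labels, the candidate thresholds, and the `misclassified` helper are hypothetical and are not taken from the notebook.

```python
import numpy as np


def misclassified(medians, is_pao1, threshold):
    """Count samples whose thresholded call disagrees with the SRA label."""
    predicted = medians > threshold
    return int(np.sum(predicted != is_pao1))


# Hypothetical median PAO1 accessory expression per sample.
pao1_median = np.array([60.0, 45.0, 80.0, 0.0, 5.0, 0.0])
# Hypothetical SRA annotation: True if the sample is labeled PAO1.
sra_is_pao1 = np.array([True, True, True, False, False, False])

candidates = [0.0, 10.0, 25.0, 50.0]
errors = {t: misclassified(pao1_median, sra_is_pao1, t) for t in candidates}

for threshold, n_wrong in errors.items():
    print(f"threshold={threshold}: {n_wrong} sample(s) disagree with SRA label")
```

The same scan could be repeated with PA14 accessory medians and PA14 labels. Because SRA annotations can themselves be wrong, disagreement counts like these are a guide rather than a ground truth.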

More details can be found in the README.