SATAY-LL / LaanLab-SATAY-DataAnalysis

This repository contains code and workflows for analyzing data from SATAY experiments.
Apache License 2.0

Using regression to estimate the probabilities for each gene to be essential or not given the SATAY data #20

Open leilaicruz opened 4 years ago

leilaicruz commented 4 years ago

See HERE the web visualization of the code :-)

leilaicruz commented 4 years ago

Go HERE to see the details of the Python program.

If we plot the reads and insertions per gene and highlight, based on published data, whether each gene is essential or not, we see this 👇 [image]

Since the essential and non-essential genes overlap in this plot (even after truncating the datasets and removing outliers), the regression model cannot assign any gene a probability of being essential higher than 0.5. [image]
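To make the setup concrete, here is a minimal sketch of that kind of fit. It assumes the regression is a logistic regression on two per-gene features, which this thread does not confirm. The data are synthetic (two overlapping Gaussian clouds standing in for standardized log reads and log insertions), and the names `fit_logistic` and `predict_proba` are made up for illustration, not taken from the repository.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Probability of the positive class (here: essential) for each row."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Synthetic stand-in data (an assumption, not SATAY measurements):
# essential genes get slightly lower reads/insertions, with heavy overlap.
random.seed(0)
n_essential, n_nonessential = 100, 400
X = [[random.gauss(-0.3, 1.0), random.gauss(-0.3, 1.0)] for _ in range(n_essential)]
X += [[random.gauss(0.3, 1.0), random.gauss(0.3, 1.0)] for _ in range(n_nonessential)]
y = [1] * n_essential + [0] * n_nonessential

w, b = fit_logistic(X, y)
probs = predict_proba(X, w, b)
print("max predicted probability:", max(probs))
print("genes above 0.5:", sum(p > 0.5 for p in probs))
```

When the classes overlap and essential genes are the minority, the fitted probabilities cluster around the base rate of essential genes, so a fixed 0.5 cutoff flags few genes or none.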

However, if we look more closely at the probabilities, we see that a threshold of 0.3 on the probability of being essential already captures 76% of all essential genes. [image]
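The 76% figure is the fraction of known essential genes whose predicted probability exceeds the threshold. A small sketch of that calculation is below. The probabilities and labels are toy values made up for illustration, and both helper names are hypothetical.

```python
def fraction_essential_above(probs, is_essential, threshold):
    """Fraction of known essential genes with predicted probability > threshold."""
    essential_probs = [p for p, e in zip(probs, is_essential) if e]
    if not essential_probs:
        raise ValueError("no essential genes in the input")
    return sum(p > threshold for p in essential_probs) / len(essential_probs)


def fraction_flagged_essential(probs, is_essential, threshold):
    """Fraction of genes above the threshold that are actually essential."""
    flagged = [e for p, e in zip(probs, is_essential) if p > threshold]
    if not flagged:
        raise ValueError("no gene exceeds the threshold")
    return sum(flagged) / len(flagged)


# Toy values (made up): predicted probabilities and published essentiality labels.
toy_probs = [0.45, 0.35, 0.31, 0.10, 0.40, 0.05, 0.20, 0.32]
toy_labels = [True, True, True, True, False, False, False, False]

captured = fraction_essential_above(toy_probs, toy_labels, 0.3)
purity = fraction_flagged_essential(toy_probs, toy_labels, 0.3)
print(captured)  # 0.75: three of the four essential genes exceed 0.3
print(purity)    # 0.6: three of the five flagged genes are essential
```

It is worth reporting the second number alongside the first, since lowering the threshold captures more essential genes but also lets more non-essential genes through.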