Open leilaicruz opened 4 years ago
This network configuration is heavily dependent on the choice of the gene from which we build our correlation.
In general, the workflow is:
corr_with_a_row = values2corr.corrwith(values2corr.iloc[X, :], axis=1, method='pearson')  # X: index of a random row
My idea to address this is to compute many correlations using a random sample of 20% of the genes in the population, and to explore how large the error bars are across all those datasets. That will indicate my uncertainty for a given network configuration.
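A minimal sketch of how this subsampling could look, under one possible reading of the idea: pick 20% of the genes at random as reference rows, correlate every gene with each of them, and use the spread across references as the error bar. It assumes `values2corr` has one gene per row; the function name `subsampled_correlations`, the seed and the toy data are made up for illustration and are not taken from the linked script.

```python
import numpy as np
import pandas as pd


def subsampled_correlations(values2corr, fraction=0.2, seed=0):
    """Correlate every gene (row) with each gene in a random subset of rows.

    Returns a DataFrame with one row per gene and one column per sampled
    reference gene, holding Pearson correlation coefficients.
    """
    rng = np.random.default_rng(seed)
    n_refs = max(1, int(round(fraction * len(values2corr))))
    ref_positions = rng.choice(len(values2corr), size=n_refs, replace=False)

    columns = {}
    for pos in ref_positions:
        reference = values2corr.iloc[pos, :]
        # Same call as in the workflow line, repeated for each reference row
        columns[values2corr.index[pos]] = values2corr.corrwith(
            reference, axis=1, method="pearson"
        )
    return pd.DataFrame(columns)


# Toy data (random numbers, illustration only): 10 genes x 6 conditions
toy_rng = np.random.default_rng(1)
toy = pd.DataFrame(
    toy_rng.normal(size=(10, 6)),
    index=[f"gene{i}" for i in range(10)],
    columns=[f"cond{j}" for j in range(6)],
)

corrs = subsampled_correlations(toy, fraction=0.2, seed=0)

# Spread of each gene's correlation across the sampled reference genes
summary = pd.DataFrame({"mean": corrs.mean(axis=1), "std": corrs.std(axis=1)})
print(summary)
```

A large `std` for a gene would suggest that its place in the network depends strongly on which reference gene was chosen.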
This issue provides a first prototype for generating an interaction network based on genes that are highly correlated with respect to their fitness scores. The data are from Benoit, on Dpl1 and the WT. The definition of the score is the same one Costanzo uses: e = f(ab) - f(a)f(b)
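As a small illustration of the score definition above, here is a sketch that evaluates it for a single gene pair. The function name `interaction_score` and the fitness values are made up for the example. It assumes f(a) and f(b) are the single-mutant fitnesses and f(ab) is the double-mutant fitness.

```python
def interaction_score(f_ab, f_a, f_b):
    """Score from the formula above: e = f(ab) - f(a) * f(b)."""
    return f_ab - f_a * f_b


# Made-up fitness values, for illustration only
f_a = 0.9   # single mutant a
f_b = 0.8   # single mutant b
f_ab = 0.5  # double mutant ab

e = interaction_score(f_ab, f_a, f_b)
print(f"e = {e:.2f}")  # prints "e = -0.22"
```

Here the double mutant is less fit than the product of the single-mutant fitnesses (0.72), so the score is negative. A score close to zero would mean no detectable interaction under this definition.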
So far I can reproduce some of the existing interactions with this code, so that is encouraging! The figure below shows whether the existing interactors of dpl1 are reproduced with this code. If a gene has a '-new' label, it means I assign it to this category but it is not annotated as such.
The code can be found here: https://github.com/Gregory94/LaanLab-SATAY-DataAnalysis/blob/dev_Leila/Python_scripts/corrrelation-based-networks/fitness-WT-mutant-from-reads.py
The goal is to test all the steps, from read normalization through fitness to the interaction maps, to see whether the output contains most of the existing data as a subset.
Feel free to pull it and play with it!