Open djinnome opened 8 months ago
CRISPRi of fadR, tyrR, and lacI.
convert UMIcounts to anndata format: https://anndata.readthedocs.io/en/latest/
from anndata -> t-SNE
The code to map gene indexes to gene names is here:
https://github.com/PNNL-CompBio/SERGIO/blob/MN_ConIndependence-/GRN_Analysis/ParseInteractions.ipynb
Compare the same 100 genes in MicroSPLIT and in SERGIO. In other words, filter the MicroSPLIT data to just the 100 genes in the SERGIO 100-gene DAG (including FadR) so that the SERGIO t-SNE can be compared to the MicroSPLIT t-SNE.
Genes 'rutA', 'rutB', 'rutE', 'rutD', 'rutC' are not detected in the MicroSPLIT data. Gene 'gmr' is named 'pdeR' now.
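A minimal sketch of the gene matching described above, assuming the SERGIO gene names and the MicroSPLIT gene names are each available as a plain list of strings. The function name match_genes and the RENAMED table are hypothetical and not part of either codebase; only the gmr -> pdeR rename comes from this thread.

```python
# Hypothetical helper: names here are illustrative, not from SERGIO or MicroSPLIT.
# Old gene name -> current gene name (only the rename reported in this thread).
RENAMED = {"gmr": "pdeR"}


def match_genes(sergio_genes, microsplit_genes, renamed=RENAMED):
    """Split the SERGIO gene list into genes found in MicroSPLIT and genes not found.

    Names are first translated through `renamed`, so an outdated SERGIO name
    can still match its current MicroSPLIT name. Order follows `sergio_genes`.
    """
    available = set(microsplit_genes)
    shared, missing = [], []
    for gene in sergio_genes:
        current = renamed.get(gene, gene)
        if current in available:
            shared.append(current)
        else:
            missing.append(current)
    return shared, missing


# Toy example with a handful of genes rather than the full 100-gene DAG.
sergio_genes = ["fadR", "gmr", "rutA", "lacI"]
microsplit_genes = ["lacI", "pdeR", "fadR", "tyrR"]
shared, missing = match_genes(sergio_genes, microsplit_genes)
print("shared:", shared)
print("missing:", missing)
```

If the MicroSPLIT counts are held in an AnnData object whose var_names are gene names, the shared list could then be used to subset it, e.g. adata[:, shared]. The missing list is worth keeping so the same genes can be dropped from the SERGIO side before the two t-SNEs are compared.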
So far, we can simulate the E. coli regulatory network using SERGIO in a "wild-type" scenario.
To simulate the E. coli regulatory network under a CRISPRa/i perturbation, we need to disconnect the target TF from its regulators, and either increase the production rate (CRISPRa) or decrease the production rate (CRISPRi).
Disconnecting a TF from its regulators essentially makes the TF a master regulator.
Since we have one input file for master regulators and a separate input file for regulators that are targets of other regulators, the requested action is to move the TF from input_file_targets to input_file_regs. The format of these two files is described here: https://github.com/PNNL-CompBio/SERGIO/blob/501c569a3541ae16457bfad8b206af07c9c9bb44/SERGIO/sergio.py#L132-L136
Since the format notes say "# 4- input_file_taregts should not contain any line for master regulators", we need to cut the line that refers to the target TF from input_file_targets. The line we add to input_file_regs only needs to provide the index of the target TF and a production rate, one for each "bin". The production rate should be small for CRISPRi and large for CRISPRa compared to the production rate of other TFs.
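The move could be sketched roughly as follows. This assumes both files are comma-separated, that the first column of every row is the gene index, and that each row of input_file_regs holds one production rate per bin, as in the format description linked above. The function name perturb_tf and the toy rows are hypothetical; this is not part of SERGIO.

```python
import csv
import io


def perturb_tf(target_rows, reg_rows, tf_index, production_rates):
    """Turn a regulated TF into a master regulator (hypothetical helper).

    target_rows: rows of input_file_targets (first column = target index).
    reg_rows:    rows of input_file_regs (index, then one rate per bin).
    Returns new (target_rows, reg_rows); the inputs are not modified.
    """
    def index_of(row):
        # Tolerates indices written either as "3" or as "3.0".
        return int(float(row[0]))

    # Cut the TF's own row. Rows where the TF acts as a regulator of other
    # genes are kept, so its downstream targets stay connected.
    kept = [row for row in target_rows if index_of(row) != tf_index]
    if len(kept) == len(target_rows):
        raise ValueError(f"gene {tf_index} has no row in the targets file")
    if any(index_of(row) == tf_index for row in reg_rows):
        raise ValueError(f"gene {tf_index} is already a master regulator")

    # One production rate per bin, matching the existing master regulators.
    if reg_rows and len(production_rates) != len(reg_rows[0]) - 1:
        raise ValueError("expected one production rate per bin")

    new_row = [str(tf_index)] + [str(rate) for rate in production_rates]
    return kept, reg_rows + [new_row]


# Toy network: gene 0 is a master regulator, gene 2 is a TF regulated by
# gene 0, and gene 3 is regulated by genes 0 and 2. Two bins.
targets_csv = "2,1,0,2.5,2\n3,2,0,2,1.5,-3.0,2,2\n"
regs_csv = "0,1.0,1.2\n"
target_rows = list(csv.reader(io.StringIO(targets_csv)))
reg_rows = list(csv.reader(io.StringIO(regs_csv)))

# CRISPRi-style perturbation of gene 2: low production rate in every bin.
new_targets, new_regs = perturb_tf(target_rows, reg_rows, 2, [0.05, 0.05])
print(new_targets)
print(new_regs)
```

The rate 0.05 is only a placeholder for "small relative to the other TFs". A CRISPRa run would pass a larger value instead. Writing the two returned row lists back out with csv.writer would give a perturbed pair of input files to pass to SERGIO.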