I refactored the CellOracle workflow and am currently testing it for robustness. The major changes are:
Cicero parameters can be specified in the config, including a precomputed dimensionality reduction for building the kNN graph used for aggregation.
The cluster key in mdata.obs needs to be specified in the config.
The r2g and tf2r scripts now only consider genes and TFs present in the starting MuData. This means you can subset it ahead of time, and only those genes and TFs will appear in any of the outputs.
The grn script no longer performs any of the steps required for CellOracle simulations. Instead, you specify a layer, and a bagging ridge regression is fit for each target gene using the TFs constrained by the r2g and tf2r steps.
The main outputs are r2g.csv, tf2r.csv and grn.csv, which are structured very similarly to before, with some minor edits to column names. These are stored in the uns as dictionaries with keys "r2g", "tf2r" and "grn" respectively.
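To make the grn step concrete, here is a minimal sketch of a per-target bagging ridge regression. It is not the code in the grn script: it is a hand-rolled NumPy version for illustration only. The function names (ridge_coefs, bagging_ridge_grn), the parameters (n_bags, alpha, seed) and the output keys ("tf", "gene", "coef_mean", "coef_std") are all assumptions, and the candidate_tfs mapping stands in for whatever the r2g and tf2r steps produce.

```python
import numpy as np


def ridge_coefs(X, y, alpha=1.0):
    """Closed-form ridge coefficients on centered data (intercept is not penalized)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    n_features = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)


def bagging_ridge_grn(expr, gene_names, candidate_tfs, n_bags=20, alpha=1.0, seed=0):
    """Fit one bagged ridge model per target gene (hypothetical helper).

    expr          : cells x genes array taken from the chosen layer
    gene_names    : column names of expr
    candidate_tfs : dict mapping target gene -> list of allowed TFs
    """
    rng = np.random.default_rng(seed)
    col = {g: i for i, g in enumerate(gene_names)}
    n_cells = expr.shape[0]
    rows = []
    for target, tfs in candidate_tfs.items():
        # Keep only TFs that are present in the data and are not the target itself.
        tfs = [tf for tf in tfs if tf in col and tf != target]
        if target not in col or not tfs:
            continue
        X = expr[:, [col[tf] for tf in tfs]]
        y = expr[:, col[target]]
        coefs = np.empty((n_bags, len(tfs)))
        for b in range(n_bags):
            # Bootstrap: resample cells with replacement, then refit the ridge model.
            idx = rng.integers(0, n_cells, size=n_cells)
            coefs[b] = ridge_coefs(X[idx], y[idx], alpha)
        for tf, mean, sd in zip(tfs, coefs.mean(axis=0), coefs.std(axis=0)):
            rows.append({"tf": tf, "gene": target,
                         "coef_mean": float(mean), "coef_std": float(sd)})
    return rows


# Toy data: G1 depends on TF1 only; "TF_missing" is absent and gets dropped.
toy_rng = np.random.default_rng(1)
genes = ["TF1", "TF2", "G1"]
expr = toy_rng.normal(size=(200, 3))
expr[:, 2] = 2.0 * expr[:, 0] + 0.1 * toy_rng.normal(size=200)
edges = bagging_ridge_grn(expr, genes, {"G1": ["TF1", "TF2", "TF_missing"]})
```

Averaging coefficients over bootstrap resamples gives a mean and a spread per TF-gene pair, which is one way to rank or filter edges. Dropping TFs that are missing from the data mirrors the behaviour described above, where only genes and TFs present in the starting MuData show up in the outputs.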