Open djinnome opened 9 months ago
The current enzyme activity function generates enzymes with Inf values if there is no expression data. Resolving this issue will align the interface so that it handles missing rows, missing values, and missing columns.
I will help with this
Initial tests to build and pass:
inputs: dataframe, ncond x nvariables
outputs: pytensor tensor, ncond x nvariables
test: tensor_equal(example_tensor, make_observables(example_df))
Example data can come from Hackett or Wu et al.
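A rough sketch of what that first test could look like. Everything here is an assumption for illustration: NumPy stands in for pytensor, `np.array_equal` stands in for `tensor_equal`, and this `make_observables` is a hypothetical stand-in that takes the expected condition and variable labels as extra arguments, which the real function may not do. The data values are made up.

```python
import numpy as np
import pandas as pd


def make_observables(df, conditions, variables):
    """Hypothetical stand-in, not the real implementation.

    Aligns df to the full (conditions x variables) layout. Rows or columns
    that are absent from df come back filled with NaN.
    """
    aligned = df.reindex(index=conditions, columns=variables)
    return aligned.to_numpy(dtype=float)


# Full layout the model expects (assumed labels).
conditions = ["c1", "c2", "c3"]
variables = ["v1", "v2"]

# Example input with a missing row (c2) and a missing column (v2).
example_df = pd.DataFrame({"v1": [1.0, 2.0]}, index=["c1", "c3"])

# Expected ncond x nvariables result.
example_tensor = np.array(
    [
        [1.0, np.nan],
        [np.nan, np.nan],
        [2.0, np.nan],
    ]
)

result = make_observables(example_df, conditions, variables)

# equal_nan=True so that NaN placeholders compare as equal.
tensors_match = np.array_equal(result, example_tensor, equal_nan=True)
assert tensors_match
```

The same shape of test would apply to missing values inside an otherwise complete dataframe; only `example_df` and `example_tensor` change.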
@ShantMahserejian can we get a dataframe for each data type, where the column names are the conditions and the rows are the model ids for that data type? Each cell would be a float if the value was measured, Inf if it was not measured, and NaN if no measurement can be mapped to the model id (for example, reactions that are not enzyme-catalyzed).
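To make the proposed cell convention concrete, here is a minimal sketch of how the three cases could be told apart once such a dataframe is converted to an array. The values are invented and the variable names are hypothetical; only the float / Inf / NaN convention comes from the request above.

```python
import numpy as np

# Rows are model ids, columns are conditions (the layout requested above).
# Values are made up for illustration.
data = np.array(
    [
        [1.2, np.inf, 0.8],        # measured in conditions 1 and 3, not in 2
        [np.nan, np.nan, np.nan],  # no measurement maps to this model id
        [np.inf, 2.5, 3.1],        # not measured in condition 1
    ]
)

measured = np.isfinite(data)       # ordinary floats
not_measured = np.isposinf(data)   # Inf: could be measured, but was not
unmappable = np.isnan(data)        # NaN: nothing can map to this model id

# Model ids that have at least one mappable cell.
mappable_rows = ~unmappable.all(axis=1)
```

Note that the tests above describe an ncond x nvariables layout, so a frame in this layout would need a transpose (`df.T`) before being compared against that shape.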
All data sucks. Bayesian MCA naively assumes that the matrices of metabolite, flux, and enzyme observations have one entry for every row (metabolite) or column (reaction) of the stoichiometric matrix.
There is a tensor trick to reindex a tensor so that a smaller tensor can represent the observed data and its rows map to the rows of the full stoichiometric matrix, but only ChatGPT knows how this works.
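For what it is worth, one common way to do this kind of reindexing is integer-array indexing. This may or may not be the trick referred to above, so treat it as a sketch under assumptions: NumPy stands in for pytensor, and the sizes, index values, and names (`observed_idx`, `full_prediction`) are hypothetical.

```python
import numpy as np

n_reactions, n_cond = 5, 2

# Which rows of the full (n_reactions x n_cond) matrix were actually observed.
observed_idx = np.array([0, 3])

# Smaller tensor holding only the observed data: (n_observed x n_cond).
observed = np.array(
    [
        [1.0, 1.1],
        [0.5, 0.4],
    ]
)

# Direction 1: the model predicts every reaction, and only the observed rows
# are pulled out so they can be compared against the smaller data tensor.
full_prediction = np.arange(n_reactions * n_cond, dtype=float).reshape(
    n_reactions, n_cond
)
predicted_at_observed = full_prediction[observed_idx]

# Direction 2: scatter the observed rows into a full-size tensor, leaving a
# default value (here 1.0) for the rows that have no data.
full = np.ones((n_reactions, n_cond))
full[observed_idx] = observed
```

In pytensor the first direction should work with the same indexing syntax, and the second would go through `pytensor.tensor.set_subtensor` rather than in-place assignment, since symbolic tensors are not mutated in place.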