Open Schaechtle opened 6 years ago
The loomcat
conversion is assuming that the loom project on disk was created using loomcat.initialize(state)
, in which case the loom column names are of the form c%05d
. This encoding scheme is needed since cgpm does not have a notion of natural language column names, only column numbers.
When the loom project on disk was created using BayesDB's loom_backend
class instead of cgpm's loomcat.initialize
method, the loom column names are the natural language column names since these are known to BayesDB.
The assumptions in the code should be further loosened by:
colname_colno_mapping
a required argument of loomcat._retrieve_featureid_to_cgpm
and loomcat._update_state
.loomcat.transition
and loomcat.transition_engine
to use the output mapping given by _generate_column_names
and pass this mapping into _update_state
.
This patch is needed since the column names I am getting from loom while calling
_update_state
(and_retrieve_featureid_to_cgpm
, specifically) don't follow the assumptions made in the code. In fact, the column names I am getting from loom are the actual natural language column names which seems correct (thus patching cgpm).I have a test case for this patch which depends on loom and bql -- I initially planned to include the test case here (i.e the cgpm repo). Looking at the test suite however, I am not sure anymore if this makes sense (not much loom and BQL around in
cgpm/test
); I am now planning to include this test in another repo focussed only on exporting models from loom. Open for suggestions here.Tested using the probcomp dev image as folllows:
Result: