probcomp / bayeslite

BayesDB on SQLite. A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data itself.
http://probcomp.csail.mit.edu/software/bayesdb
Apache License 2.0

Improve handling of reasons in GUESS SCHEMA #481

Closed curlette closed 7 years ago

curlette commented 8 years ago

Currently, GUESS SCHEMA encodes the reason for each stattype guess as a comment enclosed in triple quotes. This format assumes that the user will paste the schema into a triple-quoted string inside a bdb.execute() call to create a population. Because of the quotes, the schema text cannot be copied and pasted directly into an MML block in iVenture. There seems to be no way to include the reasons so that an MML cell in a Jupyter notebook ignores them on execution.

Two potential plans for improving this:

  1. Modify MML to offer two forms: GUESS SCHEMA FOR <table>, which omits the reasons, and GUESS SCHEMA FOR <table> WITH REASONS, which includes them.
  2. Make GUESS SCHEMA FOR <table> create a new table in the bdb with three columns: column, guessed_stattype, and reason. The user could then edit stattypes by modifying the table. We would need an additional MML command, such as PRINT SCHEMA STRING, to convert the table to a string that could be copied and pasted into a CREATE POPULATION command.
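
To make plan 2 concrete, here is a rough sketch using only Python's standard sqlite3 module, not bayeslite itself. The table name schema_guess and the helpers create_guess_table and schema_string are hypothetical, and the MODEL ... AS ... clause format is an assumption about what the pasted schema might look like, not verified MML output. The point is only that the reasons live in their own column, so the generated string contains no quoted comments.

```python
import sqlite3


def create_guess_table(conn, guesses, table="schema_guess"):
    """Store guesses as rows of (column, guessed_stattype, reason).

    `table` is a hypothetical name; a real implementation would pick
    whatever GUESS SCHEMA decides to call its output table.
    """
    conn.execute(
        f'CREATE TABLE "{table}" ('
        '"column" TEXT PRIMARY KEY, '
        "guessed_stattype TEXT NOT NULL, "
        "reason TEXT)"
    )
    conn.executemany(f'INSERT INTO "{table}" VALUES (?, ?, ?)', guesses)


def schema_string(conn, table="schema_guess"):
    """Stand-in for the proposed PRINT SCHEMA STRING command.

    Builds one clause per row and leaves the reasons out entirely.
    The clause syntax is an assumed, illustrative format.
    """
    rows = conn.execute(
        f'SELECT "column", guessed_stattype FROM "{table}" ORDER BY rowid'
    ).fetchall()
    clauses = [f'MODEL "{name}" AS {stattype.upper()}' for name, stattype in rows]
    return ";\n".join(clauses)
```

With this layout, overriding a guess is an ordinary UPDATE on the guessed_stattype column, and the reasons stay queryable with a plain SELECT without ever appearing in the text passed to CREATE POPULATION.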