cal-itp / data-infra

Cal-ITP data infrastructure
https://docs.calitp.org/data-infra
GNU Affero General Public License v3.0

Feature: function to print table documentation in calitp-py #523

Closed machow closed 2 years ago

machow commented 2 years ago

Currently, we store descriptions of tables and their columns in BigQuery. It would be helpful if analysts could...

This is pretty quick to implement, and will let us experiment with using it in documentation.

machow commented 2 years ago

TODO: flesh out with a bit more details w/ @Nkdiaz monday

For example, you can query a

from calitp.tables import tbl

tbl.views.reports_gtfs_schedule_index()

It would be useful if a user could print the table object itself (without calling it) and see helpful HTML docs

tbl.views.reports_gtfs_schedule_index

Steps

Pulling table documentation from sqlalchemy

# note this is the underlying sqlalchemy table
# it will hold table and column descriptions
table = tbl.views.reports_gtfs_schedule_index().tbl
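A minimal sketch of this step, assuming the underlying object behaves like a SQLAlchemy Table, which exposes a `comment` attribute on the table and on each of its columns. The `collect_docs` helper and the stand-in objects are hypothetical; they are only here so the example runs without a warehouse connection.

```python
from types import SimpleNamespace


def collect_docs(table):
    """Return (table_comment, [(column_name, column_comment), ...]).

    Assumes `table` exposes `.comment` and `.columns`, and that each column
    exposes `.name` and `.comment`, as SQLAlchemy Table/Column objects do.
    """
    columns = [(col.name, col.comment or "") for col in table.columns]
    return table.comment or "", columns


# Stand-in for a reflected table (hypothetical names and descriptions).
fake_table = SimpleNamespace(
    name="reports_gtfs_schedule_index",
    comment="Example table description",
    columns=[
        SimpleNamespace(name="col_a", comment="Example column description"),
        SimpleNamespace(name="col_b", comment=None),  # undocumented column
    ],
)

table_doc, column_docs = collect_docs(fake_table)
print(table_doc)
for name, comment in column_docs:
    print(f"{name}: {comment}")
```

Missing comments are normalized to empty strings so that whatever renders the docs later does not have to special-case `None`.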

Printing out helpful HTML docs

See https://ipython.readthedocs.io/en/stable/config/integrating.html#rich-display
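As a rough illustration of the rich display hook described at that link: IPython and Jupyter look for a `_repr_html_` method on the object that ends a cell and render the HTML it returns. The `TableDoc` class below is hypothetical and not part of calitp-py.

```python
import html


class TableDoc:
    """Hypothetical object that renders as HTML in a notebook."""

    def __init__(self, name, description):
        self.name = name
        self.description = description

    def __repr__(self):
        # Plain-text fallback used outside of rich frontends.
        return f"TableDoc({self.name!r})"

    def _repr_html_(self):
        # Escape the text so descriptions cannot inject markup.
        return (
            f"<h3>{html.escape(self.name)}</h3>"
            f"<p>{html.escape(self.description)}</p>"
        )


doc = TableDoc("reports_gtfs_schedule_index", "Example description <draft>")
print(repr(doc))
print(doc._repr_html_())
```

Keeping a plain `__repr__` alongside `_repr_html_` means the object still prints something sensible in a terminal session.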

It would be nice to have a display so that, when a user runs

tbl.views.reports_gtfs_schedule_index

it prints out

Edit: Replacing things like tbl.views.reports_gtfs_schedule_index with a class

Currently, we use a function factory to create functions like tbl.views.reports_gtfs_schedule_index, which return a LazyTbl when called. See this code...

https://github.com/cal-itp/calitp-py/blob/main/calitp/tables.py#L71-L73

Instead of returning a function, the factory should now return an instance of a class, like

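# assumption: LazyTbl is the siuba class already used in calitp/tables.py
from siuba.sql import LazyTbl


class WarehouseTable:
    def __init__(self, engine, table_name):
        self.engine = engine
        self.table_name = table_name

    def __call__(self):
        # calling the object keeps the old behavior: return the lazy table
        return LazyTbl(self.engine, self.table_name)

    def _repr_html_(self):
        # placeholder: build this from the table and column descriptions
        return "<TABLE DOC STRING>"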

You would create an instance of this class inside _table_factory
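One way the pieces could fit together, sketched under assumptions: `LazyTbl` is replaced by a stand-in so the code runs offline, `get_docs` is a hypothetical callable that returns a table description and its column comments, and `table_factory` only approximates what `_table_factory` does in calitp-py.

```python
import html
from types import SimpleNamespace


class FakeLazyTbl:
    """Stand-in for siuba's LazyTbl so the sketch runs offline."""

    def __init__(self, engine, table_name):
        self.engine = engine
        self.table_name = table_name


class WarehouseTable:
    def __init__(self, engine, table_name, get_docs):
        self.engine = engine
        self.table_name = table_name
        # Hypothetical lookup: table_name -> (description, {column: comment})
        self._get_docs = get_docs

    def __call__(self):
        # Calling the object still returns the lazy table, as before.
        return FakeLazyTbl(self.engine, self.table_name)

    def _repr_html_(self):
        description, columns = self._get_docs(self.table_name)
        rows = "".join(
            f"<tr><td>{html.escape(name)}</td><td>{html.escape(comment)}</td></tr>"
            for name, comment in columns.items()
        )
        return (
            f"<h3>{html.escape(self.table_name)}</h3>"
            f"<p>{html.escape(description)}</p>"
            f"<table><tr><th>column</th><th>description</th></tr>{rows}</table>"
        )


def table_factory(engine, table_names, get_docs):
    """Build a namespace whose attributes are WarehouseTable instances."""
    return SimpleNamespace(
        **{name: WarehouseTable(engine, name, get_docs) for name in table_names}
    )


def example_docs(table_name):
    # Hypothetical documentation lookup with made-up descriptions.
    return "Example table description", {"col_a": "Example column description"}


views = table_factory(None, ["reports_gtfs_schedule_index"], example_docs)
lazy = views.reports_gtfs_schedule_index()  # same call pattern as the old function
page = views.reports_gtfs_schedule_index._repr_html_()  # what a notebook would render
print(page)
```

Because the instance is callable, existing code that calls `tbl.views.reports_gtfs_schedule_index()` would keep working, while the bare attribute gains an HTML representation.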

How to access column comments

Notice that

image

To get table descriptions, use tbl.description

machow commented 2 years ago

addressed by https://github.com/cal-itp/calitp-py/pull/25