tabulate creates tables and columns for all fields in the schema, which makes it hard for users to understand scope and coverage.
I'm not sure how much of a priority tabulate is given that the registry will be available soon, but I wanted to make a note of the following Python script, which removes empty tables and columns from the database created by tabulate.
This script is for a SQLite database and uses the %sql line magic from ipython-sql:
# Get the table names from the database and put them in a Pandas dataframe
tables = %sql SELECT name FROM sqlite_master WHERE type = 'table';
for table in tables['name']:
# Check if the table contains any rows
table_check = %sql SELECT * FROM {table} limit 1;
# Drop empty tables
if len(table_check) == 0:
%sql DROP TABLE {table};
else:
# Get the column names from the table and put them in a Pandas dataframe
columns = %sql pragma table_info({table});
non_empty_columns = []
for column in columns['name']:
# Check if the column contains any data
column_check = %sql SELECT max({column}) as max from {table};
if column_check['max'][0] != None:
non_empty_columns.append(column)
cols = ', '.join(non_empty_columns)
# Recreate tables without empty columns (SQLite doesn't support DROP COLUMN)
%sql CREATE TABLE {table}_new ({cols});
%sql INSERT INTO {table}_new SELECT {cols} FROM {table};
%sql DROP TABLE {table};
%sql ALTER TABLE {table}_new RENAME TO {table};
Edit: This script could be optimized, either by using create view as select {{non_empty_cols}} from table, or by using one select statement per table to work out the empty fields, at the expense of more complex code to generate the select statement. However, for the purpose of a tutorial database with ~6,000 releases, this code was fast enough (3s).
tabulate
creates tables and columns for all fields in the schema, which makes it hard for users to understand scope and coverage.I'm not sure how much of a priority
tabulate
is given that the registry will be available soon, but I wanted to make a note of the following Python script, which removes empty tables and columns from the database created bytabulate
.This script is for a SQLite database and uses the
%sql
line magic from ipython-sql:Edit: This script could be optimized, either by using
create view as select {{non_empty_cols}} from table
, or by using one select statement per table to work out the empty fields, at the expense of more complex code to generate the select statement. However, for the purpose of a tutorial database with ~6,000 releases, this code was fast enough (3s).