simonw / sqlite-utils

Python CLI utility and library for manipulating SQLite databases
https://sqlite-utils.datasette.io
Apache License 2.0

New options for analyze-tables --common-limit --no-most and --no-least #544

Closed by simonw 1 year ago

simonw commented 1 year ago

The "least common" section is frequently uninteresting, especially for huge tables where a large number of values appear only once.

sqlite-utils analyze-tables content.db repos --common-limit 20 --no-least
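
To illustrate what these options control, here is a minimal sketch using only Python's standard-library sqlite3 module. It is not the sqlite-utils implementation: the common_values() helper, its parameter names, and the sample repos data are all hypothetical, chosen to mirror --common-limit, --no-most and --no-least.

```python
import sqlite3


def common_values(conn, table, column, common_limit=10, most=True, least=True):
    """Return (most_common, least_common) lists of (value, count) pairs.

    Hypothetical helper for illustration; a skipped section is returned as None.
    """
    # Identifiers are interpolated for brevity; never pass untrusted names here.
    sql = (
        'select "{c}", count(*) as n from "{t}" '
        'group by "{c}" order by n {direction}, "{c}" limit ?'
    )
    most_common = least_common = None
    if most:
        most_common = conn.execute(
            sql.format(c=column, t=table, direction="desc"), (common_limit,)
        ).fetchall()
    if least:
        least_common = conn.execute(
            sql.format(c=column, t=table, direction="asc"), (common_limit,)
        ).fetchall()
    return most_common, least_common


# Made-up sample data standing in for a real table
conn = sqlite3.connect(":memory:")
conn.execute("create table repos (language text)")
conn.executemany(
    "insert into repos values (?)",
    [("Python",)] * 3 + [("Go",)] * 2 + [("Rust",)],
)

# Roughly what --common-limit 2 --no-least asks for
most_common, least_common = common_values(
    conn, "repos", "language", common_limit=2, least=False
)
print(most_common)   # [('Python', 3), ('Go', 2)]
print(least_common)  # None
```

Skipping the "least common" query avoids a second GROUP BY pass over the column, which is where most of the uninteresting output on large tables comes from.
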

simonw commented 1 year ago

I generated the commit message in https://github.com/simonw/sqlite-utils/commit/1c1991b447a1ddd3d61d9d4a8a1d6a9da47ced20 using git diff | llm --system 'describe this change'.

simonw commented 1 year ago

New docs:

New help output:

 % sqlite-utils analyze-tables --help
Usage: sqlite-utils analyze-tables [OPTIONS] PATH [TABLES]...

  Analyze the columns in one or more tables

  Example:

      sqlite-utils analyze-tables data.db trees

Options:
  -c, --column TEXT       Specific columns to analyze
  --save                  Save results to _analyze_tables table
  --common-limit INTEGER  How many common values
  --no-most               Skip most common values
  --no-least              Skip least common values
  --load-extension TEXT   Path to SQLite extension, with optional :entrypoint
  -h, --help              Show this message and exit.
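
For readers unfamiliar with how flags like these are declared, the sketch below reproduces the three new options using argparse from the standard library. This is a stand-in only: sqlite-utils itself is built on Click, the program name analyze-tables-sketch is made up, and the default of 10 for --common-limit is an assumption that the help output above does not state.

```python
import argparse

# Stand-in parser; names and defaults here are assumptions for illustration
parser = argparse.ArgumentParser(prog="analyze-tables-sketch")
parser.add_argument("path")
parser.add_argument("tables", nargs="*")
parser.add_argument(
    "--common-limit", type=int, default=10, help="How many common values"
)
parser.add_argument(
    "--no-most", action="store_true", help="Skip most common values"
)
parser.add_argument(
    "--no-least", action="store_true", help="Skip least common values"
)

# Same arguments as the example invocation in the issue description
args = parser.parse_args(
    ["content.db", "repos", "--common-limit", "20", "--no-least"]
)
print(args.common_limit, args.no_most, args.no_least)  # 20 False True
```

Both --no-most and --no-least are plain boolean switches, so passing neither keeps the previous behavior of showing both sections.
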