simonw / datasette-edit-schema

Datasette plugin for modifying table schemas
Apache License 2.0
16 stars 0 forks source link

Idea: show some sample values next to each column #28

Closed simonw closed 1 year ago

simonw commented 3 years ago

To help people decide which column types to use, could show maybe the first five blank values from the table. Could be a tooltip.

simonw commented 3 years ago

Perhaps where these pink arrows are:

Edit_table_data_sfgov_org_api_views_88g8-5mnd_rows_csv_in_data
simonw commented 3 years ago

Showing them in a line below the column details would probably fit better.

simonw commented 1 year ago

To do this efficiently, I think we select the first 1000 rows ordered by primary key, then within those select the first 5 values that are not null or blank.

simonw commented 1 year ago
with rows as (
  select * from pypi_packages limit 1000
),
names as (
  select name from rows where ("name" is not null and "name" != "") limit 5
),
classifiers as (
  select distinct classifiers from rows where ("classifiers" is not null and "classifiers" != "") limit 5
)
select 'name', name from names union all select distinct 'classifiers', classifiers from classifiers

https://datasette.io/content?sql=with+rows+as+%28%0D%0A++select+*+from+pypi_packages+limit+1000%0D%0A%29%2C%0D%0Anames+as+%28%0D%0A++select+name+from+rows+where+%28%22name%22+is+not+null+and+%22name%22+%21%3D+%22%22%29+limit+5%0D%0A%29%2C%0D%0Aclassifiers+as+%28%0D%0A++select+distinct+classifiers+from+rows+where+%28%22classifiers%22+is+not+null+and+%22classifiers%22+%21%3D+%22%22%29+limit+5%0D%0A%29%0D%0Aselect+%27name%27%2C+name+from+names+union+all+select+distinct+%27classifiers%27%2C+classifiers+from+classifiers%0D%0A

simonw commented 1 year ago

Better query:

https://datasette.io/content?sql=with+rows+as+%28%0D%0A++select+*+from+pypi_packages+limit+1000%0D%0A%29%2C%0D%0Anames+as+%28%0D%0A++select+name+from+rows+where+%28%22name%22+is+not+null+and+%22name%22+%21%3D+%22%22%29+limit+5%0D%0A%29%2C%0D%0Aclassifiers+as+%28%0D%0A++select+distinct+classifiers+from+rows+where+%28%22classifiers%22+is+not+null+and+%22classifiers%22+%21%3D+%22%22%29+limit+5%0D%0A%29%2C%0D%0Astrings+as+%28%0D%0A++select+%27name%27+as+label%2C+name+as+value+from+names+union+all+select+distinct+%27classifiers%27%2C+classifiers+from+classifiers%0D%0A%29%0D%0Aselect+label%2C+json_group_array%28value%29+as+examples+from+strings+group+by+label

with rows as (
  select * from pypi_packages limit 1000
),
names as (
  select name from rows where ("name" is not null and "name" != "") limit 5
),
classifiers as (
  select distinct classifiers from rows where ("classifiers" is not null and "classifiers" != "") limit 5
),
strings as (
  select 'name' as label, name as value from names union all select distinct 'classifiers', classifiers from classifiers
)
select label, json_group_array(value) as examples from strings group by label
simonw commented 1 year ago

I should truncate that as well.

simonw commented 1 year ago
with rows as (
  select * from pypi_packages limit 1000
),
names as (
  select name from rows where ("name" is not null and "name" != "") limit 5
),
classifiers as (
  select distinct classifiers from rows where ("classifiers" is not null and "classifiers" != "") limit 5
),
strings as (
  select 'name' as label, name as value from names
  union all
  select distinct 'classifiers', classifiers from classifiers
),
truncated_strings as (
  select 
    label,
    case 
      when length(value) > 30 then substr(value, 1, 30) || '...'
      else value
    end as value
  from strings
)
select label, json_group_array(value) as examples from truncated_strings group by label;

https://datasette.io/content?sql=with+rows+as+%28%0D%0A++select+*+from+pypi_packages+limit+1000%0D%0A%29%2C%0D%0Anames+as+%28%0D%0A++select+name+from+rows+where+%28%22name%22+is+not+null+and+%22name%22+%21%3D+%22%22%29+limit+5%0D%0A%29%2C%0D%0Aclassifiers+as+%28%0D%0A++select+distinct+classifiers+from+rows+where+%28%22classifiers%22+is+not+null+and+%22classifiers%22+%21%3D+%22%22%29+limit+5%0D%0A%29%2C%0D%0Astrings+as+%28%0D%0A++select+%27name%27+as+label%2C+name+as+value+from+names%0D%0A++union+all%0D%0A++select+distinct+%27classifiers%27%2C+classifiers+from+classifiers%0D%0A%29%2C%0D%0Atruncated_strings+as+%28%0D%0A++select+%0D%0A++++label%2C%0D%0A++++case+%0D%0A++++++when+length%28value%29+%3E+30+then+substr%28value%2C+1%2C+30%29+%7C%7C+%27...%27%0D%0A++++++else+value%0D%0A++++end+as+value%0D%0A++from+strings%0D%0A%29%0D%0Aselect+label%2C+json_group_array%28value%29+as+examples+from+truncated_strings+group+by+label%3B

simonw commented 1 year ago

Got this working, it looks like this:

CleanShot 2023-09-08 at 15 51 19@2x