Closed noborus closed 2 years ago
Now that you can handle NULL, https://www.brimdata.io/blog/introducing-zq/ The above issues can also be aggregated using the SQL of trdsql.
zq example:
zq -j openlibrary.json | jq '[.docs[] |
{title,author_name: .author_name[0], publish_year: .publish_year[0]} |
select(.author_name!=null and .publish_year!=null)] | group_by(.author_name)|
[.[] | {author_name: .[0].author_name, count: . |
length}] | sort_by(.count) | reverse | limit(3;.[])'
trdsql example:
trdsql -ojsonl
"SELECT json(author_name)->>0 AS author_name, count(*) AS count
FROM openlibrary.json::.docs
WHERE author_name IS NOT NULL AND publish_year IS NOT NULL
GROUP BY json(author_name)->>0
ORDER BY count DESC LIMIT 3"
Unidode normalize is not yet in sqlite3. But PostgreSQL has it.
trdsql -ojsonl -driver postges -dsn "dbname=trdsql_db"
"SELECT json(normalize(author_name))->>0 AS author_name, count(*) AS count
FROM openlibrary.json::.docs
WHERE author_name IS NOT NULL AND publish_year IS NOT NULL
GROUP BY json(normalize(author_name))->>0
ORDER BY count DESC LIMIT 3"
{"author_name":"S. Stepniak","count":38}
{"author_name":"Władysław Stępniak","count":11}
{"author_name":"Andrzej Stępniak","count":4}
Changed JSON null to SQL null. Enabled to specify a character string that is NULL in the input format (-inull). Enabled to specify SQL NULL in the output format string (-onull). If
-inull
is specified in JSON, the specified character string will be NULL in addition to null. Formats that do not support null will default to an empty string ("").Resolve #182.