Closed simonw closed 3 years ago
I'm going to start by listing these 45 (currently) repos that are tagged datasette-io
and datasette-plugin
: https://github-to-sqlite.dogsheep.net/github/repos?_facet_array=topics&topics__arraycontains=datasette-io&topics__arraycontains=datasette-plugin
If I power this from a YAML file in this repository, I can add additional data. Nice things to have would be:
I can also join against releases
to be able to list plugins by most recently released.
Maybe also pull in PyPI download statistics?
Here's a query that highlights the most interesting available information about the plugins: https://github-to-sqlite.dogsheep.net/github?sql=select%0D%0A++id%2C%0D%0A++full_name%2C%0D%0A++description%2C%0D%0A++created_at%2C%0D%0A++updated_at%2C%0D%0A++stargazers_count%2C%0D%0A++topics%2C%0D%0A++open_issues%2C%0D%0A++watchers%2C%0D%0A++network_count%2C%0D%0A++subscribers_count%0D%0Afrom%0D%0A++repos%0D%0Awhere%0D%0A++rowid+in+%28%0D%0A++++select%0D%0A++++++repos.rowid%0D%0A++++from%0D%0A++++++repos%2C%0D%0A++++++json_each%28repos.topics%29+j%0D%0A++++where%0D%0A++++++j.value+%3D+%3Ap0%0D%0A++%29%0D%0A++and+rowid+in+%28%0D%0A++++select%0D%0A++++++repos.rowid%0D%0A++++from%0D%0A++++++repos%2C%0D%0A++++++json_each%28repos.topics%29+j%0D%0A++++where%0D%0A++++++j.value+%3D+%3Ap1%0D%0A++%29%0D%0Aorder+by%0D%0A++updated_at+desc%0D%0Alimit%0D%0A++101&p0=datasette-io&p1=datasette-plugin
I tried joining against the releases
table but it turns out that demo currently only pulls releases for the Dogsheep apps + Datasette and sqlite-utils: https://github-to-sqlite.dogsheep.net/github/releases?_facet=repo
I'm going to try implementing this this as a custom templated faceted table.
Fun trick: I'm going to leave the data in the github-to-sqlite
demo but I'll have a script that runs in datasette.io
which pulls the JSON results of the above query and builds those into a local database table.
Here's a really fun trick:
curl -s "https://github-to-sqlite.dogsheep.net/github.json?sql=$(echo 'select
id,
full_name,
description,
created_at,
updated_at,
stargazers_count,
topics,
open_issues,
watchers,
network_count,
subscribers_count
from
repos
where
rowid in (
select
repos.rowid
from
repos,
json_each(repos.topics) j
where
j.value = "datasette-io"
)
and rowid in (
select
repos.rowid
from
repos,
json_each(repos.topics) j
where
j.value = "datasette-plugin"
)
order by
updated_at desc
limit
101' | python3 -c 'import sys; import urllib.parse; print(urllib.parse.quote(sys.stdin.read()))')&_shape=array" | jq .
Better idea: pull the list of plugin repos from a query against that demo, then run the repos command locally with --readme and --readme-html followed by the releases command.
/db/plugins - I will definite it as a view
Another way to get a list of plugins (as newline separated):
curl -s 'https://github-to-sqlite.dogsheep.net/github.csv?sql=select%0D%0A++full_name%0D%0Afrom%0D%0A++repos%0D%0Awhere%0D%0A++rowid+in+(%0D%0A++++select%0D%0A++++++repos.rowid%0D%0A++++from%0D%0A++++++repos%2C%0D%0A++++++json_each(repos.topics)+j%0D%0A++++where%0D%0A++++++j.value+%3D+%3Ap0%0D%0A++)%0D%0A++and+rowid+in+(%0D%0A++++select%0D%0A++++++repos.rowid%0D%0A++++from%0D%0A++++++repos%2C%0D%0A++++++json_each(repos.topics)+j%0D%0A++++where%0D%0A++++++j.value+%3D+%3Ap1%0D%0A++)%0D%0Aorder+by%0D%0A++updated_at+desc%0D%0Alimit%0D%0A++101&p0=datasette-io&p1=datasette-plugin&_size=max' | tail -n +2
Or:
curl -s "https://github-to-sqlite.dogsheep.net/github.csv?sql=$(echo '
select
full_name
from
repos
where
rowid in (
select
repos.rowid
from
repos,
json_each(repos.topics) j
where
j.value = "datasette-io"
)
and rowid in (
select
repos.rowid
from
repos,
json_each(repos.topics) j
where
j.value = "datasette-plugin"
)
order by
updated_at desc
' | python3 -c \
'import sys; import urllib.parse; print(urllib.parse.quote(sys.stdin.read()))')" \
| tail -n +2
I'm tempted to try automating the screenshots of the plugins using puppeteer-cli, as seen in https://github.com/simonw/til
I definitely want to ensure that any repo that I create which I tag with datasette-plugin
and datasette-io
shows up in the plugin directory automatically, with no extra steps needed.
I'll have a separate, reviewed list of repos for plugins written by other people - probably just a newline-separated text file with the repository names in it.
I should include the plugin hooks used by each plugin. Can get these from datasette plugins
output after installing them.
New plan: I'm going to fetch these from the GitHub GraphQL API:
query {
search(query:"topic:datasette-io topic:datasette-plugin user:simonw" type:REPOSITORY, first:100) {
repositoryCount
nodes {
... on Repository {
id
nameWithOwner
homepageUrl
description
openGraphImageUrl
usesCustomOpenGraphImage
stargazerCount
forkCount
updatedAt
createdAt
pushedAt
openIssues: issues(states:[OPEN]) {
totalCount
}
closedIssues: issues(states:[CLOSED]) {
totalCount
}
releases(last: 1) {
totalCount
nodes {
id
name
tagName
description
descriptionHTML
shortDescriptionHTML
createdAt
publishedAt
}
}
}
}
}
}
https://developer.github.com/v4/explorer/
This lets me grab all of the plugins AND their most recent release in a single call.
Option for screenshots: set them as the open graph image for each repository, then retrieve them with openGraphImageUrl
and usesCustomOpenGraphImage
in the GraphQL query.
Those images can be 1280×640px which is about right for a screenshot.
I added an image to datasette-ripgrep to try that out: https://repository-images.githubusercontent.com/315796015/09e82680-3978-11eb-9828-94b780b592bb
Here's a script I wrote that uses the GitHub GraphQL API https://github.com/simonw/simonw/blob/main/build_readme.py
Next step: define a SQL view to use for the plugin directory itself, then build a custom template for displaying it.
Now live at https://preview.datasette.io/content/plugins
Turns out some of the plugins never had a GitHub release - I must have released them manually: https://preview.datasette.io/content?sql=select+*+from+plugins+where+not+exists+%28select+*+from+releases+where+releases.plugin+%3D+plugins.id%29
Here's a query that joins against releases and pulls in latest release date and latest release name: https://preview.datasette.io/content?sql=select%0D%0A++plugins.name%2C%0D%0A++plugins.description%2C%0D%0A++releases.tagName%2C%0D%0A++max%28releases.createdAt%29+as+latestReleaseAt%2C%0D%0A++homepageUrl%2C%0D%0A++topics%2C%0D%0A++nameWithOwner%2C%0D%0A++openGraphImageUrl%2C%0D%0A++usesCustomOpenGraphImage%2C%0D%0A++stargazerCount%2C%0D%0A++forkCount%2C%0D%0A++plugins.updatedAt%2C%0D%0A++plugins.createdAt%2C%0D%0A++plugins.pushedAt%2C%0D%0A++openIssueCount%2C%0D%0A++closedIssueCount%2C%0D%0A++releaseCount%0D%0Afrom%0D%0A++plugins%0D%0A++join+releases+on+plugins.id+%3D+releases.plugin%0D%0Agroup+by%0D%0A++plugins.id%0D%0Aorder+by%0D%0A++latestReleaseAt+desc
Here's the first live version: https://preview.datasette.io/plugins
Idea: import statistics from https://pypistats.org/packages/datasette-vega
API is here, returns last 180 days of daily downloads: https://pypistats.org/api/packages/datasette-vega/overall?mirrors=true
I could render these using sparklines: https://github.com/fnando/sparkline
It's good enough for launch. I'll add more screenshots at some point soon too.
This can be powered by a YAML file in the repo - but the build script can pull in latest release information too.