simonw / datasette.io

The official project website for Datasette
https://datasette.io
92 stars 21 forks source link

Plugin directory #20

Closed simonw closed 3 years ago

simonw commented 3 years ago

This can be powered by a YAML file in the repo - but the build script can pull in latest release information too.

simonw commented 3 years ago

I'm going to start by listing these 45 (currently) repos that are tagged datasette-io and datasette-plugin: https://github-to-sqlite.dogsheep.net/github/repos?_facet_array=topics&topics__arraycontains=datasette-io&topics__arraycontains=datasette-plugin

simonw commented 3 years ago

If I power this from a YAML file in this repository, I can add additional data. Nice things to have would be:

I can also join against releases to be able to list plugins by most recently released.

simonw commented 3 years ago

Maybe also pull in PyPI download statistics?

simonw commented 3 years ago

Here's a query that highlights the most interesting available information about the plugins: https://github-to-sqlite.dogsheep.net/github?sql=select%0D%0A++id%2C%0D%0A++full_name%2C%0D%0A++description%2C%0D%0A++created_at%2C%0D%0A++updated_at%2C%0D%0A++stargazers_count%2C%0D%0A++topics%2C%0D%0A++open_issues%2C%0D%0A++watchers%2C%0D%0A++network_count%2C%0D%0A++subscribers_count%0D%0Afrom%0D%0A++repos%0D%0Awhere%0D%0A++rowid+in+%28%0D%0A++++select%0D%0A++++++repos.rowid%0D%0A++++from%0D%0A++++++repos%2C%0D%0A++++++json_each%28repos.topics%29+j%0D%0A++++where%0D%0A++++++j.value+%3D+%3Ap0%0D%0A++%29%0D%0A++and+rowid+in+%28%0D%0A++++select%0D%0A++++++repos.rowid%0D%0A++++from%0D%0A++++++repos%2C%0D%0A++++++json_each%28repos.topics%29+j%0D%0A++++where%0D%0A++++++j.value+%3D+%3Ap1%0D%0A++%29%0D%0Aorder+by%0D%0A++updated_at+desc%0D%0Alimit%0D%0A++101&p0=datasette-io&p1=datasette-plugin

simonw commented 3 years ago

I tried joining against the releases table but it turns out that demo currently only pulls releases for the Dogsheep apps + Datasette and sqlite-utils: https://github-to-sqlite.dogsheep.net/github/releases?_facet=repo

simonw commented 3 years ago

I'm going to try implementing this this as a custom templated faceted table.

simonw commented 3 years ago

Fun trick: I'm going to leave the data in the github-to-sqlite demo but I'll have a script that runs in datasette.io which pulls the JSON results of the above query and builds those into a local database table.

simonw commented 3 years ago

Here's a really fun trick:

curl -s "https://github-to-sqlite.dogsheep.net/github.json?sql=$(echo 'select
  id,
  full_name,
  description,
  created_at,
  updated_at,
  stargazers_count,
  topics,
  open_issues,
  watchers,
  network_count,
  subscribers_count
from
  repos
where
  rowid in (
    select
      repos.rowid
    from
      repos,
      json_each(repos.topics) j
    where
      j.value = "datasette-io"
  )
  and rowid in (
    select
      repos.rowid
    from
      repos,
      json_each(repos.topics) j
    where
      j.value = "datasette-plugin"
  )
order by
  updated_at desc
limit
  101' | python3 -c 'import sys; import urllib.parse; print(urllib.parse.quote(sys.stdin.read()))')&_shape=array" | jq .
simonw commented 3 years ago

Better idea: pull the list of plugin repos from a query against that demo, then run the repos command locally with --readme and --readme-html followed by the releases command.

simonw commented 3 years ago

/db/plugins - I will definite it as a view

simonw commented 3 years ago

Another way to get a list of plugins (as newline separated):

curl -s 'https://github-to-sqlite.dogsheep.net/github.csv?sql=select%0D%0A++full_name%0D%0Afrom%0D%0A++repos%0D%0Awhere%0D%0A++rowid+in+(%0D%0A++++select%0D%0A++++++repos.rowid%0D%0A++++from%0D%0A++++++repos%2C%0D%0A++++++json_each(repos.topics)+j%0D%0A++++where%0D%0A++++++j.value+%3D+%3Ap0%0D%0A++)%0D%0A++and+rowid+in+(%0D%0A++++select%0D%0A++++++repos.rowid%0D%0A++++from%0D%0A++++++repos%2C%0D%0A++++++json_each(repos.topics)+j%0D%0A++++where%0D%0A++++++j.value+%3D+%3Ap1%0D%0A++)%0D%0Aorder+by%0D%0A++updated_at+desc%0D%0Alimit%0D%0A++101&p0=datasette-io&p1=datasette-plugin&_size=max' | tail -n +2         

Or:

curl -s "https://github-to-sqlite.dogsheep.net/github.csv?sql=$(echo '
select
  full_name
from
  repos
where
  rowid in (
    select
      repos.rowid
    from
      repos,
      json_each(repos.topics) j
    where
      j.value = "datasette-io"
  )
  and rowid in (
    select
      repos.rowid
    from
      repos,
      json_each(repos.topics) j
    where
      j.value = "datasette-plugin"
  )
order by
  updated_at desc
' | python3 -c \
  'import sys; import urllib.parse; print(urllib.parse.quote(sys.stdin.read()))')" \
  | tail -n +2
simonw commented 3 years ago

I'm tempted to try automating the screenshots of the plugins using puppeteer-cli, as seen in https://github.com/simonw/til

simonw commented 3 years ago

I definitely want to ensure that any repo that I create which I tag with datasette-plugin and datasette-io shows up in the plugin directory automatically, with no extra steps needed.

I'll have a separate, reviewed list of repos for plugins written by other people - probably just a newline-separated text file with the repository names in it.

simonw commented 3 years ago

I should include the plugin hooks used by each plugin. Can get these from datasette plugins output after installing them.

simonw commented 3 years ago

New plan: I'm going to fetch these from the GitHub GraphQL API:

query { 
  search(query:"topic:datasette-io topic:datasette-plugin user:simonw" type:REPOSITORY, first:100) {
    repositoryCount
    nodes {
      ... on Repository {
        id
        nameWithOwner
        homepageUrl
        description
        openGraphImageUrl
        usesCustomOpenGraphImage
        stargazerCount
        forkCount
        updatedAt
        createdAt
        pushedAt
        openIssues: issues(states:[OPEN]) {
          totalCount
        }
        closedIssues: issues(states:[CLOSED]) {
          totalCount
        }
        releases(last: 1) {
          totalCount
          nodes {
            id
            name
            tagName
            description
            descriptionHTML
            shortDescriptionHTML
            createdAt
            publishedAt
          }
        } 
      }
    }
  } 
}

https://developer.github.com/v4/explorer/

This lets me grab all of the plugins AND their most recent release in a single call.

simonw commented 3 years ago

Option for screenshots: set them as the open graph image for each repository, then retrieve them with openGraphImageUrl and usesCustomOpenGraphImage in the GraphQL query.

Those images can be 1280×640px which is about right for a screenshot.

simonw commented 3 years ago

I added an image to datasette-ripgrep to try that out: https://repository-images.githubusercontent.com/315796015/09e82680-3978-11eb-9828-94b780b592bb

simonw commented 3 years ago

Here's a script I wrote that uses the GitHub GraphQL API https://github.com/simonw/simonw/blob/main/build_readme.py

simonw commented 3 years ago

Next step: define a SQL view to use for the plugin directory itself, then build a custom template for displaying it.

simonw commented 3 years ago

Now live at https://preview.datasette.io/content/plugins

simonw commented 3 years ago

Turns out some of the plugins never had a GitHub release - I must have released them manually: https://preview.datasette.io/content?sql=select+*+from+plugins+where+not+exists+%28select+*+from+releases+where+releases.plugin+%3D+plugins.id%29

simonw commented 3 years ago

Here's a query that joins against releases and pulls in latest release date and latest release name: https://preview.datasette.io/content?sql=select%0D%0A++plugins.name%2C%0D%0A++plugins.description%2C%0D%0A++releases.tagName%2C%0D%0A++max%28releases.createdAt%29+as+latestReleaseAt%2C%0D%0A++homepageUrl%2C%0D%0A++topics%2C%0D%0A++nameWithOwner%2C%0D%0A++openGraphImageUrl%2C%0D%0A++usesCustomOpenGraphImage%2C%0D%0A++stargazerCount%2C%0D%0A++forkCount%2C%0D%0A++plugins.updatedAt%2C%0D%0A++plugins.createdAt%2C%0D%0A++plugins.pushedAt%2C%0D%0A++openIssueCount%2C%0D%0A++closedIssueCount%2C%0D%0A++releaseCount%0D%0Afrom%0D%0A++plugins%0D%0A++join+releases+on+plugins.id+%3D+releases.plugin%0D%0Agroup+by%0D%0A++plugins.id%0D%0Aorder+by%0D%0A++latestReleaseAt+desc

simonw commented 3 years ago

Here's the first live version: https://preview.datasette.io/plugins

Datasette_Plugins
simonw commented 3 years ago

Idea: import statistics from https://pypistats.org/packages/datasette-vega

API is here, returns last 180 days of daily downloads: https://pypistats.org/api/packages/datasette-vega/overall?mirrors=true

I could render these using sparklines: https://github.com/fnando/sparkline

simonw commented 3 years ago

It's good enough for launch. I'll add more screenshots at some point soon too.