Closed abdusco closed 1 year ago
Thanks for the bug report!
Despite the .md extension, those URLs do actually return valid images. The .md extension is part of the key for that row in the database. Here's what's supposed to happen:
https://til.simonwillison.net/-/media/screenshot/github-actions_cache-setup-py.md
Or in an image tag:
But it looks like for that entry the image URL is returning a 500 error:
https://til.simonwillison.net/-/media/screenshot/sqlite_multiple-indexes.md
I looked here: https://til.simonwillison.net/tils/til?path__exact=sqlite_multiple-indexes.md&_sort_desc=updated_utc - and it looks like it has a 0 length binary string in the shot column.
There are actually 7 rows that have 0 bytes for their image right now: https://til.simonwillison.net/tils/til?_where=length(shot)%20==%200
The 500 error is a bug in datasette-media: if content is 0 bytes long, it attempts to return a non-existent file instead:
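To illustrate the kind of check that can cause this, here is a minimal sketch. It is not the actual datasette-media code: the function names and the row shape are hypothetical, and I'm assuming the fallback is triggered by a truthiness test on the content value.

```python
def pick_source_buggy(row: dict) -> str:
    """Hypothetical: choose where to read the image from using truthiness."""
    # b"" is falsy in Python, so an empty blob is treated as "no content"
    # and we fall through to the file path branch.
    if row.get("content"):
        return "content"
    return "filepath"


def pick_source_checked(row: dict) -> str:
    """Hypothetical: only fall back to a file when content is truly absent."""
    if row.get("content") is not None:
        return "content"
    return "filepath"


empty_row = {"content": b""}
print(pick_source_buggy(empty_row))    # takes the file path branch
print(pick_source_checked(empty_row))  # stays on the content branch
```

Under that assumption, a row with a 0 byte blob and no file path ends up pointing at a file that doesn't exist, which would explain the 500.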
So the real bug here is why did those screenshots get generated as 0 byte images?
I've modified the generate_screenshots.py script to also regenerate any shots that are blank for whatever reason.
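Roughly, the decision about whether to retake a screenshot could look like the sketch below. This is not the real generate_screenshots.py: the function names, the "shot" and "shot_hash" keys, and the use of MD5 over the rendered HTML are all assumptions made for illustration.

```python
import hashlib


def compute_shot_hash(html: str) -> str:
    """Hypothetical: fingerprint the rendered content so unchanged pages are skipped."""
    return hashlib.md5(html.encode("utf-8")).hexdigest()


def needs_screenshot(row: dict, new_hash: str) -> bool:
    """Return True when a screenshot should be (re)generated for this row."""
    existing = row.get("shot")
    # Treat both a missing shot (None) and a blank one (b"") as needing a retake,
    # even if the stored hash still matches.
    if not existing:
        return True
    # Otherwise only retake when the content hash has changed.
    return row.get("shot_hash") != new_hash


current = compute_shot_hash("<h1>Example TIL</h1>")
print(needs_screenshot({"shot": b"", "shot_hash": current}, current))
print(needs_screenshot({"shot": b"\x89PNG", "shot_hash": current}, current))
```

The important part is that the blank check comes before the hash comparison, otherwise a 0 byte shot with a matching hash would be skipped forever.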
select
'https://til.simonwillison.net/' || topic || '/' || slug as url,
'https://til.simonwillison.net/-/media/screenshot/' || path as screenshot_url,
length(shot) as shot_length
from til where length(shot) == 0 order by updated_utc desc limit 101
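One detail worth knowing about that where clause: in SQLite, length() of NULL is NULL, so it only matches blobs that are present but empty. Here's a small self-contained sketch against an in-memory database. The table is a cut-down stand-in for the real til table, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the til table: just the key and the screenshot blob.
conn.execute("create table til (path text primary key, shot blob)")
conn.executemany(
    "insert into til (path, shot) values (?, ?)",
    [
        ("ok.md", b"\x89PNG fake bytes"),  # has image data
        ("blank.md", b""),                 # 0 byte blob, the case in this issue
        ("missing.md", None),              # no screenshot stored at all
    ],
)

# Matches only the empty blob; the NULL row is not returned.
blank = [
    row[0]
    for row in conn.execute(
        "select path from til where length(shot) = 0 order by path"
    )
]

# Matches both empty and missing screenshots.
blank_or_missing = [
    row[0]
    for row in conn.execute(
        "select path from til where shot is null or length(shot) = 0 order by path"
    )
]

print(blank)
print(blank_or_missing)
```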
This query will return the "after" set once the fix has gone out:
select
'https://til.simonwillison.net/' || topic || '/' || slug as url,
'https://til.simonwillison.net/-/media/screenshot/' || path as screenshot_url,
length(shot) as shot_length
from
til
where
path in (
'github_github-pages.md',
'googlecloud_gcloud-error-workaround.md',
'github_github-code-search-api-uses.md',
'gpt3_reformatting-text-with-copilot.md',
'pytest_show-files-opened-by-tests.md',
'spatialite_viewing-geopackage-data-with-spatialite-and-datasette.md',
'sqlite_multiple-indexes.md'
)
That should have fixed it, but it didn't - here's the bit of the workflow where it failed:
https://github.com/simonw/til/actions/runs/3688947164/jobs/6244340939
Got 0 byte PNG for github_github-pages.md shot hash e54cea9896f7a466d6e5703ce107b2c5
Skipped mastodon_custom-domain-mastodon.md with shot hash ce0d0a68f4f0d7c51a91218ff2164456
Skipped mastodon_export-timeline-to-sqlite.md with shot hash e32ea7e0b44d57dd308e7fe3b2c8756a
Skipped gpt3_open-api.md with shot hash aec4a0f370d664069b942c6865274af8
Skipped json_json-pointer.md with shot hash 60052a94919658a34dc4ec7c02a644cc
Skipped gpt3_writing-test-with-copilot.md with shot hash 1cac30332578aaad53d5cd0a182a4e9b
Skipped html_datalist.md with shot hash 02acf7f812bf857fc830eb7f6089a7f4
Skipped git_git-archive.md with shot hash 98b0f5b70ef13976621959e7f7f50c4b
Skipped mastodon_verifying-github-on-mastodon.md with shot hash afe6e349ca47854f146c8ef1e1f13d66
Skipped observable-plot_wider-tooltip-areas.md with shot hash 0871f9336adcebf0ec24da5f9373fed9
Skipped datasette_cli-tool-that-is-also-a-plugin.md with shot hash cc42d4afa9ad1cb25ebeedcba62b70bd
Skipped html_lazy-loading-images.md with shot hash 052959fdc6df4e0f75c906f7115bc847
Skipped github-actions_cache-setup-py.md with shot hash b49a840f36dcf80a29a6f71227e0a753
Skipped docker_pipenv-and-docker.md with shot hash 8ef776fcdce11807608c7892b27a5e3c
Got 0 byte PNG for googlecloud_gcloud-error-workaround.md shot hash 747e75e8be74e58e42482751143e4f90
Got 0 byte PNG for github_github-code-search-api-uses.md shot hash 269f456f68266291edb4a7e354fbd961
Got 0 byte PNG for gpt3_reformatting-text-with-copilot.md shot hash 387afeea2ed9b02c5a9fb8a10745fd3d
Got 0 byte PNG for pytest_show-files-opened-by-tests.md shot hash 80495a72ee192edf259b462489e0c933
Got 0 byte PNG for spatialite_viewing-geopackage-data-with-spatialite-and-datasette.md shot hash 02b4ec6523a8611d50e60121f2f8160c
Got 0 byte PNG for sqlite_multiple-indexes.md shot hash 42b8a982ad6ca28674ce61212dcce57d
It's the same 7 images again.
The weird thing is that those screenshots generate just fine when I run that script on my laptop.
I'm going to switch to https://shot-scraper.datasette.io/ for screenshots and see if that helps
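A minimal sketch of what calling shot-scraper from a Python script could look like, assuming the shot-scraper CLI is installed and on the PATH. The helper names and the 800x400 dimensions are placeholders I picked for the example, not what the real script uses.

```python
import subprocess
from pathlib import Path


def build_command(url: str, output: Path, width: int = 800, height: int = 400) -> list:
    """Build the shot-scraper invocation as an argument list (no shell needed)."""
    return [
        "shot-scraper",
        url,
        "-o", str(output),
        "--width", str(width),
        "--height", str(height),
    ]


def take_screenshot(url: str, output: Path) -> int:
    """Run shot-scraper and return the PNG size in bytes, refusing 0 byte files."""
    subprocess.run(build_command(url, output), check=True)
    size = output.stat().st_size
    if size == 0:
        # Fail loudly rather than storing a blank image in the database.
        raise RuntimeError(f"Got 0 byte PNG for {url}")
    return size
```

Checking the file size straight after the run means a blank screenshot fails the workflow instead of quietly ending up in the shot column.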
Looks like shot-scraper worked in the GitHub Actions run:
Skipped python_pdb-interact.md with shot hash 3de66a55d705fdadd6fab5b7d1dc1ff0
Got 65373 byte PNG for github_github-pages.md shot hash e54cea9896f7a466d6e5703ce107b2c5
Skipped mastodon_custom-domain-mastodon.md with shot hash ce0d0a68f4f0d7c51a91218ff2164456
Skipped mastodon_export-timeline-to-sqlite.md with shot hash e32ea7e0b44d57dd308e7fe3b2c8756a
Skipped gpt3_open-api.md with shot hash aec4a0f370d664069b942c6865274af8
Skipped json_json-pointer.md with shot hash 60052a94919658a34dc4ec7c02a644cc
Skipped gpt3_writing-test-with-copilot.md with shot hash 1cac30332578aaad53d5cd0a182a4e9b
Skipped html_datalist.md with shot hash 02acf7f812bf857fc830eb7f6089a7f4
Skipped git_git-archive.md with shot hash 98b0f5b70ef13976621959e7f7f50c4b
Skipped mastodon_verifying-github-on-mastodon.md with shot hash afe6e349ca47854f146c8ef1e1f13d66
Skipped observable-plot_wider-tooltip-areas.md with shot hash 0871f9336adcebf0ec24da5f9373fed9
Skipped datasette_cli-tool-that-is-also-a-plugin.md with shot hash cc42d4afa9ad1cb25ebeedcba62b70bd
Skipped html_lazy-loading-images.md with shot hash 052959fdc6df4e0f75c906f7115bc847
Skipped github-actions_cache-setup-py.md with shot hash b49a840f36dcf80a29a6f71227e0a753
Skipped docker_pipenv-and-docker.md with shot hash 8ef776fcdce11807608c7892b27a5e3c
Got 59593 byte PNG for googlecloud_gcloud-error-workaround.md shot hash 747e75e8be74e58e42482751143e4f90
Got 81047 byte PNG for github_github-code-search-api-uses.md shot hash 269f456f68266291edb4a7e354fbd961
Got 71664 byte PNG for gpt3_reformatting-text-with-copilot.md shot hash 387afeea2ed9b02c5a9fb8a10745fd3d
Got 51927 byte PNG for pytest_show-files-opened-by-tests.md shot hash 80495a72ee192edf259b462489e0c933
Got 67890 byte PNG for spatialite_viewing-geopackage-data-with-spatialite-and-datasette.md shot hash 02b4ec6523a8611d50e60121f2f8160c
Got 103632 byte PNG for sqlite_multiple-indexes.md shot hash 42b8a982ad6ca28674ce61212dcce57d
That fixed it!
Ran this query to generate the following:
select
group_concat('https://til.simonwillison.net/' || topic || '/' || slug || '
![](https://til.simonwillison.net/-/media/screenshot/' || path || ')', '
') as screenshot_url
from
til
where
path in (
'github_github-pages.md',
'googlecloud_gcloud-error-workaround.md',
'github_github-code-search-api-uses.md',
'gpt3_reformatting-text-with-copilot.md',
'pytest_show-files-opened-by-tests.md',
'spatialite_viewing-geopackage-data-with-spatialite-and-datasette.md',
'sqlite_multiple-indexes.md'
)
https://til.simonwillison.net/github/github-code-search-api-uses
https://til.simonwillison.net/github/github-pages
https://til.simonwillison.net/googlecloud/gcloud-error-workaround
https://til.simonwillison.net/gpt3/reformatting-text-with-copilot
https://til.simonwillison.net/pytest/show-files-opened-by-tests
https://til.simonwillison.net/spatialite/viewing-geopackage-data-with-spatialite-and-datasette
Hey, just noticed that this post has a broken image in my RSS reader, with an alt text of sqlite_multiple_indexes.md. This caught my attention, so I viewed the source and realized that the social image meta tags are linked to the source file. https://github.com/simonw/til/blob/2e7920e9a55b93cf9164639d9aad23c98f96bed5/templates/pages/%7Btopic%7D/%7Bslug%7D.html#L20-L26