tplive / comicbot

Scheduled job that fetches comics from the Teknisk Ukeblad (tu.no) website and XKCD and posts links to Slack.
MIT License

Can we support Comic Agilè? #9

Open tplive opened 2 years ago

tplive commented 2 years ago

Comic Agilè is a Danish comic about all things DevOps. Can we make Comicbot work with it?

Initially there is no "order" to the comics, apart from their numbering scheme (today's comic is #172), so keeping track of which ones have already been fetched and posted could pose a challenge. But maybe we can keep state in a counter backend? I found CountAPI, which could work, provided we are able to configure it with a persistent domain name for namespace creation. We just need to know the number of the last comic we fetched. So we would get the value first, compare it to whatever is on the front page of Comic Agilè, and if it's newer, fetch the comic, then update CountAPI with its new number. Easy as pie!
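
The CountAPI flow could be sketched roughly like this. A minimal sketch, assuming CountAPI's `/get` and `/set` endpoints; the namespace and key names here are made up for illustration, and scraping the front page for the current number is left out:

```python
import json
import urllib.request

COUNTAPI = "https://api.countapi.xyz"
NAMESPACE = "example.com"        # hypothetical; CountAPI namespaces are domain-based
KEY = "comic_agile_last"         # hypothetical key name

def get_last_posted() -> int:
    """Read the number of the last comic we posted from CountAPI."""
    with urllib.request.urlopen(f"{COUNTAPI}/get/{NAMESPACE}/{KEY}") as resp:
        return json.load(resp)["value"]

def set_last_posted(number: int) -> None:
    """Store the new comic number after a successful post."""
    urllib.request.urlopen(f"{COUNTAPI}/set/{NAMESPACE}/{KEY}?value={number}")

def should_post(front_page_number: int, last_posted: int) -> bool:
    """Post only if the front page shows a newer comic than the one we stored."""
    return front_page_number > last_posted
```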

Or could it be done EVEN simpler? Can we keep track of values in Github Actions?

tplive commented 1 year ago

I have experimented a bit with the comics' URL structure and with storing the URLs at kvdb.io.

Example URL: https://i0.wp.com/www.comicagile.net/wp-content/uploads/2022/12/Comic-agile_224.jpg?ssl=1

There is an arbitrary number of comics for each /year/month/; it can also be 0.

  1. Initialize the solution by storing the latest URL manually at kvdb.io: `curl -d '<URL>' https://kvdb.io/<BUCKET_ID>/<PREFIX><NUMBER>`, so for instance `curl -d 'https://i0.wp.com/www.comicagile.net/wp-content/uploads/2022/12/Comic-agile_224.jpg?ssl=1' https://kvdb.io/xxxxxxxxxxxxxxxxxxxxxxxx/comic_agile_224`

Then run discovery every day:

  1. Get stored keys with the prefix `comic_agile_`.
  2. Find the highest number.
  3. Get the URL from the highest number.
  4. Get the month and year from the URL.
  5. Increment the comic number and try to get a URL for it. If there is no comic at that URL, the request returns the error message "We cannot complete this request, remote data could not be fetched".
  6. If it fails, increment the month and try again, up to the current month.
  7. Also increment the year, up to the current year.
  8. If it still fails, there is no comic with that number yet. Exit.
  9. If we find a valid URL, store it with the incremented comic number.
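
The discovery loop above could look roughly like this. A sketch under stated assumptions: the `i0.wp.com` host, the `Comic-agile_<n>.jpg` filename pattern, and the error behaviour are generalized from the single example URL, and all function names are made up:

```python
import datetime
import urllib.request

def comic_url(year: int, month: int, number: int) -> str:
    """Build a candidate URL following the observed pattern."""
    return (f"https://i0.wp.com/www.comicagile.net/wp-content/uploads/"
            f"{year}/{month:02d}/Comic-agile_{number}.jpg?ssl=1")

def month_candidates(start_year, start_month, today=None):
    """Yield (year, month) pairs from the last known month up to the current one."""
    today = today or datetime.date.today()
    year, month = start_year, start_month
    while (year, month) <= (today.year, today.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1

def exists(url):
    """Probe a URL; a failed fetch means no comic lives there."""
    try:
        with urllib.request.urlopen(urllib.request.Request(url, method="HEAD")):
            return True
    except OSError:
        return False

def discover_next(last_url, last_number):
    """Try the next comic number in every month from the stored month onward."""
    parts = last_url.split("/")
    year, month = int(parts[6]), int(parts[7])  # .../uploads/<year>/<month>/...
    for y, m in month_candidates(year, month):
        url = comic_url(y, m, last_number + 1)
        if exists(url):
            return url
    return None  # no comic with that number yet
```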

We may need a way to download all the URLs to avoid losing the "streak", since a free kvdb.io account is a trial and will expire. This could be solved with a paid account, or by finding a free service.
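
Downloading everything for a backup might be as simple as listing the bucket. As I remember kvdb.io's API, a plain GET on the bucket URL lists the keys, and there are query parameters for including values; the exact parameter names below are from memory and worth verifying against their docs:

```shell
# List all keys in the bucket (one per line):
curl https://kvdb.io/<BUCKET_ID>/

# Dump keys and values to a local backup file (parameter names unverified):
curl 'https://kvdb.io/<BUCKET_ID>/?values=true&format=json' > backup.json
```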

tplive commented 1 year ago

> Or could it be done EVEN simpler? Can we keep track of values in Github Actions?

https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts

It looks like you can only pass artifacts between jobs in the same workflow run.
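
An alternative to artifacts might be `actions/cache`, which can restore entries saved by previous runs via `restore-keys`. A sketch only; the file name and key prefix are made up, and since cache entries are immutable, each run saves under a fresh key and restores the most recent earlier one:

```yaml
# Persist a small state file between scheduled runs with actions/cache.
- uses: actions/cache@v3
  with:
    path: last_comic.txt
    key: comic-agile-state-${{ github.run_id }}
    restore-keys: |
      comic-agile-state-
```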