Closed: martinhpedersen closed this issue 2 years ago.

The currently hardcoded URL that Pat scrapes to get the latest Forms templates is bound to fail at some point. (Spoiler: it's a winlink.org blog post 😨)

How about moving this code to a small cloud-hosted service and exposing a JSON API that Pat instances can poll to get this information? Then we could easily update the service on short notice when the current approach finally stops working.

Example API: see the sketch below.

We could also consider hosting this as a static file/object (on GitHub Pages, AWS S3 or similar), and periodically scraping winlink.org and updating the object using cron on any machine available to us. Hosting it this way would give us free fault tolerance and a very efficient cache.

Ping @xylo04 @rainerg2000
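Something along these lines, perhaps (a sketch only; the exact fields are up for discussion):

```json
{
  "version": "1.0.171",
  "archive_url": "https://api.onedrive.com/v1.0/shares/............",
  "_generated": "2021-12-03T22:46:46.47993Z"
}
```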
In case we want to pursue this further, I threw together a small app that we could use to generate this JSON object. It writes to stdout.
https://gist.github.com/martinhpedersen/337b08b6617dcf256eac3934e48a8a3d
Output:
```json
{
  "version": "1.0.171",
  "archive_url": "https://api.onedrive.com/v1.0/shares/............",
  "_generated": "2021-12-03T22:46:46.47993Z"
}
```
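For those who don't want to follow the link, the core of it looks roughly like this (a trimmed sketch, not the gist verbatim; the winlink.org URL and the version regexp are placeholders):

```go
// versiongen: a sketch of a generator that scrapes the current Forms
// version and writes the JSON object to stdout.
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"regexp"
	"time"
)

// Placeholder: the real scraper targets a specific winlink.org blog post.
const sourceURL = "https://winlink.org/..."

// Assumed pattern for extracting a semantic version from the page.
var versionRe = regexp.MustCompile(`([0-9]+\.[0-9]+\.[0-9]+)`)

type info struct {
	Version    string    `json:"version"`
	ArchiveURL string    `json:"archive_url"`
	Generated  time.Time `json:"_generated"`
}

func main() {
	resp, err := http.Get(sourceURL)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	m := versionRe.FindSubmatch(body)
	if m == nil {
		fmt.Fprintln(os.Stderr, "no version found on page")
		os.Exit(1)
	}

	// Emit the JSON object, matching the example output above.
	enc := json.NewEncoder(os.Stdout)
	enc.SetIndent("", "  ")
	enc.Encode(info{
		Version:    string(m[1]),
		ArchiveURL: "https://api.onedrive.com/v1.0/shares/...", // also scraped in practice
		Generated:  time.Now().UTC(),
	})
}
```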
Great idea to get that out of our installed binary! We could look at using something like an AWS Lambda/GCP Function for hosting this. That could work either as a synchronous HTTP call or a scheduled save to S3/GCS. I do like the idea of the S3/GCS hosted file: it's not going to change that often and it buffers the Winlink website from extra load. Plus, if there's an execution problem with the scraper, we can continue serving the last known version info.
Let me know if you want help setting that up. I'm using GCP Functions for https://github.com/k0swe/forester-func.
Actually, GitHub Pages would be very viable for this, too.
How about a scheduled GitHub Actions workflow publishing the JSON to GitHub Pages? 🤓
This would be fully transparent and easily maintained through PRs 🙂 It would also allow for temporary manual updates of the JSON file through PRs if our scraper stops working.
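Something like this, maybe (just a sketch; the schedule, paths, Go version, and publish action are placeholders to be worked out):

```yaml
name: Update form version info

on:
  schedule:
    - cron: '0 */6 * * *' # every six hours, for example
  workflow_dispatch:      # allow manual runs while debugging

jobs:
  scrape-and-publish:
    runs-on: ubuntu-latest
    steps:
      # Scraper code lives on the main branch.
      - uses: actions/checkout@v3
      - uses: actions/setup-go@v3
        with:
          go-version: '1.19'
      # Hypothetical entry point; adjust to wherever the scraper ends up.
      - name: Generate JSON
        run: |
          mkdir -p public/v1/forms/standard-templates
          go run . > public/v1/forms/standard-templates/latest
      # Publish the generated tree to the ghpages branch.
      - name: Publish to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_branch: ghpages
          publish_dir: ./public
```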
We could create a subdomain under getpat.io that routes to this GH Pages repo, with a free SSL certificate provided by GitHub, as we have done with getpat.io itself.
If you want to have a go at it @xylo04, I can create the repo and give you admin rights.
Btw, are GitHub Actions free? 🤔
Perfect plan! Yes, I'll try putting it together.
GitHub Actions are free as long as the repo is public, which ours will be.
Superb! 😄
I've created the repo (pat-api) and created an orphan branch `ghpages` for the actual GitHub Pages source. I also added an example file that we can target for updates by the scraper:

`curl https://la5nta.github.io/pat-api/v1/forms/standard-templates/latest`

We can use the `main` branch for the actual scraper code and Actions config.
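On the Pat side, polling the endpoint could then be as simple as something like this (a sketch; the struct mirrors the example JSON above, and the final schema is still open):

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"time"
)

// formsInfo mirrors the published JSON object.
type formsInfo struct {
	Version    string    `json:"version"`
	ArchiveURL string    `json:"archive_url"`
	Generated  time.Time `json:"_generated"`
}

func main() {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get("https://la5nta.github.io/pat-api/v1/forms/standard-templates/latest")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var v formsInfo
	if err := json.NewDecoder(resp.Body).Decode(&v); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Latest standard templates: %s (%s)\n", v.Version, v.ArchiveURL)
}
```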