Open dontcallmedom opened 2 years ago
So it looks like solving https://github.com/w3c/reffy/issues/850 would give us ~1min30 as the baseline for a no-update workflow run, and updating one spec is probably on the order of ~10s. Running a full crawl might therefore be a reasonable approach, although we should expect the baseline to grow in proportion to the number of specs being crawled.
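To make the back-of-the-envelope numbers above easier to play with, here is a minimal sketch of the estimate. The function names and the linear cost model are my own assumptions for illustration; only the ~90s baseline and ~10s per updated spec come from the figures quoted above.

```python
def estimate_run_seconds(updated_specs, base_seconds=90, per_update_seconds=10):
    """Rough duration of one full-crawl run (hypothetical linear model).

    base_seconds: cost of a run in which nothing changed (~1min30 above).
    per_update_seconds: extra cost per spec that actually changed (~10s above).
    """
    if updated_specs < 0:
        raise ValueError("updated_specs must be non-negative")
    return base_seconds + per_update_seconds * updated_specs


def scaled_base_seconds(base_seconds, current_spec_count, future_spec_count):
    """Baseline after the spec list grows, assuming it scales proportionally."""
    if current_spec_count <= 0:
        raise ValueError("current_spec_count must be positive")
    return base_seconds * future_spec_count / current_spec_count
```

For example, `estimate_run_seconds(1)` gives 100, i.e. a run that picks up a single spec update would take about 1min40 under these assumptions.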
For the more efficient single-spec update approach, we might be able to use https://github.com/softprops/turnstyle as a way to ensure trigger events are processed sequentially - see also https://github.community/t/race-condition-possible-from-rapidly-executed-concurrent-github-actions/137411/3
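To illustrate what "processed sequentially" means here, the following is a minimal in-process sketch: a single worker drains a FIFO queue so that no two updates ever overlap. It is not how turnstyle works internally (turnstyle waits on earlier workflow runs); `SequentialUpdater` and `update_fn` are hypothetical names used only for this example.

```python
import queue
import threading


class SequentialUpdater:
    """Runs update jobs one at a time, in the order they were submitted."""

    def __init__(self, update_fn):
        self._update_fn = update_fn      # hypothetical per-spec update callable
        self._jobs = queue.Queue()       # FIFO queue preserves arrival order
        self.failures = []               # (spec, exception) pairs, for inspection
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, spec):
        """Queue a spec for update; returns immediately."""
        self._jobs.put(spec)

    def wait(self):
        """Block until every queued job has been processed."""
        self._jobs.join()

    def _run(self):
        while True:
            spec = self._jobs.get()
            try:
                self._update_fn(spec)
            except Exception as exc:     # keep the worker alive on failure
                self.failures.append((spec, exc))
            finally:
                self._jobs.task_done()
```

Because there is only one worker thread, a second trigger that arrives while an update is running simply waits its turn instead of racing to push.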
In a variety of contexts (CI in particular, but likely also in the context of the data re-used by spec authoring tools), it would be ideal if the content in webref reflected changes in the underlying documents in close to real-time.
One way we could enable this (at least partially) is by having spec repos trigger a webref update for a given spec whenever that spec's main source file is updated. This could typically be achieved with a webhook installed at the repo level or (more likely, for scaling) at the org level.
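As a rough sketch of the filtering such a webhook receiver would need, the function below inspects a GitHub `push` event payload and decides whether the spec's main source file changed on the default branch. The `main_source_files` mapping (repo full name to source file path) and the function name are assumptions for illustration; webref does not necessarily store this information in that form. The fallback to `main` when `default_branch` is absent is also an assumption.

```python
def should_trigger_update(payload, main_source_files):
    """Return True if a push event touched the repo's main spec source.

    payload: parsed JSON body of a GitHub `push` webhook event.
    main_source_files: hypothetical mapping such as
        {"w3c/example-spec": "index.html"}.
    """
    repo = payload.get("repository", {})
    main_file = main_source_files.get(repo.get("full_name"))
    if main_file is None:
        return False  # not a repo we track

    # Only pushes to the default branch should refresh the crawled data.
    default_branch = repo.get("default_branch", "main")
    if payload.get("ref") != f"refs/heads/{default_branch}":
        return False

    # Each commit in a push event lists the paths it added and modified.
    for commit in payload.get("commits", []):
        touched = commit.get("added", []) + commit.get("modified", [])
        if main_file in touched:
            return True
    return False
```

Filtering at this stage keeps unrelated pushes (README edits, CI config changes, feature branches) from triggering a crawl at all.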
One issue is that if several updates are processed at the same time, they would likely trigger an error when pushing the results. This could be avoided either by changing the timing of how checkouts and crawls are organized, or by doing a full crawl (with HTTP caching optimizations to reduce the time / network impact).
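One possible reading of "changing the timing" is to re-sync the checkout right before pushing and to retry when the push is rejected. The sketch below shows that idea only; it is an assumption on my part, not something the workflow currently does. The helper names are hypothetical, and the `git` helpers use plain `git push` and `git pull --rebase`.

```python
import subprocess


def git_push(repo_dir):
    """Try to push; True on success (hypothetical helper)."""
    return subprocess.run(["git", "push"], cwd=repo_dir).returncode == 0


def git_resync(repo_dir):
    """Rebase local commits onto the remote branch; True on success."""
    return subprocess.run(["git", "pull", "--rebase"], cwd=repo_dir).returncode == 0


def push_with_retry(push, resync, max_attempts=3):
    """Push, re-syncing and retrying if another run pushed first.

    push, resync: zero-argument callables returning True on success.
    Returns the attempt number that succeeded, or 0 if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if push():
            return attempt
        # Push was rejected: catch up with the remote before trying again.
        if attempt < max_attempts and not resync():
            break  # rebase failed (e.g. conflicting edits), give up
    return 0
```

A caller would wire it up as `push_with_retry(lambda: git_push(path), lambda: git_resync(path))`, where `path` is the local checkout. This only helps when concurrent runs touch different files; two updates to the same spec could still conflict during the rebase, which is where sequential processing or a full crawl would be the safer option.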