rpm-software-management / fedora-distro-aliases

Aliases for active Fedora releases

Fallback for when Bodhi API is down #11

Open FrostyX opened 7 months ago

FrostyX commented 7 months ago

When critical tools and services such as Mock, Packit, and Copr start using this package, it will be important that they do not fail when the Bodhi API is unavailable. Possible solutions:

praiskup commented 7 months ago

> Packit defines a static dictionary to fall back on, but I would like to avoid that at all costs (IMHO it defeats the purpose of this package)

I agree, in the sense that it takes time to ship updated data through RPM repositories. But still, even doing this "fallback" in a single place would be a big win, better than implementing fallbacks in mock, copr, packit, ...

praiskup commented 7 months ago

> @praiskup suggested having multiple sources of truth besides Bodhi and falling back to them

I rather meant that we (this package) could be the ultimate source of truth, not calling bodhi API at all - but the data would be synced (automatically) with bodhi. And to have the additional "reliability", we could distribute the data to multiple places (e.g. several git forges, copr-be backend, koji).
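To make the "multiple places" idea concrete, here is a minimal sketch of a client that tries several copies of the same data in order and uses the first one that answers. Everything in it is an assumption for illustration: the name load_releases, the AllSourcesFailed exception, and the idea that each source is a callable returning JSON text are hypothetical and not part of this package.

```python
import json
from typing import Callable, Iterable


class AllSourcesFailed(Exception):
    """Hypothetical error: none of the configured sources returned usable data."""


def load_releases(sources: Iterable[Callable[[], str]]) -> list:
    """Return the parsed release list from the first source that works.

    Each source is a zero-argument callable returning JSON text, for example
    a function that downloads a file from one git forge or from copr-be.
    """
    errors = []
    for fetch in sources:
        try:
            return json.loads(fetch())
        except (OSError, ValueError) as ex:
            # OSError covers network and file errors, ValueError covers bad JSON.
            errors.append(ex)
    raise AllSourcesFailed(errors)
```

The order of the sources decides which copy wins, so the most frequently synced one should come first.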

FrostyX commented 7 months ago

> But still, even doing this "fallback" in a single place would be a big win, better than implementing fallbacks in mock, copr, packit, ...

I agree. But the whole motivation for creating this project was "We are annoyed by updating our Tito releasers every half a year", and I'd like to avoid going back to that approach, even if only in one project.

However, I see a technical issue as well. A fallback to a manually defined dictionary is easy for a service like Packit because they can simply update their config and the change is live. In our case, we make the change and then we need to do a release, get the update into the repositories, and get it onto the user machines. The timing would be impossible. If we do the release too early, it will contain a distribution that doesn't exist yet. If we do it late enough to be safe, there will be a long period without the latest distribution. Additionally, we cannot know who uses the package and how often they update it.

> I rather meant that we (this package) could be the ultimate source of truth, not calling bodhi API at all - but the data would be synced (automatically) with bodhi. And to have the additional "reliability", we could distribute the data to multiple places (e.g. several git forges, copr-be backend, koji).

The first sentence confuses me. What do you mean by "not calling bodhi API at all - but the data would be synced (automatically) with bodhi"?

> And to have the additional "reliability", we could distribute the data to multiple places (e.g. several git forges, copr-be backend, koji).

I still think caching would be better because that would work even in cases when the client is temporarily offline.

FrostyX commented 7 months ago

> @FrostyX suggested caching the results (can be in this package, can be on the side of the caller) and returning the

I'd like to elaborate on the caching idea. There are two approaches that I am thinking about. Not sure which is better.

  1. We could add caching=True to our get_distro_aliases(), which would allow users to optionally disable it. If enabled, the function would dump the releases to ~/.cache/fedora-distro-aliases/cache.json on success. If a subsequent call fails to get releases = bodhi_active_releases(), they would be parsed from the cache.
  2. We could make the caching more explicit and provide something like save_cache(releases) and load_cache(). get_distro_aliases() would fail with a predictable exception if the Bodhi API doesn't work, and it would be up to the user to save the cache and try-except the call to fall back on the cache.
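A rough sketch of what the first (implicit) approach could look like. This is not the package's code: get_releases, the fetch argument (standing in for bodhi_active_releases()), and the choice of OSError as the failure signal are all assumptions made so the example runs without network access.

```python
import json
from pathlib import Path

# Cache location taken from the proposal above.
CACHE_PATH = Path.home() / ".cache" / "fedora-distro-aliases" / "cache.json"


def get_releases(fetch, caching=True, cache_path=CACHE_PATH):
    """Fetch releases, transparently falling back to the on-disk cache.

    `fetch` is a hypothetical stand-in for bodhi_active_releases(); it is
    assumed to raise OSError when the Bodhi API cannot be reached.
    """
    try:
        releases = fetch()
    except OSError:
        if caching and cache_path.exists():
            # Bodhi is unreachable: return the last successful result.
            return json.loads(cache_path.read_text())
        raise
    if caching:
        # Refresh the cache after every successful call.
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(json.dumps(releases))
    return releases
```

With the second (explicit) approach, the try/except and the cache read would move out of this function and into the caller.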

Also, we could timestamp the cache and refuse to use it if it isn't reasonably recent.
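For illustration, a timestamped cache with explicit save_cache() and load_cache() helpers could look roughly like this. The file layout (a saved_at field next to the releases), the CacheTooOld exception, and the max_age_seconds and now parameters are assumptions of this sketch. What counts as "reasonably recent" is left to the caller.

```python
import json
import time
from pathlib import Path


class CacheTooOld(Exception):
    """Hypothetical error raised when the cached data is older than allowed."""


def save_cache(releases, path: Path, now=None):
    """Write the releases together with the time they were saved."""
    saved_at = time.time() if now is None else now
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"saved_at": saved_at, "releases": releases}))


def load_cache(path: Path, max_age_seconds, now=None):
    """Return the cached releases, refusing data older than max_age_seconds."""
    payload = json.loads(path.read_text())
    now = time.time() if now is None else now
    age = now - payload["saved_at"]
    if age > max_age_seconds:
        raise CacheTooOld(f"cache is {age:.0f} seconds old")
    return payload["releases"]
```

The now parameter exists only so the age check can be exercised without waiting; real callers would leave it unset.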

The only tricky case, as @praiskup pointed out, would be Copr builders, which are always fresh. If the Bodhi API outages are short (seconds / minutes), we could execute bodhi_active_releases() (or ideally the CLI tool proposed in #8) from our builder provisioning playbook and therefore be sure that every builder that is up has the cache available. In the worst case, if Bodhi API outages are long (hours / days), we can create the cache on copr-backend and distribute it to the builders in the provisioning playbook.

praiskup commented 7 months ago

> The first sentence confuses me. What do you mean by "not calling bodhi API at all - but the data would be synced (automatically) with bodhi"?

Some cron job somewhere automatically synchronizes the data we provide with Bodhi, but the package/API itself doesn't call Bodhi.

xsuchy commented 7 months ago

I just had a discussion with @humaton, and he told me that the Bodhi API should be rock stable. He has not seen issues for ages. The only problem that Packit could have seen was likely caused by infra proxies. Anyway, if it happens, we should let him (or infra) know, and they will fix it with top priority.

xsuchy commented 7 months ago

I set up a meeting on Monday where we can discuss the solution.

I believe we can set up a workflow in which distro-aliases uses aggressive caching. And for Mock and Copr builders, if the request fails with an error, distro-aliases will raise an Exception.

In Mock, we will basically use only the translation of 'rawhide' to 'version_number'. When the request to Bodhi fails and the cache does not exist:

  1. Local users can run something like mock --rawhide=41, and Mock will use that value while handling the exception from distro-aliases. This lets users override the lookup when they know better.
  2. For Copr builders this is not a solution. But I would not mind rawhide builders failing when the Bodhi API is down. Or we can populate the cache during creation of builder images.
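A small sketch of how the caller side could handle this, assuming distro-aliases raises a dedicated exception. The names BodhiUnavailable and resolve_rawhide are hypothetical, and the override argument stands for the value of the proposed --rawhide option, which is only an idea from this thread.

```python
class BodhiUnavailable(Exception):
    """Hypothetical exception raised when neither Bodhi nor a cache is usable."""


def resolve_rawhide(lookup, override=None):
    """Translate 'rawhide' to a version number.

    `lookup` stands in for the distro-aliases call and is assumed to raise
    BodhiUnavailable on failure. `override` is the user-supplied version
    (e.g. from the proposed --rawhide=41), used only when the lookup fails.
    """
    try:
        return lookup()
    except BodhiUnavailable:
        if override is not None:
            return override
        # No override given (the Copr builder case): let the failure propagate.
        raise
```

On a Copr builder there is no user to supply an override, so the exception propagates and the rawhide build fails, which matches the second point above.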