QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

Builder v2: Automatically retry failed downloads and Git fetches #9232

Open DemiMarie opened 4 months ago

DemiMarie commented 4 months ago

Qubes OS release

R4.2 but the builder is release-agnostic.

Brief summary

builderv2 doesn’t retry downloads or git fetches. When the network is unreliable (which Tor is), this makes it very unlikely that an operation that involves many network calls will complete successfully. A simple example is ./qb package fetch, which implicitly fetches all packages.

Steps to reproduce

./qb package fetch over Tor.

Expected behavior

Individual operations fail, but the failing operations are retried, and eventually the command succeeds.

Actual behavior

Command fails after the first failing operation.

marmarek commented 4 months ago

Retrying fetch might be useful sometimes, but it isn't really important - it's an early stage, and retrying the call doesn't really need to redo work. OTOH, implementing this would require distinguishing the reason for the failure (temporary network issue, config error like wrong URL or component name, missing/wrong signature etc). Retrying on non-network errors would actually make things significantly worse.
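The distinction described above could be sketched roughly as follows. This is only an illustration, not builderv2 code: the two exception classes and the `fetch_with_retry` helper are hypothetical names, and how builderv2 would actually classify a failure as temporary or permanent is left open.

```python
import time


class TransientFetchError(Exception):
    """Hypothetical: temporary network problem (timeout, connection reset)."""


class PermanentFetchError(Exception):
    """Hypothetical: config or verification problem (wrong URL, bad signature)."""


def fetch_with_retry(fetch, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch() and retry only when it raises TransientFetchError.

    Any other exception, including PermanentFetchError, propagates at once,
    so a wrong URL or a bad signature is never retried.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except TransientFetchError:
            if attempt == attempts:
                raise  # out of attempts: report the last transient failure
            # Exponential backoff between attempts: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** (attempt - 1))
```

Keeping the retry decision tied to the exception type means the hard part, deciding which failures are temporary, stays in one place, wherever the error is first raised.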

DemiMarie commented 4 months ago

> Retrying fetch might be useful sometimes, but it isn't really important - it's an early stage, and retrying the call doesn't really need to redo work.

The main problem is when one is fetching many components and only one of them fails. In this case, retrying the call will unnecessarily refetch the successfully-fetched components too, and this refetch might itself fail. This both wastes effort and decreases reliability.
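To make the difference concrete, here is a minimal sketch of retrying per component instead of per call. The function `fetch_components` and its `fetch_one` callback are hypothetical, not part of builderv2, and treating `OSError` as "temporary network failure" is an assumption made only to keep the example short.

```python
def fetch_components(components, fetch_one, max_rounds=3):
    """Fetch every component, re-attempting only the ones that failed.

    fetch_one(name) is a caller-supplied callable (hypothetical here).
    OSError stands in for a temporary network failure; that is an assumption.
    Returns (fetched, failed), two lists of component names.
    """
    pending = list(components)
    fetched = []
    for _ in range(max_rounds):
        still_failing = []
        for name in pending:
            try:
                fetch_one(name)
            except OSError:
                still_failing.append(name)  # try again in the next round
            else:
                fetched.append(name)  # never fetched again after success
        pending = still_failing
        if not pending:
            break
    return fetched, pending
```

With this shape, a component that succeeded in the first round is not contacted again, so a later failure can only come from a component that was already failing.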

If the probability of a given call succeeding is X, and assuming that the success or failure of each fetch is independent and identically distributed, then the probability that fetching N components will all succeed is X^N, so a much larger number of retries will be needed than if each component was fetched separately.
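The arithmetic can be checked with a few lines of Python. The function names are made up for this illustration, and the numbers in the usage note (X = 0.9, N = 50) are arbitrary example values, not measurements. The expected counts use the standard result that an event with success probability p needs 1/p attempts on average.

```python
def batch_success_probability(x, n):
    """Probability that all n independent fetches succeed in a single call."""
    return x ** n


def expected_batch_attempts(x, n):
    """Expected number of whole-call attempts until one succeeds: 1 / x**n."""
    return 1 / batch_success_probability(x, n)


def expected_fetches_per_component(x, n):
    """Expected total fetches when each component is retried separately.

    Each component needs 1 / x attempts on average, so n components need n / x.
    """
    return n / x
```

For example, with X = 0.9 and N = 50, `batch_success_probability(0.9, 50)` is roughly 0.005, so the whole call would be expected to need on the order of 190 attempts, while `expected_fetches_per_component(0.9, 50)` comes to about 56 individual fetches in total.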

marmarek commented 4 months ago

Well, if you just want to do a fresh fetch, you can set skip-git-fetch: true and it will skip already fetched components. But if you want to update already fetched components too, then yes, it will attempt the ones that already succeeded as well. It will only check for updates, though, as all the objects will already have been fetched for those that succeeded earlier.

Anyway, the fetch retry feature request is valid, just not very important.