fosslinux / live-bootstrap

Use of a Linux initramfs to fully automate the bootstrapping process

Downloads often fail inexplicably #363

Closed lrvick closed 8 months ago

lrvick commented 8 months ago

I have been trying to get the chroot process to work. Among other issues, when cleaning and starting from scratch, downloads often fail repeatedly, but only with the Python downloader, not with wget.

Example:

./rootfs.py --chroot

Bootstrapping x86 -- SysA
Downloading: https://download.savannah.gnu.org/releases/nyacc/nyacc-1.00.2.tar.gz
Traceback (most recent call last):
  File "/home/lrvick/Sources/live-bootstrap/./rootfs.py", line 265, in <module>
    main()
  File "/home/lrvick/Sources/live-bootstrap/./rootfs.py", line 161, in main
    bootstrap(args, generator, tmpdir)
  File "/home/lrvick/Sources/live-bootstrap/./rootfs.py", line 174, in bootstrap
    generator.prepare(using_kernel=False)
  File "/home/lrvick/Sources/live-bootstrap/lib/generator.py", line 80, in prepare
    self.steps()
  File "/home/lrvick/Sources/live-bootstrap/lib/generator.py", line 101, in steps
    self.get_packages(source_manifest)
  File "/home/lrvick/Sources/live-bootstrap/lib/generator.py", line 304, in get_packages
    path = self.download_file(line[2], line[1], line[3])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lrvick/Sources/live-bootstrap/lib/generator.py", line 296, in download_file
    raise requests.HTTPError("Download failed.")
requests.exceptions.HTTPError: Download failed.
Removing /home/lrvick/Sources/live-bootstrap/tmp

I adjusted the downloader to log the URL to make manual downloads easier.

I can in fact go to a shell and download it without issue:

wget https://download.savannah.gnu.org/releases/nyacc/nyacc-1.00.2.tar.gz
--2023-12-24 00:34:57--  https://download.savannah.gnu.org/releases/nyacc/nyacc-1.00.2.tar.gz
Resolving download.savannah.gnu.org (download.savannah.gnu.org)... 209.51.188.200, 2001:470:142:5::200
Connecting to download.savannah.gnu.org (download.savannah.gnu.org)|209.51.188.200|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://mirrors.sarata.com/non-gnu/nyacc/nyacc-1.00.2.tar.gz [following]
--2023-12-24 00:34:58--  https://mirrors.sarata.com/non-gnu/nyacc/nyacc-1.00.2.tar.gz
Resolving mirrors.sarata.com (mirrors.sarata.com)... 92.204.136.52
Connecting to mirrors.sarata.com (mirrors.sarata.com)|92.204.136.52|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1265055 (1.2M) [application/x-gzip]
Saving to: ‘nyacc-1.00.2.tar.gz’

nyacc-1.00.2.tar.gz           100%[==============================================>]   1.21M  2.64MB/s    in 0.5s    

2023-12-24 00:34:59 (2.64 MB/s) - ‘nyacc-1.00.2.tar.gz’ saved [1265055/1265055]

Re-running the Python downloader gives the same failure. I have no idea what the two are doing so differently.

Manually downloading with wget when this happens and moving the file to distfiles is how I am able to make progress at the moment.

Not a hard blocker, but frustrating and wanted to flag it as an issue.

Running Python 3.11.2 from Debian 12 running under QubesOS 4.1

This only seems to happen with downloads from download.savannah.gnu.org

All others work as expected.

Googulator commented 8 months ago

Does setting the header "User-Agent": "curl/7.88.1" in generator.py solve this for you?

My current running theory is that savannah is throttling downloads with a "requests" user agent, thinking it's a bot. Unfortunately we can't just use a browser-like user agent (e.g. emulating Firefox), because then SourceForge redirects downloads to a splash page.
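For anyone who wants to try this outside the repo, here is a minimal sketch of a download with a curl-like User-Agent. It is not the actual code in lib/generator.py: the function name, its parameters, and the `session` argument are hypothetical, and it assumes the requests library is installed.

```python
# Assumption: the curl-like User-Agent value suggested in this thread.
CURL_USER_AGENT = "curl/7.88.1"


def download_with_curl_agent(url, dest_path, session=None, timeout=30):
    """Hypothetical helper: fetch url into dest_path using a curl-like User-Agent."""
    if session is None:
        # Third-party library; imported lazily so the helper can be
        # exercised offline with a stand-in session object.
        import requests
        session = requests.Session()
    # requests follows redirects on GET by default, so a 302 to a mirror
    # ends up as the mirror's final status code here.
    response = session.get(
        url,
        headers={"User-Agent": CURL_USER_AGENT},
        timeout=timeout,
    )
    if response.status_code != 200:
        raise RuntimeError(f"Download failed ({response.status_code}): {url}")
    with open(dest_path, "wb") as handle:
        handle.write(response.content)
    return dest_path
```

Calling it with the nyacc URL from the report and a path under distfiles should show whether the header alone changes the result. Passing a session explicitly is only there to make the helper easy to test without network access.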

rick-masters commented 8 months ago

This has been a consistent problem for me for a long time and the solution suggested by @Googulator works for me. PR forthcoming.