actions / runner-images

GitHub Actions runner images

Build fails on issue in pypy.sh downloading compressed file #1066

Closed tkwaks closed 4 years ago

tkwaks commented 4 years ago

Hi,

Our build has been failing since the latest commit (330e62a) to this file: /virtual-environments/images/linux/scripts/installers/pypy.sh

After some investigation, I found the problem on line 76.

The pypy.sh install script has an issue when downloading the https://downloads.python.org/pypy/ page: the server responds with a gzip-compressed file.

Line 76 reads: pypyVersions="$(cat /tmp/pypyUrls.html | grep 'linux64' | awk -v uri="$uri" -F'>|<' '{print uri$5}')"

The grep command receives the gzip-compressed data on standard input from cat. This cannot work, because grep cannot find the pattern linux64 in compressed input.

Line 76 should decompress /tmp/pypyUrls.html and pipe the result to grep, like this: pypyVersions="$(gzip -cd /tmp/pypyUrls.html | grep 'linux64' | awk -v uri="$uri" -F'>|<' '{print uri$5}')"
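A minimal sketch of the proposed pipeline, written so that it tolerates both a gzip-compressed and a plain saved page. The function name list_pypy_urls is hypothetical. It relies on GNU gzip copying non-gzip input through unchanged when -f is combined with -c. Which awk field holds the file name depends on the real page markup, so `$5` is taken from the issue rather than verified here.

```shell
# list_pypy_urls PAGE URI
# Hypothetical helper: print the linux64 download URLs found in a saved index
# page, whether or not the page was stored gzip-compressed.
list_pypy_urls() {
    local page="$1"   # path to the saved index page
    local uri="$2"    # base URL prepended to each file name

    # gzip -cdf decompresses gzip input and passes any other input through
    # unchanged, so one pipeline covers both cases.
    gzip -cdf < "$page" \
        | grep 'linux64' \
        | awk -v uri="$uri" -F'>|<' '{print uri$5}'
}
```

It could be called as list_pypy_urls /tmp/pypyUrls.html "https://downloads.python.org/pypy/". The plain `gzip -cd` from the comment above would instead fail on builders where the page arrives uncompressed.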

I hope this can be fixed, as it breaks our releases of new build agents.

Best regards, Thomas Kwaks

vsafonkin commented 4 years ago

Hi @tkwaks , thank you for your report! I've checked this code and it works correctly: https://github.com/vsafonkin/check-pypy-versions/runs/780207913?check_suite_focus=true

We read the HTML content of pypyUrls.html, which contains the download URLs. The cat command sends plain HTML text to standard output, and grep handles it correctly.

tkwaks commented 4 years ago

We still have the same issue:

Running your code on Ubuntu 18.04, I still see the issue occurring.

DiederickA commented 4 years ago

In this commit the code to fetch this data changed from curl -4 -s --compressed https://downloads.python.org/pypy/ to wget https://downloads.python.org/pypy/ --output-document=pypy.html --no-verbose

Running these in an Ubuntu 20.04 environment gives different results:

Diederick$ curl -4 -s --compressed https://downloads.python.org/pypy/

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
...

Diederick$ wget https://downloads.python.org/pypy/ --output-document=pypyUrls.html --no-verbose
2020-06-18 08:11:04 URL:https://downloads.python.org/pypy/ [12513/12513] -> "pypyUrls.html" [1]
Diederick$ file pypyUrls.html
pypyUrls.html: gzip compressed data, from Unix, original size modulo 2^32 167824

It seems you get different results. Can this be platform dependent?

vsafonkin commented 4 years ago

@tkwaks , I'm really confused, on GitHub Actions this code works correctly:

wget https://downloads.python.org/pypy/ --output-document=pypyUrls.html --no-verbose
file pypyUrls.html
cat pypyUrls.html

https://github.com/vsafonkin/check-pypy-versions/runs/783432659?check_suite_focus=true

I've tested it on Azure DevOps and it works there too. I've also tested it on Ubuntu 18.04 and 20.04. Could you please check your wget version? I suspect the problem lies in the default settings of different wget versions.

DiederickA commented 4 years ago

I would expect it to work for you, hence my confusion and my suggestion that this is platform dependent. I am running this in WSL 2 with Ubuntu 20.04.

On what machine are you running the Packer build?

Here is my environment:

Diederick$ wget --version
GNU Wget 1.20.3 built on linux-gnu.

-cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls
+ntlm +opie +psl +ssl/openssl

Wgetrc:
    /etc/wgetrc (system)
Locale:
    /usr/share/locale
Compile:
    gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc"
    -DLOCALEDIR="/usr/share/locale" -I. -I../../src -I../lib
    -I../../lib -Wdate-time -D_FORTIFY_SOURCE=2 -DHAVE_LIBSSL -DNDEBUG
    -g -O2 -fdebug-prefix-map=/build/wget-OYIfr9/wget-1.20.3=.
    -fstack-protector-strong -Wformat -Werror=format-security
    -DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall
Link:
    gcc -DHAVE_LIBSSL -DNDEBUG -g -O2
    -fdebug-prefix-map=/build/wget-OYIfr9/wget-1.20.3=.
    -fstack-protector-strong -Wformat -Werror=format-security
    -DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall -Wl,-Bsymbolic-functions
    -Wl,-z,relro -Wl,-z,now -lpcre2-8 -luuid -lidn2 -lssl -lcrypto -lz
    -lpsl ftp-opie.o openssl.o http-ntlm.o ../lib/libgnu.a

Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Originally written by Hrvoje Niksic <hniksic@xemacs.org>.
Please send bug reports and questions to <bug-wget@gnu.org>.

Diederick$ cat /etc/wgetrc | grep -v "^#" | grep -v "^$"
passive_ftp = on

vsafonkin commented 4 years ago

@DiederickA , could you please try to execute this code?

wget https://downloads.python.org/pypy/ --header 'Accept-encoding: identity' --output-document=pypyUrls.html --no-verbose
file pypyUrls.html
cat pypyUrls.html

I've tested wget 1.19.4 and 1.20.3 and it works.

DiederickA commented 4 years ago

That does not seem to make a difference:

Diederick$ wget https://downloads.python.org/pypy/ --header 'Accept-encoding: identity' --output-document=pypyUrls.html --no-verbose
2020-06-19 09:15:05 URL:https://downloads.python.org/pypy/ [12513/12513] -> "pypyUrls.html" [1]
Diederick$ file pypyUrls.html
pypyUrls.html: gzip compressed data, from Unix, original size modulo 2^32 167824
Diederick$ more pypyUrls.html
[binary gzip data]

tkwaks commented 4 years ago

u501@u501-Virtual-Machine:~$ wget https://downloads.python.org/pypy/ --header 'Accept-encoding: identity' --output-document=pypyUrls.html --no-verbose -d
DEBUG output created by Wget 1.20.3 on linux-gnu.

Reading HSTS entries from /home/u501/.wget-hsts
URI encoding = ‘UTF-8’
Caching downloads.python.org => 151.101.36.175 2a04:4e42:9::175
Created socket 4.
Releasing 0x0000561225b07a10 (new refcount 1).
Initiating SSL handshake.
Handshake successful; connected socket 4 to SSL handle 0x0000561225b07c20
certificate:
  subject: CN=*.c.ssl.fastly.net,O=Fastly\, Inc.,L=San Francisco,ST=California,C=US
  issuer:  CN=GlobalSign CloudSSL CA - SHA256 - G3,O=GlobalSign nv-sa,C=BE
X509 certificate successfully verified and matches host downloads.python.org

---request begin---
GET /pypy/ HTTP/1.1
User-Agent: Wget/1.20.3 (linux-gnu)
Accept: */*
Accept-encoding: identity
Host: downloads.python.org
Connection: Keep-Alive

---request end---

---response begin---
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 12513
Server: nginx/1.10.3 (Ubuntu)
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Via: 1.1 varnish
Cache-Control: max-age=365000000, immutable, public
Accept-Ranges: bytes
Date: Fri, 19 Jun 2020 08:08:06 GMT
Via: 1.1 varnish
Age: 2670135
X-Served-By: cache-fra19144-FRA, cache-ams21072-AMS
X-Cache: HIT, HIT
X-Cache-Hits: 1, 1
X-Timer: S1592554087.964050,VS0,VE6
Strict-Transport-Security: max-age=31557600

---response end---
Registered socket 4 for persistent reuse.
Parsed Strict-Transport-Security max-age = 31557600, includeSubDomains = false
Updated HSTS host: downloads.python.org:443 (max-age: 31557600, includeSubdomains: false)
URI content encoding = ‘utf-8’
2020-06-19 10:08:06 URL:https://downloads.python.org/pypy/ [12513/12513] -> "pypyUrls.html" [1]
Saving HSTS entries to /home/u501/.wget-hsts
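The response carries Content-Encoding: gzip even though the request asked for Accept-encoding: identity. The Age and X-Cache: HIT headers may indicate that a cached compressed copy was served, but that is only a guess. A small sketch for checking this on any builder is to pull the Content-Encoding value out of a header dump. The function name content_encoding is hypothetical.

```shell
# content_encoding
# Hypothetical helper: read HTTP response headers on stdin and print the
# Content-Encoding value in lower case (prints nothing if the header is absent).
content_encoding() {
    # Header names are case-insensitive; drop a trailing CR from the value.
    awk -F': *' 'tolower($1) == "content-encoding" {
        sub(/\r$/, "", $2)
        print tolower($2)
        exit
    }'
}
```

One way to feed it, assuming curl is available, is curl -s -o /dev/null -D - https://downloads.python.org/pypy/ | content_encoding. An output of gzip would mean the saved body needs decompressing before grep can read it.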

tkwaks commented 4 years ago

Hi @vsafonkin,

Could you please add the following code, or something similar, on line 76 (after the download_with_retries call)?

file -s /tmp/pypyUrls.html | grep gzip && mv /tmp/pypyUrls.html /tmp/pypyUrls.html.gz && gunzip /tmp/pypyUrls.html.gz

This will ensure that our build will work again and yours will not break. (I know this is a kludge ;-) )
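The same workaround can be sketched as a small function. This variant is an assumption rather than the code proposed above: it uses gzip -t instead of file to detect compressed data, so it does not depend on the file utility being installed. The name gunzip_if_needed is hypothetical.

```shell
# gunzip_if_needed FILE
# Hypothetical helper: decompress FILE in place only when it holds valid gzip
# data; leave it untouched otherwise.
gunzip_if_needed() {
    local file="$1"

    # gzip -t exits 0 only when its input is intact gzip data.
    if gzip -t < "$file" 2>/dev/null; then
        # gzip -d expects a .gz suffix, so rename first; the result is FILE.
        mv "$file" "$file.gz" && gzip -d "$file.gz"
    fi
}
```

Calling gunzip_if_needed /tmp/pypyUrls.html after the download would leave a plain HTML file in place on both kinds of builder.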

vsafonkin commented 4 years ago

@tkwaks, @DiederickA, I've created a PR to use curl instead of wget; it should solve the problem.

DiederickA commented 4 years ago

That is indeed less of a kludge than the suggested solution, and it partly reverts to the previous setup. Thanks for looking into this and helping out.

vsafonkin commented 4 years ago

@DiederickA , thank you for your help!

vsafonkin commented 4 years ago

@tkwaks, the pull request was merged; please check your build.

DiederickA commented 4 years ago

I'm afraid the curl command needs the --compressed flag, as it had before in pypy.sh:

result="$(curl -4 -s --compressed $uri | grep 'linux64' | awk -v uri="$uri" -F'>|<' '{print uri$5}')"

Testing in bash directly:

Diederick$ curl https://downloads.python.org/pypy/ --compressed -4 -s -o pypyUrls.html
Diederick$ ls -l pypyUrls.html
-rwxrwxrwx 1 died died 167824 Jun 19 21:42 pypyUrls.html
Diederick$ file  pypyUrls.html
pypyUrls.html: HTML document, ASCII text

Diederick$ curl https://downloads.python.org/pypy/ -4 -s -o pypyUrls.html
Diederick$ file  pypyUrls.html
pypyUrls.html: gzip compressed data, from Unix, original size modulo 2^32 167824
Diederick$ ls -l pypyUrls.html
-rwxrwxrwx 1 died died 12513 Jun 19 21:43 pypyUrls.html

vsafonkin commented 4 years ago

@DiederickA , fixed, please check it.

tkwaks commented 4 years ago

Hi @vsafonkin,

Our build is still failing and I believe that I found the reason.

In the file /images/linux/scripts/helpers/install.sh, line 19 reads:

curl $URL -4 -s -compressed -o "$DEST/$NAME"

The code should be:

curl $URL -4 -s --compressed -o "$DEST/$NAME"

Could you please change the code?

DiederickA commented 4 years ago

The builds now fail while fetching the tar.bz2 file.

I believe the issue now is that the curl command needs --compressed to fetch the HTML page, but must run without it to download the tar.bz2 archive.

Diederick$ curl https://downloads.python.org/pypy/pypy2.7-v7.3.1-linux64.tar.bz2 -4 --compressed -o pypy2.7-v7.3.1-linux64.tar.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0 30.4M    0  1371    0     0  19585      0  0:27:09 --:--:--  0:27:09 19869
curl: (61) Unrecognized content encoding type. libcurl understands deflate, gzip, br content encodings.
Diederick$ echo $?
61
Diederick$ curl https://downloads.python.org/pypy/pypy2.7-v7.3.1-linux64.tar.bz2 -4 -o pypy2.7-v7.3.1-linux64.tar.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 30.4M  100 30.4M    0     0  9785k      0  0:00:03  0:00:03 --:--:-- 9788k
Diederick$ echo $?
0
Diederick$ file pypy2.7-v7.3.1-linux64.tar.bz2
pypy2.7-v7.3.1-linux64.tar.bz2: bzip2 compressed data, block size = 900k

I think you should either go back to the original setup (two different downloads) or add an extra argument to the helper function so that extra flags can be passed through.
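The second option could look roughly like the sketch below. It is not the repository's download_with_retries. The function name download_file, the optional fourth argument, and the CURL override variable (there only so the command line can be inspected without network access) are all assumptions made for illustration.

```shell
# download_file URL DEST NAME [compressed]
# Hypothetical helper: fetch URL into DEST/NAME, passing --compressed to curl
# only when the fourth argument is the word "compressed".
download_file() {
    local url="$1" dest="$2" name="$3" mode="${4:-}"

    # Reuse the positional parameters to hold the optional extra flag.
    if [ "$mode" = "compressed" ]; then
        set -- --compressed
    else
        set --
    fi

    # CURL is an assumed override hook; it defaults to the real curl binary.
    ${CURL:-curl} "$url" -4 -s "$@" -o "$dest/$name"
}
```

With such a helper the index page would be fetched as download_file "https://downloads.python.org/pypy/" /tmp pypyUrls.html compressed. The archive download would omit the fourth argument, avoiding the curl error 61 shown above.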

vsafonkin commented 4 years ago

Hi @DiederickA, @tkwaks, the PR with the fix was merged. Could you please check your builds?

DiederickA commented 4 years ago

Thanks @vsafonkin, this now works again. This issue may be closed.