tkwaks closed this issue 4 years ago
Hi @tkwaks , thank you for your report! I've checked this code and it works correctly: https://github.com/vsafonkin/check-pypy-versions/runs/780207913?check_suite_focus=true
We get the HTML content from pypyUrls.html, which contains the download URLs. The cat command sends plain HTML text to standard output, which grep handles correctly.
We still have the same issue: running your code on Ubuntu 18.04, I see it still occurring.
In this commit the code to fetch this data changed from
curl -4 -s --compressed https://downloads.python.org/pypy/
to
wget https://downloads.python.org/pypy/ --output-document=pypy.html --no-verbose
Running these in an Ubuntu 20.04 environment gives different results:
Diederick$ curl -4 -s --compressed https://downloads.python.org/pypy/
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
...
Diederick$ wget https://downloads.python.org/pypy/ --output-document=pypyUrls.html --no-verbose
2020-06-18 08:11:04 URL:https://downloads.python.org/pypy/ [12513/12513] -> "pypyUrls.html" [1]
Diederick$ file pypyUrls.html
pypy.html: gzip compressed data, from Unix, original size modulo 2^32 167824
It seems you get different results. Can this be platform dependent?
@tkwaks , I'm really confused, on GitHub Actions this code works correctly:
wget https://downloads.python.org/pypy/ --output-document=pypyUrls.html --no-verbose
file pypyUrls.html
cat pypyUrls.html
https://github.com/vsafonkin/check-pypy-versions/runs/783432659?check_suite_focus=true
I've tested it on Azure DevOps and it works too. I've tested it on Ubuntu 18.04 and 20.04. Could you please check your wget version? I believe the problem may be in the default settings for different wget versions.
I would expect it to work for you, hence my confusion and my suggestion that it is platform dependent. I am running this in WSL 2, Ubuntu 20.04.
What machine are you using to run the Packer build?
Here is my environment:
Diederick$ wget --version
GNU Wget 1.20.3 built on linux-gnu.
-cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls
+ntlm +opie +psl +ssl/openssl
Wgetrc:
/etc/wgetrc (system)
Locale:
/usr/share/locale
Compile:
gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc"
-DLOCALEDIR="/usr/share/locale" -I. -I../../src -I../lib
-I../../lib -Wdate-time -D_FORTIFY_SOURCE=2 -DHAVE_LIBSSL -DNDEBUG
-g -O2 -fdebug-prefix-map=/build/wget-OYIfr9/wget-1.20.3=.
-fstack-protector-strong -Wformat -Werror=format-security
-DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall
Link:
gcc -DHAVE_LIBSSL -DNDEBUG -g -O2
-fdebug-prefix-map=/build/wget-OYIfr9/wget-1.20.3=.
-fstack-protector-strong -Wformat -Werror=format-security
-DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall -Wl,-Bsymbolic-functions
-Wl,-z,relro -Wl,-z,now -lpcre2-8 -luuid -lidn2 -lssl -lcrypto -lz
-lpsl ftp-opie.o openssl.o http-ntlm.o ../lib/libgnu.a
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Originally written by Hrvoje Niksic <hniksic@xemacs.org>.
Please send bug reports and questions to <bug-wget@gnu.org>.
Diederick$ cat /etc/wgetrc | grep -v "^#" | grep -v "^$"
passive_ftp = on
@DiederickA , could you please try to execute this code?
wget https://downloads.python.org/pypy/ --header 'Accept-encoding: identity' --output-document=pypyUrls.html --no-verbose
file pypyUrls.html
cat pypyUrls.html
I've tested wget 1.19.4 and 1.20.3 and it works.
That does not seem to make a difference:
Diederick$ wget https://downloads.python.org/pypy/ --header 'Accept-encoding: identity' --output-document=pypyUrls.html --no-verbose
2020-06-19 09:15:05 URL:https://downloads.python.org/pypy/ [12513/12513] -> "pypyUrls.html" [1]
Diederick$ file pypyUrls.html
pypyUrls.html: gzip compressed data, from Unix, original size modulo 2^32 167824
Diederick$ more pypyUrls.html
�
u���~�?�yt������������l|~�?����?�����㿎
u501@u501-Virtual-Machine:~$ wget https://downloads.python.org/pypy/ --header 'Accept-encoding: identity' --output-document=pypyUrls.html --no-verbose -d
DEBUG output created by Wget 1.20.3 on linux-gnu.
Reading HSTS entries from /home/u501/.wget-hsts
URI encoding = ‘UTF-8’
Caching downloads.python.org => 151.101.36.175 2a04:4e42:9::175
Created socket 4.
Releasing 0x0000561225b07a10 (new refcount 1).
Initiating SSL handshake.
Handshake successful; connected socket 4 to SSL handle 0x0000561225b07c20
certificate:
  subject: CN=*.c.ssl.fastly.net,O=Fastly\, Inc.,L=San Francisco,ST=California,C=US
  issuer: CN=GlobalSign CloudSSL CA - SHA256 - G3,O=GlobalSign nv-sa,C=BE
X509 certificate successfully verified and matches host downloads.python.org
---request begin---
GET /pypy/ HTTP/1.1
User-Agent: Wget/1.20.3 (linux-gnu)
Accept: */*
Accept-encoding: identity
Host: downloads.python.org
Connection: Keep-Alive
---request end---
---response begin---
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 12513
Server: nginx/1.10.3 (Ubuntu)
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Via: 1.1 varnish
Cache-Control: max-age=365000000, immutable, public
Accept-Ranges: bytes
Date: Fri, 19 Jun 2020 08:08:06 GMT
Via: 1.1 varnish
Age: 2670135
X-Served-By: cache-fra19144-FRA, cache-ams21072-AMS
X-Cache: HIT, HIT
X-Cache-Hits: 1, 1
X-Timer: S1592554087.964050,VS0,VE6
Strict-Transport-Security: max-age=31557600
---response end---
Registered socket 4 for persistent reuse.
Parsed Strict-Transport-Security max-age = 31557600, includeSubDomains = false
Updated HSTS host: downloads.python.org:443 (max-age: 31557600, includeSubdomains: false)
URI content encoding = ‘utf-8’
2020-06-19 10:08:06 URL:https://downloads.python.org/pypy/ [12513/12513] -> "pypyUrls.html" [1]
Saving HSTS entries to /home/u501/.wget-hsts
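The debug output shows a response with Content-Encoding: gzip even though the request asked for identity. A minimal sketch for checking this from a set of response headers follows; the helper name has_gzip_encoding is hypothetical and is not part of the repository scripts.

```shell
# Hypothetical helper (not from the repository): reads HTTP response headers
# on standard input and succeeds if they declare Content-Encoding: gzip.
has_gzip_encoding() {
    # Strip carriage returns first, because header lines end in CRLF.
    tr -d '\r' | grep -i -q '^content-encoding:[[:space:]]*gzip[[:space:]]*$'
}
```

One possible way to feed it live headers, assuming curl is available, is: curl -s -D - -o /dev/null -H 'Accept-Encoding: identity' https://downloads.python.org/pypy/ | has_gzip_encoding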
Hi @vsafonkin,
Could you please add the following code on line 76 (after the download_with_retries) or something similar.
file -s /tmp/pypyUrls.html | grep gzip && mv /tmp/pypyUrls.html /tmp/pypyUrls.html.gz && gunzip /tmp/pypyUrls.html.gz
This will ensure that our build will work again and yours will not break. (I know this is a kludge ;-) )
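A variant of the one-liner above could test the data itself with gzip -t instead of parsing the output of file. This is only a sketch: the function name ensure_plain_text and the temporary file suffix are assumptions, not code from the repository.

```shell
# Hypothetical helper (not from the repository): if the given file holds
# gzip data, replace it with its decompressed content; otherwise leave it.
ensure_plain_text() {
    file_path="$1"
    # gzip -t exits 0 only when its input is a valid gzip stream.
    if gzip -t < "$file_path" 2>/dev/null; then
        gzip -cd < "$file_path" > "$file_path.tmp" &&
            mv "$file_path.tmp" "$file_path"
    fi
}
```

Called as ensure_plain_text /tmp/pypyUrls.html after the download, it would leave an already plain HTML file untouched, so builds that receive uncompressed data should not be affected.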
@tkwaks , @DiederickA , I've created a PR to use curl instead of wget; it should solve the problem.
That is indeed less of a kludge than the suggested solution, and it partly reverts to the previous setup. Thanks for looking into this and helping out.
@DiederickA , thank you for your help!
@tkwaks , the pull request was merged; please check your build.
I'm afraid the curl command needs the --compressed flag, as it had before in pypy.sh:
result="$(curl -4 -s --compressed $uri | grep 'linux64' | awk -v uri="$uri" -F'>|<' '{print uri$5}')"
Testing in bash directly:
Diederick$ curl https://downloads.python.org/pypy/ --compressed -4 -s -o pypyUrls.html
Diederick$ ls -l pypyUrls.html
-rwxrwxrwx 1 died died 167824 Jun 19 21:42 pypyUrls.html
Diederick$ file pypyUrls.html
pypyUrls.html: HTML document, ASCII text
Diederick$ curl https://downloads.python.org/pypy/ -4 -s -o pypyUrls.html
Diederick$ file pypyUrls.html
pypyUrls.html: gzip compressed data, from Unix, original size modulo 2^32 167824
Diederick$ ls -l pypyUrls.html
-rwxrwxrwx 1 died died 12513 Jun 19 21:43 pypyUrls.html
@DiederickA , fixed, please check it.
Hi @vsafonkin,
Our build is still failing and I believe that I found the reason.
In the file /images/linux/scripts/helpers/install.sh, line 19 reads:
curl $URL -4 -s -compressed -o "$DEST/$NAME"
It should be:
curl $URL -4 -s --compressed -o "$DEST/$NAME"
Could you please change the code?
The builds now fail when fetching the tar.bz2 file.
I believe the issue now is that the curl command needs --compressed to fetch the HTML page, but it must run without that flag to download the tar.bz2 file.
Diederick$ curl https://downloads.python.org/pypy/pypy2.7-v7.3.1-linux64.tar.bz2 -4 --compressed -o pypy2.7-v7.3.1-linux64.tar.bz2
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 30.4M 0 1371 0 0 19585 0 0:27:09 --:--:-- 0:27:09 19869
curl: (61) Unrecognized content encoding type. libcurl understands deflate, gzip, br content encodings.
Diederick$ echo $?
61
Diederick$ curl https://downloads.python.org/pypy/pypy2.7-v7.3.1-linux64.tar.bz2 -4 -o pypy2.7-v7.3.1-linux64.tar.bz2
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 30.4M 100 30.4M 0 0 9785k 0 0:00:03 0:00:03 --:--:-- 9788k
Diederick$ echo $?
0
Diederick$ file pypy2.7-v7.3.1-linux64.tar.bz2
pypy2.7-v7.3.1-linux64.tar.bz2: bzip2 compressed data, block size = 900k
I think you should either go back to the original set up (two different downloads) or add an extra argument to the helper function to enable passing on extra flags.
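The second option, an extra argument on the helper, might look roughly like the sketch below. The function name download_sketch, its argument order, the "compressed" switch, and the retry count are all assumptions for illustration; the real helper in install.sh may differ.

```shell
# Hypothetical variant of the download helper (names and arguments are
# assumptions). Usage: download_sketch URL DEST NAME [compressed]
download_sketch() {
    url="$1"; dest="$2"; name="$3"; mode="${4:-raw}"
    # Reuse the positional parameters to hold the optional extra flag.
    if [ "$mode" = "compressed" ]; then
        set -- --compressed
    else
        set --
    fi
    attempt=1
    while [ "$attempt" -le 3 ]; do
        # "$@" expands to --compressed or to nothing at all.
        if curl "$url" -4 -s "$@" -o "$dest/$name"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep 1
    done
    return 1
}
```

With this shape, the HTML index could be fetched with the fourth argument set to compressed, while the tar.bz2 download would omit it and so avoid the curl error 61 shown above.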
Hi @DiederickA , @tkwaks , the PR with the fix was merged. Could you please check your builds?
Thanks @vsafonkin, this now works again. This call may be closed.
Hi,
Our build has been failing since the last commit, 330e62a, to this file: /virtual-environments/images/linux/scripts/installers/pypy.sh
After some investigation, I found a problem on line 76.
The pypy.sh install script has an issue downloading the https://downloads.python.org/pypy/ page: the server responds with gzip-compressed data.
On line 76 the code states:
pypyVersions="$(cat /tmp/pypyUrls.html | grep 'linux64' | awk -v uri="$uri" -F'>|<' '{print uri$5}')"
The grep command receives the gzip-compressed data on standard input from the standard output of cat. This cannot work, because grep cannot find the pattern linux64 in compressed input.
Line 76 should be changed to decompress /tmp/pypyUrls.html and pipe the result to grep, like:
pypyVersions="$(gzip -cd /tmp/pypyUrls.html | grep 'linux64' | awk -v uri="$uri" -F'>|<' '{print uri$5}')"
I hope this can be fixed, as it breaks our releases of new build agents.
Best regards, Thomas Kwaks
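The behaviour described in this report can be reproduced locally without touching the server. The sketch below builds a small gzip file from assumed sample rows (the real page layout may differ) and runs the proposed gzip -cd pipeline on it; the variable names page and fixed are illustrative only.

```shell
uri="https://downloads.python.org/pypy/"
page="$(mktemp)"

# Assumed sample rows, shaped so that awk's fifth field is the file name.
for version in 2.7 3.6; do
    name="pypy${version}-v7.3.1-linux64.tar.bz2"
    printf '<td><a href="%s">%s</a></td>\n' "$name" "$name"
done | gzip -c > "$page"

# Counting matches in the raw gzip bytes; the text is not expected to be
# found there, which is the failure mode reported above.
cat "$page" | grep -c 'linux64' || true

# Decompressing first lets grep and awk see the HTML text.
fixed="$(gzip -cd "$page" | grep 'linux64' | awk -v uri="$uri" -F'>|<' '{print uri$5}')"
printf '%s\n' "$fixed"
rm -f "$page"
```

The awk call splits each line on < and >, so for rows of this assumed shape the fifth field is the link text, which is then prefixed with the base URL.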