rockdaboot / libpsl

C library for the Public Suffix List
https://rockdaboot.github.io/libpsl
MIT License

libpsl build from source code fails due to missing public_suffix_list.dat, and after downloading it the build fails with File "/usr/lib64/python3.6/encodings/idna.py", line 165, in encode raise UnicodeError("label too long") #247

Open phil2k opened 2 months ago

phil2k commented 2 months ago

Hi, I tried to build libpsl in a UBI8 container and ran into a lot of issues, starting with dependencies:

  1. during ./autogen.sh (or autoreconf -fi), the autopoint binary is missing (it should be part of the gettext package, but on UBI8 it is not in that package, and there is no gettext-devel alternative) => I had to download the gettext source package and build it too :( 👎
  2. Python was requested as a dependency => I installed the python3 package (python3.6);
  3. after finally getting ./autogen.sh and ./configure to work, the following error appeared at the make command: 👎
    make[2]: Entering directory '/usr/src/libpsl.0/src'
    make[2]: *** No rule to make target '../list/public_suffix_list.dat', needed by 'suffixes_dafsa.h'.  Stop.
    make[2]: Leaving directory '/usr/src/libpsl.0/src'
    make[1]: *** [Makefile:537: all-recursive] Error 1
    make[1]: Leaving directory '/usr/src/libpsl.0'
    make: *** [Makefile:446: all] Error 2

Then I looked into the Makefile and noticed this declaration: PSL_FILE = $(top_srcdir)/list/public_suffix_list.dat. Checking the ./list/ folder, I saw that it contained no public_suffix_list.dat.

OK, I read the README.md again and noticed that I have to download that file from https://github.com/publicsuffix/list/blob/master/public_suffix_list.dat , which I did by running the following command: curl -sSL https://github.com/publicsuffix/list/blob/master/public_suffix_list.dat > ./list/public_suffix_list.dat but then the make command failed with: 👎

/usr/bin/python3 ./psl-make-dafsa --output-format=cxx+ "../list/public_suffix_list.dat" suffixes_dafsa.h
Traceback (most recent call last):
  File "/usr/lib64/python3.6/encodings/idna.py", line 165, in encode
    raise UnicodeError("label too long")
UnicodeError: label too long

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./psl-make-dafsa", line 692, in <module>
    sys.exit(main())
  File "./psl-make-dafsa", line 686, in main
    outfile.write(converter(parser(infile, utf_mode, codecs), utf_mode, codecs))
  File "./psl-make-dafsa", line 591, in parse_psl
    punycode = line.decode('utf-8').encode('idna')
UnicodeError: encoding with 'idna' codec failed (UnicodeError: label too long)
make[2]: *** [Makefile:854: suffixes_dafsa.h] Error 1
make[2]: Leaving directory '/usr/src/libpsl.0/src'
make[1]: *** [Makefile:537: all-recursive] Error 1
make[1]: Leaving directory '/usr/src/libpsl.0'
make: *** [Makefile:446: all] Error 2

OK, I said, let's do what's written in README.md and run the command: ./src/psl-make-dafsa --output-format=binary list/public_suffix_list.dat psl.dafsa , but it failed with the same error as above: 👎

Traceback (most recent call last):
  File "/usr/lib64/python3.6/encodings/idna.py", line 165, in encode
    raise UnicodeError("label too long")
UnicodeError: label too long

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "src/psl-make-dafsa", line 692, in <module>
    sys.exit(main())
  File "src/psl-make-dafsa", line 686, in main
    outfile.write(converter(parser(infile, utf_mode, codecs), utf_mode, codecs))
  File "src/psl-make-dafsa", line 591, in parse_psl
    punycode = line.decode('utf-8').encode('idna')
UnicodeError: encoding with 'idna' codec failed (UnicodeError: label too long)

OK, I thought, maybe the script src/psl-make-dafsa is quite outdated (I saw "Copyright 2014 The Chromium Authors..." mentioned, but nothing else such as a version or a release date). 👎

I searched the Internet for a newer version of this script and found this package for EL8 (which should be compatible with UBI8): https://yum.oracle.com/repo/OracleLinux/OL8/distro/builder/x86_64/getPackage/psl-make-dafsa-0.20.2-6.el8.x86_64.rpm , but after I installed it and ran the command psl-make-dafsa --output-format=binary list/public_suffix_list.dat psl.dafsa , I got the exact same error: 👎

Traceback (most recent call last):
  File "/usr/lib64/python3.6/encodings/idna.py", line 165, in encode
    raise UnicodeError("label too long")
UnicodeError: label too long

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/bin/psl-make-dafsa", line 693, in <module>
    sys.exit(main())
  File "/usr/bin/psl-make-dafsa", line 687, in main
    outfile.write(converter(parser(infile, utf_mode, codecs), utf_mode, codecs))
  File "/usr/bin/psl-make-dafsa", line 592, in parse_psl
    punycode = line.decode('utf-8').encode('idna')
UnicodeError: encoding with 'idna' codec failed (UnicodeError: label too long)

But in case I'm doing something wrong, could you please guide me through compiling and installing this library from the source code?

PS: Everything started when I wanted to build the latest version of curl & libcurl (8.10.1) from source, which now requires libpsl (it looks like it is no longer optional, as it was in earlier versions such as 8.7.0).

eli-schwartz commented 2 months ago

I tried to build libpsl in a UBI8 container and ran into a lot of issues, starting with dependencies:

  • during ./autogen.sh (or autoreconf -fi), the autopoint binary is missing (it should be part of the gettext package, but on UBI8 it is not in that package, and there is no gettext-devel alternative) => I had to download the gettext source package and build it too :( 👎

You should NOT be running autogen.sh or autoreconf -fi unless you are prepared to chase down weird dependencies that only matter to libpsl project maintainers. With the release tarball at https://github.com/rockdaboot/libpsl/releases/tag/0.21.5 you do not need to run autogen.sh or autoreconf -fi.

  • Python was requested as a dependency => I installed the python3 package (python3.6);

Sure, this sounds like it was easy for you to handle.

  • after finally getting ./autogen.sh and ./configure to work, the following error appeared at the make command: 👎

The make error occurred because you used a git clone but did not follow the complete instructions in README.md:

Building from git

Download project with

git clone --recursive https://github.com/rockdaboot/libpsl

The "recursive" is important, as it correctly downloads the public suffix list for you.

OK, I read the README.md again and noticed that I have to download that file from https://github.com/publicsuffix/list/blob/master/public_suffix_list.dat , which I did by running the following command: curl -sSL https://github.com/publicsuffix/list/blob/master/public_suffix_list.dat > ./list/public_suffix_list.dat

... but yes, you should be able to download a brand new copy with curl. However, please note that GitHub itself is offering you an HTML file, not a public suffix list. The HTML file contains the GitHub UI with an embeddable syntax-highlighted and line-delimited GUI view of the contents of the actual public suffix list.
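A likely reason the HTML page ends in "label too long": Python's built-in idna codec rejects any label of 64 or more characters, and lines of HTML markup easily exceed that, while real suffix rules are short. Below is a minimal sketch of that behaviour using only the standard library. The helper name idna_error is made up for illustration, and the idea that psl-make-dafsa tripped over an over-long HTML line is inferred from the traceback, not verified against the script.

```python
def idna_error(line):
    """Return the exception raised by the stdlib 'idna' codec, or None if the line encodes."""
    try:
        line.encode("idna")
    except UnicodeError as exc:
        return exc
    return None


short_rule = "co.uk"          # typical public suffix rule: short labels
long_label = "x" * 64         # stands in for a long line of HTML markup

for text in (short_rule, long_label):
    err = idna_error(text)
    status = "encodes fine" if err is None else "rejected by the idna codec"
    print(f"{len(text):3d} chars: {status}")
```

A single label of 63 characters still encodes; 64 is the first length that fails.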

To download a file from GitHub, please visit it in your browser and locate the "Raw" or "download raw file" buttons in the top right corner of the GitHub UI. It will offer you a link with the word "blob" in the URL replaced by "raw" -- that link is a valid download link for the raw file instead of useless HTML.
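The "blob" to "raw" substitution can also be done programmatically. Here is a small sketch, assuming the usual /<owner>/<repo>/blob/<ref>/<path> URL layout; the function name blob_to_raw_url is hypothetical and not part of libpsl or any GitHub tooling.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_to_raw_url(url):
    """Rewrite a GitHub 'blob' page URL into the matching 'raw' download URL."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # Assumed layout: ['', owner, repo, 'blob', ref, path...]
    if len(segments) < 6 or segments[3] != "blob":
        raise ValueError(f"not a GitHub blob URL: {url}")
    segments[3] = "raw"
    return urlunsplit(parts._replace(path="/".join(segments)))


page = "https://github.com/publicsuffix/list/blob/master/public_suffix_list.dat"
print(blob_to_raw_url(page))
# prints https://github.com/publicsuffix/list/raw/master/public_suffix_list.dat
```

Only the fourth path segment is replaced, so a repository or file that happens to have "blob" in its name is left alone. The resulting URL can then be passed to curl -sSL in place of the original one.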

(libpsl doesn't suggest using curl on that url, it simply links you to the webpage for the public suffix list.)

But in case I'm doing something wrong, could you please guide me through compiling and installing this library from the source code?

Ultimately the best way is to use the official release tarball of libpsl, not the raw development sources. Although with a bit of effort (installing dependencies like gettext, manually acquiring the public suffix list) you can use the raw development sources too. :)
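If you do acquire the list manually, a quick sanity check before running make can catch both problems from this thread (a missing file, or an HTML page saved in its place). This is only a sketch: check_psl_file is a hypothetical helper, not something libpsl ships, and it relies on the list format's convention that comments start with "//" and every other non-blank line is a rule.

```python
import tempfile
from pathlib import Path


def check_psl_file(path):
    """Classify a candidate list file as 'missing', 'html', 'empty' or 'ok'."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    with p.open(encoding="utf-8", errors="replace") as fh:
        for line in fh:
            rule = line.strip()
            if not rule or rule.startswith("//"):
                continue  # blank lines and comments carry no rule
            # The first real line decides: markup means we saved a web page.
            return "html" if rule.startswith("<") else "ok"
    return "empty"


# Demo on throwaway files; in a real checkout the path would be
# list/public_suffix_list.dat.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.dat"
    good.write_text("// ===BEGIN ICANN DOMAINS===\ncom\n*.ck\n!www.ck\n", encoding="utf-8")
    bad = Path(tmp) / "bad.dat"
    bad.write_text("<!DOCTYPE html>\n<html>\n", encoding="utf-8")
    results = {
        "good": check_psl_file(good),
        "bad": check_psl_file(bad),
        "absent": check_psl_file(Path(tmp) / "absent.dat"),
    }

print(results)
```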