ArchiveTeam / grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

Can't build lxml.etree (on macOS) #174

Closed bknowles closed 3 years ago

bknowles commented 3 years ago

In trying to install grab-site according to the instructions at https://github.com/ArchiveTeam/grab-site#install-on-macos I ran into a problem in the final step. Specifically, this error:

    building 'lxml.etree' extension
    creating build/temp.macosx-10.14.6-x86_64-3.8
    creating build/temp.macosx-10.14.6-x86_64-3.8/src
    creating build/temp.macosx-10.14.6-x86_64-3.8/src/lxml
    clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -iwithsysroot/System/Library/Frameworks/System.framework/PrivateHeaders -iwithsysroot/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.8/Headers -arch arm64 -arch x86_64 -DCYTHON_CLINE_IN_TRACEBACK=0 -I/usr/local/Cellar/libxml2/2.9.10_2/include/libxml2 -Isrc -Isrc/lxml/includes -I/Users/bradmin/gs-venv/include -I/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8 -c src/lxml/etree.c -o build/temp.macosx-10.14.6-x86_64-3.8/src/lxml/etree.o -w -flat_namespace
    In file included from src/lxml/etree.c:97:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:11:
    In file included from /Library/Developer/CommandLineTools/usr/lib/clang/12.0.0/include/limits.h:21:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/limits.h:63:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/cdefs.h:807:2: error: Unsupported architecture
    #error Unsupported architecture
     ^
    In file included from src/lxml/etree.c:97:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:11:
    In file included from /Library/Developer/CommandLineTools/usr/lib/clang/12.0.0/include/limits.h:21:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/limits.h:64:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/machine/limits.h:8:2: error: architecture not supported
    #error architecture not supported
     ^
    In file included from src/lxml/etree.c:97:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:71:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_types.h:27:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:33:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/machine/_types.h:34:2: error: architecture not supported
    #error architecture not supported
     ^
    In file included from src/lxml/etree.c:97:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:71:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_types.h:27:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:55:9: error: unknown type name '__int64_t'
    typedef __int64_t       __darwin_blkcnt_t;      /* total blocks */
            ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:56:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
    typedef __int32_t       __darwin_blksize_t;     /* preferred block size */
            ^
    note: '__int128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:57:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
    typedef __int32_t       __darwin_dev_t;         /* dev_t */
            ^
    note: '__int128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:60:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_gid_t;         /* [???] process and group IDs */
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:61:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_id_t;          /* [XSI] pid_t, uid_t, or gid_t*/
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:62:9: error: unknown type name '__uint64_t'
    typedef __uint64_t      __darwin_ino64_t;       /* [???] Used for 64 bit inodes */
            ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:68:9: error: unknown type name '__darwin_natural_t'
    typedef __darwin_natural_t __darwin_mach_port_name_t; /* Used by mach */
            ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:70:9: error: unknown type name '__uint16_t'; did you mean '__uint128_t'?
    typedef __uint16_t      __darwin_mode_t;        /* [???] Some file attributes */
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:71:9: error: unknown type name '__int64_t'
    typedef __int64_t       __darwin_off_t;         /* [???] Used for file sizes */
            ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:72:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
    typedef __int32_t       __darwin_pid_t;         /* [???] process and group IDs */
            ^
    note: '__int128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:73:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_sigset_t;      /* [???] signal set */
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:74:9: error: unknown type name '__int32_t'; did you mean '__int128_t'?
    typedef __int32_t       __darwin_suseconds_t;   /* [???] microseconds */
            ^
    note: '__int128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:75:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_uid_t;         /* [???] user IDs */
            ^
    note: '__uint128_t' declared here
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types.h:76:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_useconds_t;    /* [???] microseconds */
            ^
    note: '__uint128_t' declared here
    In file included from src/lxml/etree.c:97:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:71:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_types.h:43:9: error: unknown type name '__uint32_t'; did you mean '__uint128_t'?
    typedef __uint32_t      __darwin_wctype_t;
            ^
    note: '__uint128_t' declared here
    In file included from src/lxml/etree.c:97:
    In file included from /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/include/python3.8/Python.h:25:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/stdio.h:64:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/_stdio.h:75:
    In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/sys/_types/_va_list.h:31:
    /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/machine/types.h:37:2: error: architecture not supported
    #error architecture not supported
     ^
    fatal error: too many errors emitted, stopping now [-ferror-limit=]
    20 errors generated.
    Compile failed: command 'clang' failed with exit status 1
    creating var
    creating var/folders
    creating var/folders/6k
    creating var/folders/6k/h2vthrl93fq30l02dgxbkqd40000gn
    creating var/folders/6k/h2vthrl93fq30l02dgxbkqd40000gn/T
    cc -I/usr/local/Cellar/libxml2/2.9.10_2/include/libxml2 -I/usr/include/libxml2 -c /var/folders/6k/h2vthrl93fq30l02dgxbkqd40000gn/T/xmlXPathInit2n_1f1mg.c -o var/folders/6k/h2vthrl93fq30l02dgxbkqd40000gn/T/xmlXPathInit2n_1f1mg.o
    cc var/folders/6k/h2vthrl93fq30l02dgxbkqd40000gn/T/xmlXPathInit2n_1f1mg.o -L/usr/local/Cellar/libxml2/2.9.10_2/lib -lxml2 -o a.out
    ld: warning: dylib (/usr/local/Cellar/libxml2/2.9.10_2/lib/libxml2.dylib) was built for newer macOS version (10.15) than being linked (10.14.6)
    error: command 'clang' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /Users/bradmin/gs-venv/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/6k/h2vthrl93fq30l02dgxbkqd40000gn/T/pip-install-y00_ddiy/lxml/setup.py'"'"'; __file__='"'"'/private/var/folders/6k/h2vthrl93fq30l02dgxbkqd40000gn/T/pip-install-y00_ddiy/lxml/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/6k/h2vthrl93fq30l02dgxbkqd40000gn/T/pip-record-j3ana_h3/install-record.txt --single-version-externally-managed --compile --install-headers /Users/bradmin/gs-venv/include/site/python3.8/lxml Check the logs for full command output.

And now I'm stuck. I'm not sure what I can do as a next step towards installing this software.
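
One detail in the log above that may be worth checking: the clang command passes both -arch arm64 and -arch x86_64, while every failing header comes from MacOSX10.15.sdk. That SDK predates arm64 Macs, so the arm64 half of a universal build is a plausible, though unconfirmed, source of the "Unsupported architecture" errors. Below is a minimal Python sketch for pulling the -arch flags out of a compile line. The name arch_flags is a hypothetical helper, not part of grab-site or lxml, and the command string is a shortened excerpt of the one in the log.

```python
import shlex


def arch_flags(command):
    """Return the architectures named by -arch flags in a compiler command line."""
    tokens = shlex.split(command)
    # Each "-arch" token is followed by the architecture name it selects.
    return [tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == "-arch"]


# Shortened excerpt of the clang invocation from the log (paths trimmed for illustration).
cmd = "clang -O3 -Wall -arch arm64 -arch x86_64 -c src/lxml/etree.c -o etree.o"
print(arch_flags(cmd))
```

If arm64 appears in the result, retrying the install with ARCHFLAGS="-arch x86_64" set in the environment is a commonly suggested workaround for this class of error. It is untested for this particular setup.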

ivan commented 3 years ago

Thanks for the report. I am not sure what is going wrong with that build, but grab-site would not work on Python 3.8 anyway because it has some dependencies, like namedlist, that are stuck on Python 3.7.

I believe that only the Nix-based install process works on macOS at the moment. I just tested it on macOS 11 and it seemed to work, though for the Nix install I had to run:

    sh <(curl -L https://nixos.org/nix/install) --darwin-use-unencrypted-nix-store-volume

After it installed I added this to ~/.zshrc to get nix-env in the PATH:

    . /Users/USERNAME/.nix-profile/etc/profile.d/nix.sh

(followed by exec zsh at the prompt to restart zsh)

Then nix-env should be available to install grab-site as per the README.

bknowles commented 3 years ago

Hmm. So, it sounds like I need to uninstall Python 3.9 and then reinstall Python, making sure to force it to version 3.7.

Is there a way to require Python 3.7 as a prerequisite for grab-site, so that this sort of thing happens automatically?
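
On the question of enforcing the interpreter version: setuptools offers a python_requires argument that lets pip refuse to install on an unsupported interpreter, though this thread does not show whether grab-site's setup.py sets it. Independently of packaging, a small guard can be run before installing. The following is a minimal sketch, assuming (per the comment above about namedlist) that only Python 3.7 is supported. The names SUPPORTED and is_supported are hypothetical and are not taken from grab-site.

```python
import sys

# Assumption from this thread: dependencies such as namedlist only work on Python 3.7.
SUPPORTED = (3, 7)


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor version matches the supported release."""
    return tuple(version_info[:2]) == SUPPORTED


# Report a mismatch for the interpreter running this script.
if not is_supported():
    print("Python %d.%d found; this thread suggests grab-site needs 3.7"
          % sys.version_info[:2])
```

Running this with the virtualenv's python3 before pip install would flag a 3.8 or 3.9 interpreter early, before any C extension build starts.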

brandongalbraith commented 3 years ago

@bknowles Is running grab-site in a container an option for your use case? It might be more straightforward than managing environments and Python versions.

bknowles commented 3 years ago

I’ve done containers on AWS ECS before, but I haven’t done much with containers on macOS. I would be happy to give that a try, however.

bknowles commented 3 years ago

I do recall that the Mac Mini has 64GB of RAM, so a container-based solution may actually work well.

If you can provide a link to the container-based solution, I’ll take a look at implementing it.

Thanks!

bknowles commented 3 years ago

@brandongalbraith — I looked at all the specified installation instructions on the main page of this repo, and I didn’t find anything that looked like it was specific for a container-based installation. Am I missing something here?

brandongalbraith commented 3 years ago

@bknowles See if the Dockerfile provided in https://github.com/ArchiveTeam/grab-site/issues/159#issuecomment-675540463 works for you. It doesn't look like it ever made it into a proper /Dockerfile for the project, but I'll try to make some time to wrap up @raspher's contribution.

ivan commented 3 years ago

@bknowles I updated both the Homebrew and Nix install instructions at https://github.com/ArchiveTeam/grab-site#install-on-macos. They worked for me on macOS 11; please let me know if they work for you on your macOS.

ivan commented 3 years ago

Let me know if I need to reopen.