polm / fugashi

A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.
MIT License

Update Installation Instructions for Apple Silicon MacOS? #43

Closed victorneo closed 2 years ago

victorneo commented 2 years ago

I am installing fugashi on an M1 MacBook Air, and I had to install MeCab manually through Homebrew before I was able to install fugashi. The error was "fugashi/fugashi.c:618:10: fatal error: 'mecab.h' file not found".

My suggestion would be to update the instructions to say that M1 MacBook users should install MeCab manually first. I am happy to work on a small PR to update the instructions, but wanted to check with you first whether that is necessary or whether you have another approach in mind.

polm commented 2 years ago

You should not have to install MeCab prior to using fugashi, even on an M1, and I would not recommend it in general.

If you look at #28, around that time pip was upgraded so that it should be able to use x86_64 wheels on M1 machines using the binary translation feature. If you're getting an error about mecab.h it means it's trying to install from source, which is possible but not what most users should have to do.

What version of pip are you using?
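
For anyone following along, the wheel-versus-source distinction above can be checked roughly by hand. The sketch below is only an illustration, not how pip actually resolves wheels (pip compares full compatibility tags): it pulls the platform tag out of a wheel filename and checks whether it mentions the local architecture. The function names and the example filename are hypothetical.

```python
import platform


def wheel_platform_tag(filename: str) -> str:
    """Return the platform tag, i.e. the last dash-separated field of a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    return filename[: -len(".whl")].split("-")[-1]


def likely_source_build(wheel_filenames, machine: str) -> bool:
    """Rough heuristic: True if no wheel is pure-Python ('any') or names this architecture."""
    for name in wheel_filenames:
        tag = wheel_platform_tag(name)
        if tag == "any" or machine in tag:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical filename, used only to illustrate the naming scheme.
    wheels = ["example-1.0-cp38-cp38-manylinux2014_x86_64.whl"]
    machine = platform.machine()  # e.g. "aarch64" inside a Linux container on an M1
    print(machine, "-> source build likely:", likely_source_build(wheels, machine))
```

If this reports that a source build is likely, pip will need a compiler and the MeCab headers, which is where the `mecab.h` error comes from.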

victorneo commented 2 years ago

I was using pip 21.3 when I was installing fugashi yesterday, and I did see the PR you mentioned. It gave me the hint that I probably had to install mecab first, which finally allowed me to install fugashi.

I'm running a long parsing script at the moment, so I can't uninstall MeCab to retry the installation. I think we can close this for now, and I (or someone else) will report back if we run into issues installing via pip again.

polm commented 2 years ago

Let's leave this open for now. pip 21.3 should be recent enough to have the required changes. I don't have any Apple hardware but I'll ask someone I know to test it, because I'm pretty sure this should just work.

indrasvat commented 2 years ago

I could not install fugashi in a Docker image on my Apple M1.

Environment:

$ docker version
Client: Docker Engine - Community
 Version:           20.10.11
 API version:       1.41
 Go version:        go1.17.2
 Git commit:        dea9396e18
 Built:             Wed Nov 17 23:49:46 2021
 OS/Arch:           darwin/arm64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.11
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.9
  Git commit:       847da18
  Built:            Thu Nov 18 00:34:44 2021
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

requirements.txt:

fugashi>=1.1.0

Dockerfile:

FROM python:3.8.6 as builder

WORKDIR /app

COPY requirements.txt .

RUN python -m pip install --upgrade pip && python -V && pip -V && pip install -q -r requirements.txt

Build Log:

$ BUILDKIT_PROGRESS=plain docker build -t local-fugashi .
#1 [internal] load build definition from Dockerfile
#1 sha256:aab3dd1bdf4576345fa91c816ec6d8d1225a175d81bbc0ab91a2e630d8452138
#1 transferring dockerfile: 220B 0.0s done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 sha256:8f6ea821310c117641429fd2586f180484f5aad32fa3de0a60967c947ac10e98
#2 transferring context: 34B done
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/library/python:3.8.6
#3 sha256:1e9c67d3ae3f8110b5f423cb30f0c47ca2b85fed3becca49e550ef7c3c9f7f42
#3 DONE 0.0s

#4 [1/4] FROM docker.io/library/python:3.8.6
#4 sha256:0547c1628658b75275e88627f7a54bbf9240e750db43fa0c1769a5c239b14c47
#4 DONE 0.0s

#6 [internal] load build context
#6 sha256:9bd888a2e75c98f26c93a43f0302925ded569c1cf870807691c4ba93199289f3
#6 transferring context: 237B 0.0s done
#6 DONE 0.0s

#5 [2/4] WORKDIR /app
#5 sha256:6cb04f76923c75092b8a702607f643c8cccd07a3a234a57faaae6bc55268d034
#5 CACHED

#7 [3/4] COPY requirements.txt .
#7 sha256:fc70b61a11d0ea16fe60fcaf93affa12310b59f5dbc9e42b7914864508709724
#7 DONE 0.0s

#8 [4/4] RUN python -m pip install --upgrade pip && python -V && pip -V && pip install -q -r requirements.txt
#8 sha256:16ffb60d1148aa9a1d3f3e07964a00d592df5c533a28ed00d93c5f6151f0cb11
#8 0.915 Requirement already satisfied: pip in /usr/local/lib/python3.8/site-packages (20.3.3)
#8 1.461 Collecting pip
#8 1.900   Downloading pip-21.3.1-py3-none-any.whl (1.7 MB)
#8 3.871 Installing collected packages: pip
#8 3.871   Attempting uninstall: pip
#8 3.872     Found existing installation: pip 20.3.3
#8 3.937     Uninstalling pip-20.3.3:
#8 4.028       Successfully uninstalled pip-20.3.3
#8 4.517 Successfully installed pip-21.3.1

#8 4.916 Python 3.8.6
#8 5.043 pip 21.3.1 from /usr/local/lib/python3.8/site-packages/pip (python 3.8)

#8 36.80   ERROR: Command errored out with exit status 1:
#8 36.80    command: /usr/local/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-rq9n5i5f/fugashi_d9a80464e7ff486bb2be32072a730490/setup.py'"'"'; __file__='"'"'/tmp/pip-install-rq9n5i5f/fugashi_d9a80464e7ff486bb2be32072a730490/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-0cfmipir
#8 36.80        cwd: /tmp/pip-install-rq9n5i5f/fugashi_d9a80464e7ff486bb2be32072a730490/
#8 36.80   Complete output (89 lines):
#8 36.80   Reading package lists...
#8 36.80   Building dependency tree...
#8 36.80   Reading state information...
#8 36.80   E: Unable to locate package libmecab-dev
#8 36.80   E: Unable to locate package libmecab-dev
#8 36.80   E: Unable to locate package libmecab2
#8 36.80   fatal: destination path 'mecab' already exists and is not an empty directory.
#8 36.80   checking for a BSD-compatible install... /usr/bin/install -c
#8 36.80   checking whether build environment is sane... yes
#8 36.80   checking for a thread-safe mkdir -p... /bin/mkdir -p
#8 36.80   checking for gawk... no
#8 36.80   checking for mawk... mawk
#8 36.80   checking whether make sets $(MAKE)... yes
#8 36.80   checking for gcc... gcc
#8 36.80   checking whether the C compiler works... yes
#8 36.80   checking for C compiler default output file name... a.out
#8 36.80   checking for suffix of executables...
#8 36.80   checking whether we are cross compiling... no
#8 36.80   checking for suffix of object files... o
#8 36.80   checking whether we are using the GNU C compiler... yes
#8 36.80   checking whether gcc accepts -g... yes
#8 36.80   checking for gcc option to accept ISO C89... none needed
#8 36.80   checking for style of include used by make... GNU
#8 36.80   checking dependency style of gcc... none
#8 36.80   checking for g++... g++
#8 36.80   checking whether we are using the GNU C++ compiler... yes
#8 36.80   checking whether g++ accepts -g... yes
#8 36.80   checking dependency style of g++... none
#8 36.80   checking how to run the C preprocessor... gcc -E
#8 36.80   checking for grep that handles long lines and -e... /bin/grep
#8 36.80   checking for egrep... /bin/grep -E
#8 36.80   checking whether gcc needs -traditional... no
#8 36.80   checking whether make sets $(MAKE)... (cached) yes
#8 36.80   checking build system type... ./config.guess: unable to guess system type
#8 36.80   
#8 36.80   This script, last modified 2011-05-11, has failed to recognize
#8 36.80   the operating system you are using. It is advised that you
#8 36.80   download the most up to date version of the config scripts from
#8 36.80   
#8 36.80     http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD
#8 36.80   and
#8 36.80     http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD
#8 36.80   
#8 36.80   If the version you run (./config.guess) is already up to date, please
#8 36.80   send the following data and any information you think might be
#8 36.80   pertinent to <config-patches@gnu.org> in order to provide the needed
#8 36.80   information to handle your system.
#8 36.80   
#8 36.80   config.guess timestamp = 2011-05-11
#8 36.80   
#8 36.80   uname -m = aarch64
#8 36.80   uname -r = 5.10.76-linuxkit
#8 36.80   uname -s = Linux
#8 36.80   uname -v = #1 SMP PREEMPT Mon Nov 8 11:22:26 UTC 2021
#8 36.80   
#8 36.80   /usr/bin/uname -p =
#8 36.80   /bin/uname -X     =
#8 36.80   
#8 36.80   hostinfo               =
#8 36.80   /bin/universe          =
#8 36.80   /usr/bin/arch -k       =
#8 36.80   /bin/arch              =
#8 36.80   /usr/bin/oslevel       =
#8 36.80   /usr/convex/getsysinfo =
#8 36.80   
#8 36.80   UNAME_MACHINE = aarch64
#8 36.80   UNAME_RELEASE = 5.10.76-linuxkit
#8 36.80   UNAME_SYSTEM  = Linux
#8 36.80   UNAME_VERSION = #1 SMP PREEMPT Mon Nov 8 11:22:26 UTC 2021
#8 36.80   configure: error: cannot guess build type; you must specify one
#8 36.80   make: *** No rule to make target 'libmecab.la'.  Stop.
#8 36.80   running bdist_wheel
#8 36.80   running build
#8 36.80   running build_py
#8 36.80   creating build/lib.linux-aarch64-3.8
#8 36.80   creating build/lib.linux-aarch64-3.8/fugashi
#8 36.80   copying fugashi/__init__.py -> build/lib.linux-aarch64-3.8/fugashi
#8 36.80   copying fugashi/cli.py -> build/lib.linux-aarch64-3.8/fugashi
#8 36.80   running build_ext
#8 36.80   cythoning fugashi/fugashi.pyx to fugashi/fugashi.c
#8 36.80   building 'fugashi.fugashi' extension
#8 36.80   creating build/temp.linux-aarch64-3.8
#8 36.80   creating build/temp.linux-aarch64-3.8/fugashi
#8 36.80   gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/include/python3.8 -c fugashi/fugashi.c -o build/temp.linux-aarch64-3.8/fugashi/fugashi.o
#8 36.80   fugashi/fugashi.c:679:10: fatal error: mecab.h: No such file or directory
#8 36.80    #include "mecab.h"
#8 36.80             ^~~~~~~~~
#8 36.80   compilation terminated.
#8 36.80   error: command 'gcc' failed with exit status 1
#8 36.80   ----------------------------------------
#8 36.80   ERROR: Failed building wheel for fugashi
#8 50.49     ERROR: Command errored out with exit status 1:
#8 50.49      command: /usr/local/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-rq9n5i5f/fugashi_d9a80464e7ff486bb2be32072a730490/setup.py'"'"'; __file__='"'"'/tmp/pip-install-rq9n5i5f/fugashi_d9a80464e7ff486bb2be32072a730490/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-7ynrytrt/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.8/fugashi
#8 50.49          cwd: /tmp/pip-install-rq9n5i5f/fugashi_d9a80464e7ff486bb2be32072a730490/
#8 50.49     Complete output (89 lines):
#8 50.49     Reading package lists...
#8 50.49     Building dependency tree...
#8 50.49     Reading state information...
#8 50.49     E: Unable to locate package libmecab-dev
#8 50.49     E: Unable to locate package libmecab-dev
#8 50.49     E: Unable to locate package libmecab2
#8 50.49     fatal: destination path 'mecab' already exists and is not an empty directory.
#8 50.49     checking for a BSD-compatible install... /usr/bin/install -c
#8 50.49     checking whether build environment is sane... yes
#8 50.49     checking for a thread-safe mkdir -p... /bin/mkdir -p
#8 50.49     checking for gawk... no
#8 50.49     checking for mawk... mawk
#8 50.49     checking whether make sets $(MAKE)... yes
#8 50.49     checking for gcc... gcc
#8 50.49     checking whether the C compiler works... yes
#8 50.49     checking for C compiler default output file name... a.out
#8 50.49     checking for suffix of executables...
#8 50.49     checking whether we are cross compiling... no
#8 50.49     checking for suffix of object files... o
#8 50.49     checking whether we are using the GNU C compiler... yes
#8 50.49     checking whether gcc accepts -g... yes
#8 50.49     checking for gcc option to accept ISO C89... none needed
#8 50.49     checking for style of include used by make... GNU
#8 50.49     checking dependency style of gcc... none
#8 50.49     checking for g++... g++
#8 50.49     checking whether we are using the GNU C++ compiler... yes
#8 50.49     checking whether g++ accepts -g... yes
#8 50.49     checking dependency style of g++... none
#8 50.49     checking how to run the C preprocessor... gcc -E
#8 50.49     checking for grep that handles long lines and -e... /bin/grep
#8 50.49     checking for egrep... /bin/grep -E
#8 50.49     checking whether gcc needs -traditional... no
#8 50.49     checking whether make sets $(MAKE)... (cached) yes
#8 50.49     checking build system type... ./config.guess: unable to guess system type
#8 50.49     
#8 50.49     This script, last modified 2011-05-11, has failed to recognize
#8 50.49     the operating system you are using. It is advised that you
#8 50.49     download the most up to date version of the config scripts from
#8 50.49     
#8 50.49       http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD
#8 50.49     and
#8 50.49       http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD
#8 50.49     
#8 50.49     If the version you run (./config.guess) is already up to date, please
#8 50.49     send the following data and any information you think might be
#8 50.49     pertinent to <config-patches@gnu.org> in order to provide the needed
#8 50.49     information to handle your system.
#8 50.49     
#8 50.49     config.guess timestamp = 2011-05-11
#8 50.49     
#8 50.49     uname -m = aarch64
#8 50.49     uname -r = 5.10.76-linuxkit
#8 50.49     uname -s = Linux
#8 50.49     uname -v = #1 SMP PREEMPT Mon Nov 8 11:22:26 UTC 2021
#8 50.49     
#8 50.49     /usr/bin/uname -p =
#8 50.49     /bin/uname -X     =
#8 50.49     
#8 50.49     hostinfo               =
#8 50.49     /bin/universe          =
#8 50.49     /usr/bin/arch -k       =
#8 50.49     /bin/arch              =
#8 50.49     /usr/bin/oslevel       =
#8 50.49     /usr/convex/getsysinfo =
#8 50.49     
#8 50.49     UNAME_MACHINE = aarch64
#8 50.49     UNAME_RELEASE = 5.10.76-linuxkit
#8 50.49     UNAME_SYSTEM  = Linux
#8 50.49     UNAME_VERSION = #1 SMP PREEMPT Mon Nov 8 11:22:26 UTC 2021
#8 50.49     configure: error: cannot guess build type; you must specify one
#8 50.49     make: *** No rule to make target 'libmecab.la'.  Stop.
#8 50.49     running install
#8 50.49     running build
#8 50.49     running build_py
#8 50.49     creating build/lib.linux-aarch64-3.8
#8 50.49     creating build/lib.linux-aarch64-3.8/fugashi
#8 50.49     copying fugashi/__init__.py -> build/lib.linux-aarch64-3.8/fugashi
#8 50.49     copying fugashi/cli.py -> build/lib.linux-aarch64-3.8/fugashi
#8 50.49     running build_ext
#8 50.49     skipping 'fugashi/fugashi.c' Cython extension (up-to-date)
#8 50.49     building 'fugashi.fugashi' extension
#8 50.49     creating build/temp.linux-aarch64-3.8
#8 50.49     creating build/temp.linux-aarch64-3.8/fugashi
#8 50.49     gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/include/python3.8 -c fugashi/fugashi.c -o build/temp.linux-aarch64-3.8/fugashi/fugashi.o
#8 50.49     fugashi/fugashi.c:679:10: fatal error: mecab.h: No such file or directory
#8 50.49      #include "mecab.h"
#8 50.49               ^~~~~~~~~
#8 50.49     compilation terminated.
#8 50.49     error: command 'gcc' failed with exit status 1
#8 50.49     ----------------------------------------
#8 50.49 ERROR: Command errored out with exit status 1: /usr/local/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-rq9n5i5f/fugashi_d9a80464e7ff486bb2be32072a730490/setup.py'"'"'; __file__='"'"'/tmp/pip-install-rq9n5i5f/fugashi_d9a80464e7ff486bb2be32072a730490/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-7ynrytrt/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.8/fugashi Check the logs for full command output.
#8 ERROR: executor failed running [/bin/sh -c python -m pip install --upgrade pip && python -V && pip -V && pip install -q -r requirements.txt]: exit code: 1
------
 > [4/4] RUN python -m pip install --upgrade pip && python -V && pip -V && pip install -q -r requirements.txt:
------
executor failed running [/bin/sh -c python -m pip install --upgrade pip && python -V && pip -V && pip install -q -r requirements.txt]: exit code: 1

polm commented 2 years ago

OK, thanks for the report. It sounds like on the M1 it tries to install from source since there are no aarch64 wheels.

If you get this error, you can fix it and install from source by first installing MeCab.

I'll work on getting aarch64 wheels released.
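
If a source build is unavoidable, one way to confirm beforehand that the MeCab header is visible is to ask `mecab-config`, the helper script that ships with MeCab, for its include directory. This is a minimal sketch that assumes `mecab-config` is on PATH and supports `--inc-dir`; `find_mecab_header` is a hypothetical helper name, not part of fugashi.

```python
import os
import shutil
import subprocess


def find_mecab_header(config_tool: str = "mecab-config"):
    """Return the path to mecab.h if the config tool reports it, else None."""
    tool = shutil.which(config_tool)
    if tool is None:
        # MeCab (or at least its development files) is not installed.
        return None
    result = subprocess.run(
        [tool, "--inc-dir"], capture_output=True, text=True, check=False
    )
    inc_dir = result.stdout.strip()
    header = os.path.join(inc_dir, "mecab.h")
    return header if inc_dir and os.path.isfile(header) else None


if __name__ == "__main__":
    header = find_mecab_header()
    if header is None:
        print("mecab.h not found; install MeCab before building fugashi from source")
    else:
        print("found", header)
```

A `None` result here corresponds to the "mecab.h: No such file or directory" failure in the build log above.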

indrasvat commented 2 years ago

Thanks, @polm. Yeah, I was able to get around it by explicitly installing mecab first.

FROM python:3.8.6 as builder

RUN apt-get update && apt-get install -y mecab

WORKDIR /app

COPY requirements.txt .

RUN python -m pip install --upgrade pip && python -V && pip -V && pip install -q -r requirements.txt

polm commented 2 years ago

Closing since we figured this out, though I still need to work on those aarch64 wheels.