appeler / ethnicolr

Predict Race and Ethnicity Based on the Sequence of Characters in a Name
http://ethnicolr.readthedocs.io
MIT License

Not able to use ethnicolr on Apple Silicon M1 - tensorflow 2.5.2 not available on conda or pip #61

Closed ospaarmann closed 2 years ago

ospaarmann commented 2 years ago

Hey, I recently moved a project to a new MacBook with an M1 chip and I now have issues getting everything to run. I'm using Mambaforge because it is recommended for Apple Silicon (previously I used Miniconda).

When I try to install ethnicolr 0.8.1, I get the error ERROR: Could not find a version that satisfies the requirement tensorflow==2.5.2 (from ethnicolr) (from versions: none). The versions of tensorflow that I can install in my setup are 2.4.0, 2.4.1, 2.4.3, 2.6.0 and 2.6.2. I read in this issue that tensorflow 2.6 is not an option, so I tried to install tensorflow 2.5.

I tried to clone an existing conda environment with Tensorflow 2.5.0 following this description. Using this environment I was able to install the following requirements.txt, but only with pip install --no-deps.

pandas==1.3.3
tensorflow-macos==2.5.0
ethnicolr==0.7.0

At first it looked like everything was working, but when running pred_wiki_name I get the following error:

Metal device set to: Apple M1 Max

systemMemory: 64.00 GB
maxCacheSize: 21.33 GB

2021-12-22 16:38:03.701758: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-12-22 16:38:03.702015: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
WARNING:tensorflow:Layer lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/ethnicolr/pred_wiki_name.py", line 76, in pred_wiki_name
    cls.model = load_model(MODEL)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/saving/save.py", line 201, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects,
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 180, in load_model_from_hdf5
    model = model_config_lib.model_from_config(model_config,
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/saving/model_config.py", line 59, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/layers/serialization.py", line 159, in deserialize
    return generic_utils.deserialize_keras_object(
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 668, in deserialize_keras_object
    deserialized_obj = cls.from_config(
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/engine/sequential.py", line 499, in from_config
    model.add(layer)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/training/tracking/base.py", line 522, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/engine/sequential.py", line 228, in add
    output_tensor = layer(self.outputs[0])
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/layers/recurrent.py", line 668, in __call__
    return super(RNN, self).__call__(inputs, **kwargs)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/engine/base_layer.py", line 969, in __call__
    return self._functional_construction_call(inputs, args, kwargs,
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1107, in _functional_construction_call
    outputs = self._keras_tensor_symbolic_call(
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/engine/base_layer.py", line 840, in _keras_tensor_symbolic_call
    return self._infer_output_signature(inputs, args, kwargs, input_masks)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/engine/base_layer.py", line 880, in _infer_output_signature
    outputs = call_fn(inputs, *args, **kwargs)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/layers/recurrent_v2.py", line 1153, in call
    inputs, initial_state, _ = self._process_inputs(inputs, initial_state, None)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/layers/recurrent.py", line 868, in _process_inputs
    initial_state = self.get_initial_state(inputs)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/layers/recurrent.py", line 650, in get_initial_state
    init_state = get_initial_state_fn(
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/layers/recurrent.py", line 2516, in get_initial_state
    return list(_generate_zero_filled_state_for_cell(
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/layers/recurrent.py", line 2998, in _generate_zero_filled_state_for_cell
    return _generate_zero_filled_state(batch_size, cell.state_size, dtype)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/layers/recurrent.py", line 3014, in _generate_zero_filled_state
    return nest.map_structure(create_zeros, state_size)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/util/nest.py", line 867, in map_structure
    structure[0], [func(*x) for x in entries],
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/util/nest.py", line 867, in <listcomp>
    structure[0], [func(*x) for x in entries],
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/keras/layers/recurrent.py", line 3011, in create_zeros
    return array_ops.zeros(init_state_size, dtype=dtype)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/ops/array_ops.py", line 2911, in wrapped
    tensor = fun(*args, **kwargs)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/ops/array_ops.py", line 2960, in zeros
    output = _constant_if_small(zero, shape, dtype, name)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/ops/array_ops.py", line 2896, in _constant_if_small
    if np.prod(shape) < 1000:
  File "<__array_function__ internals>", line 5, in prod
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 3051, in prod
    return _wrapreduction(a, np.multiply, 'prod', axis, dtype, out,
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper_m1/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 867, in __array__
    raise NotImplementedError(
NotImplementedError: Cannot convert a symbolic Tensor (lstm/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

How do I run ethnicolr on an M1 chip? Has anyone successfully done so? Thanks

ospaarmann commented 2 years ago

So I got it to run. I'm not sure if this is ideal, but I'm sharing my solution in case it helps anyone running into the same issue. I installed tensorflow 2.6.0 in a virtual environment using conda / Mambaforge. I opted for 2.6.0 because 2.5.2 is not available for M1 and 2.5.0 wasn't working. You can read about installing Mambaforge here. After that, I installed tensorflow with

conda install tensorflow==2.6.0

I also set the pandas version to pandas>1.2.3 in the requirements.txt of my own project (same as in ethnicolr). This resolved to pandas 1.3.3.

Next, I had to solve the dependency issue with ethnicolr, since ethnicolr requires tensorflow 2.5.2. I did that by forking the ethnicolr repo and creating a branch where I pin the tensorflow version to 2.6.0 in both requirements.txt and setup.py. You can find this branch here. To use this GitHub branch, I changed the ethnicolr line in my requirements.txt to:

git+https://github.com/ospaarmann/ethnicolr.git@apple_m1_support_tensorflow_2_6_0#egg=ethnicolr

Now I had an issue with a dependency mismatch with numpy. It is described in this StackOverflow thread. What happened was that importing pandas or ethnicolr would throw this error:

>>> from ethnicolr import census_ln, pred_census_ln, pred_wiki_name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper/lib/python3.8/site-packages/ethnicolr/__init__.py", line 2, in <module>
    from ethnicolr.census_ln import census_ln
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper/lib/python3.8/site-packages/ethnicolr/census_ln.py", line 6, in <module>
    import pandas as pd
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper/lib/python3.8/site-packages/pandas/__init__.py", line 22, in <module>
    from pandas.compat import (
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper/lib/python3.8/site-packages/pandas/compat/__init__.py", line 15, in <module>
    from pandas.compat.numpy import (
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper/lib/python3.8/site-packages/pandas/compat/numpy/__init__.py", line 7, in <module>
    from pandas.util.version import Version
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper/lib/python3.8/site-packages/pandas/util/__init__.py", line 1, in <module>
    from pandas.util._decorators import (  # noqa
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper/lib/python3.8/site-packages/pandas/util/_decorators.py", line 14, in <module>
    from pandas._libs.properties import cache_readonly  # noqa
  File "/Users/olespaarmann/mambaforge/envs/diversity_scraper/lib/python3.8/site-packages/pandas/_libs/__init__.py", line 13, in <module>
    from pandas._libs.interval import Interval
  File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

The solution here is to just ignore the dependency issues and manually install a newer version of numpy. It doesn't work when I set the numpy version in my requirements.txt because this throws a dependency error:

ERROR: Cannot install numpy>=1.20.0 and tensorflow==2.6.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested numpy>=1.20.0
    tensorflow 2.6.0 depends on numpy~=1.19.2
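
For anyone wondering why pip refuses this combination: as far as I understand it, the `~=` in `numpy~=1.19.2` is a "compatible release" specifier, meaning roughly ">= 1.19.2 and still in the 1.19 series". Below is a minimal sketch of that rule. It is my own simplified illustration, not pip's actual resolver; the function names are hypothetical and it only handles plain numeric versions like `1.20.0`.

```python
def parse_release(version: str) -> tuple:
    """Turn '1.19.2' into (1, 19, 2). Only plain numeric versions are handled."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible_release(candidate: str, spec: str) -> bool:
    """Simplified check for a '~=' specifier such as '~=1.19.2'.

    '~=1.19.2' is treated as: candidate >= 1.19.2 and candidate == 1.19.*
    (hypothetical helper, not part of pip or any library).
    """
    if not spec.startswith("~="):
        raise ValueError("only '~=' specifiers are handled in this sketch")
    base = parse_release(spec[2:])
    if len(base) < 2:
        raise ValueError("'~=' needs at least two release segments")
    cand = parse_release(candidate)
    prefix = base[:-1]  # everything except the last segment must match exactly
    return cand >= base and cand[:len(prefix)] == prefix


# numpy 1.20.0 falls outside the 1.19 series that tensorflow 2.6.0 asks for
print(satisfies_compatible_release("1.20.0", "~=1.19.2"))  # False
print(satisfies_compatible_release("1.19.5", "~=1.19.2"))  # True
```

So any numpy from 1.20 onwards is rejected by the resolver, which is why putting it in requirements.txt next to tensorflow 2.6.0 cannot work.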

So I just installed it with python -m pip install numpy==1.20.0. And now everything seems to work.
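
Since this setup deliberately overrides what pip would resolve on its own, it can help to confirm which versions actually ended up in the environment. Here is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_versions` is hypothetical, and the package list is an assumption: depending on how tensorflow was installed, the distribution may be registered as `tensorflow` or `tensorflow-macos`.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    # Package names are assumptions; adjust them to your environment.
    names = ["tensorflow", "tensorflow-macos", "numpy", "pandas", "ethnicolr"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

With the steps above I would expect this to report tensorflow 2.6.0, numpy 1.20.0 and pandas 1.3.3, but I have only checked this on my own machine.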