Closed SimonCW closed 3 years ago
Temporarily solved by switching to the cf202003 label of the conda-forge channel, i.e. using conda-forge/label/cf202003 as channel with channel priority set to strict. However, as far as I understand it, this means that we are restricted to the packages and dependencies with the cf202003 label.
This is likely because the machine on which the conda wheels are compiled has a different set of instructions available than the machine you're trying to run on.
There is some allowance for this here but clearly there are still cases where this fails.
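To make the "different set of instructions" point concrete, here is a minimal sketch for comparing CPU feature flags between a build machine and a target machine. It assumes x86 Linux, where /proc/cpuinfo has a "flags" line. The list of flags to check is a hypothetical selection for illustration; which extensions a given binary actually requires is not stated in this thread.

```python
from pathlib import Path

# Hypothetical selection of x86 feature flags to look for. Which of these a
# compiled extension really needs is an assumption, not a fact from the thread.
FLAGS_OF_INTEREST = ("sse4_2", "avx", "avx2", "fma", "avx512f")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" entry is enough; all cores normally report the same set.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted=FLAGS_OF_INTEREST):
    """Return the wanted flags that the CPU does not report, in order."""
    have = parse_cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in have]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("missing:", missing_flags(cpuinfo.read_text()))
```

Running this on both machines and diffing the results shows which extensions the target lacks; a binary built to use any of them may fail there with an illegal instruction.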
Hi @maciejkula, thank you for your reply! I suspected something along those lines.
However, we are using pretty standard machines: AWS Batch for production and CI/CD, and a Lenovo Thinkpad Carbon X1 6th Gen for some dev work. Here is the cpuinfo for the Thinkpad:
I feel that these are neither very old nor exotic, and hence I would expect the wheels provided on conda-forge to work for them.
Although I realize that this is not an issue with the LightFM source code, do you have any hints on how to dig a little deeper into the bug to find out which dependency / instruction is causing the illegal hardware instruction error?
Also, do you happen to know what changed in conda / conda-forge? I didn't find anything indicating a change in their infrastructure or compiler toolchain (unfortunately, I am not very proficient with the whole conda package creation toolchain). Heck, I didn't even find a good description of the cf202003 label (what exactly is frozen, when this is executed, etc.).
FYI, my problem is not related to conda-forge, but I figured I'd continue this thread since I suspect the root cause is the same.
I am getting the same error, Illegal instruction (core dumped), when trying to use LightFM().fit().
This happens with the following Dockerfile:
RUN pip3 install lightfm==1.15
CMD python3 script_that_runs_lightfm.py
i.e. when the container is built on AWS CodeBuild and LightFM().fit() is executed when the container is run on AWS ECS.
However, if I move the installation of lightfm to the runtime environment, i.e. with the Dockerfile:
CMD pip3 install lightfm==1.15 && python3 script_that_runs_lightfm.py
everything works.
So it seems that the lightfm installation is very picky about the installation environment (CPU?). I have never had any other library act like this across AWS resources.
Any suggestions on how to avoid having to use this anti-pattern are welcome!
I checked on a brand-new laptop and have the same problem. Running with PYTHONFAULTHANDLER=1, I get the error below. Maybe that helps, @maciejkula?
(py37) ➜ debug_lightfm export PYTHONFAULTHANDLER=1
(py37) ➜ debug_lightfm python lfm_get_started.py
Fatal Python error: Illegal instruction
Current thread 0x00007f709d65a740 (most recent call first):
File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 1043 in create_module
File "<frozen importlib._bootstrap>", line 583 in module_from_spec
File "<frozen importlib._bootstrap>", line 670 in _load_unlocked
File "<frozen importlib._bootstrap>", line 967 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 983 in _find_and_load
File "/home/simon/miniconda3/envs/py37/lib/python3.7/site-packages/lightfm/_lightfm_fast.py", line 3 in <module>
File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 728 in exec_module
File "<frozen importlib._bootstrap>", line 677 in _load_unlocked
File "<frozen importlib._bootstrap>", line 967 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 983 in _find_and_load
File "/home/simon/miniconda3/envs/py37/lib/python3.7/site-packages/lightfm/lightfm.py", line 8 in <module>
File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 728 in exec_module
File "<frozen importlib._bootstrap>", line 677 in _load_unlocked
File "<frozen importlib._bootstrap>", line 967 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 983 in _find_and_load
File "/home/simon/miniconda3/envs/py37/lib/python3.7/site-packages/lightfm/__init__.py", line 4 in <module>
File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 728 in exec_module
File "<frozen importlib._bootstrap>", line 677 in _load_unlocked
File "<frozen importlib._bootstrap>", line 967 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 983 in _find_and_load
File "lfm_get_started.py", line 1 in <module>
[1] 26992 illegal hardware instruction python lfm_get_started.py
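Since the crash above happens at import time and kills the interpreter, one way to probe it without taking down the calling process is to attempt the import in a child process. The following is a minimal sketch; the helper names try_import_in_subprocess and describe_exit are hypothetical. It uses Python's -X faulthandler option, which has the same effect as setting PYTHONFAULTHANDLER=1, and assumes Python 3.7 or newer on a POSIX system, where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def try_import_in_subprocess(module_name):
    """Import a module in a child interpreter; return (returncode, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable summary."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code is the negated signal number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exit code {returncode}"


if __name__ == "__main__":
    # Example: python probe.py lightfm._lightfm_fast
    target = sys.argv[1] if len(sys.argv) > 1 else "json"
    code, err = try_import_in_subprocess(target)
    print(target, "->", describe_exit(code))
    if err:
        print(err, file=sys.stderr)
```

Probing each compiled dependency separately (for example numpy, scipy, and the lightfm extension module from the traceback) can narrow down which binary triggers the illegal instruction.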
Could you point me towards the best place to address this? The conda-forge Gitter?
The machines used for conda-forge builds are what Azure Pipelines provides and have quite a decent / broad set of instructions. Having had a short look into the setup.py, it seems that setting LIGHTFM_NO_CFLAGS in the feedstock build would solve this by using conda-forge's standard set of CFLAGS.
I added a PR in the feedstock @maciejkula.
Thanks a lot @xhochy!
Thanks for figuring this out - changing the Conda build parameters sounds great.
I'm closing this since it was solved by: https://github.com/lyst/lightfm/pull/563, https://github.com/conda-forge/lightfm-feedstock/pull/9, and https://github.com/conda-forge/lightfm-feedstock/pull/11
Installing from conda-forge should work now (I had to run conda clean --all --force-pkgs-dirs, but I think I had a few messed-up intermediate versions in the cache).
Issue: I get the error Illegal hardware instruction (core dumped) when installing from conda-forge. Installing with pip works fine. To reproduce, build an environment with conda create -n test -c conda-forge python=3.7 lightfm, then activate it with conda activate test, start a Python interpreter, and try to import lightfm.
As an fyi, I also opened an issue on the feedstock: https://github.com/conda-forge/lightfm-feedstock/issues/7
Environment (conda list):
Details about conda and system (conda info):