microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License

[python-package] fit() segfaults #6025

Closed Realsumen closed 1 year ago

Realsumen commented 1 year ago

Hello team,

I ran the code below and hit some strange behavior: the fit method kills the Jupyter kernel.

import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.preprocessing import OrdinalEncoder

train = pd.read_csv("https://github.com/microsoft/LightGBM/files/12285726/forestfires.csv")
features = train.columns.to_list()
label = 'area'
features.remove('area')
lgb.LGBMRegressor().fit(train[features], train[label])

LightGBM version: lightgbm (4.0.0)

Command(s) used to install LightGBM

pip install lightgbm

Additional Comments

When I ran the code in a .py file, this was the error message:

collecting ... collected 1 item

test.py::test Fatal Python error: Segmentation fault

Thread 0x00000001f964a500 (most recent call first):
  File "/Users/sumen/anaconda3/lib/python3.10/site-packages/lightgbm/basic.py", line 1990 in __init_from_np2d
  File "/Users/sumen/anaconda3/lib/python3.10/site-packages/lightgbm/basic.py", line 1856 in _lazy_init
  File "/Users/sumen/anaconda3/lib/python3.10/site-packages/lightgbm/basic.py", line 2210 in construct
  File "/Users/sumen/anaconda3/lib/python3.10/site-packages/lightgbm/basic.py", line 3096 in __init__
  File

Could you please advise how to resolve this?

Thank you!

jameslamb commented 1 year ago

Thanks for using LightGBM and for the write-up with a reproducible example.

What operating system are you on? If on macOS, what version of OpenMP do you have installed? I ask in case you might be facing an issue like #4229.

Could you also check... do you have multiple copies of lightgbm installed? What does this command return?

find "${HOME}" -name 'lib_lightgbm.so'
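
The same search can be sketched in Python for anyone who prefers to run it from a notebook. This is a minimal sketch, not part of LightGBM: `find_shared_libs` is a hypothetical helper name, and the default file name simply mirrors the `find` command above.

```python
from pathlib import Path


def find_shared_libs(root, filename="lib_lightgbm.so"):
    """Return every file under `root` whose name equals `filename`."""
    return sorted(str(p) for p in Path(root).rglob(filename) if p.is_file())


# Example usage, mirroring: find "${HOME}" -name 'lib_lightgbm.so'
#   for path in find_shared_libs(Path.home()):
#       print(path)
```

More than one result suggests that several installations are present, which is the situation this check is meant to detect.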
Realsumen commented 1 year ago

Hi, I am using macOS.

brew install libomp
Warning: libomp 16.0.6 is already installed and up-to-date.

find "${HOME}" -name 'lib_lightgbm.so'

/Users/sumen/anaconda3/pkgs/lightgbm-3.3.5-py310h313beb8_0/lib/python3.10/site-packages/lightgbm/lib_lightgbm.so
/Users/sumen/anaconda3/lib/python3.10/site-packages/lightgbm/lib/lib_lightgbm.so

jameslamb commented 1 year ago

Aha! Having those two copies of lib_lightgbm definitely seems like a problem. Especially since you said in your initial report that you are using v4.0.0, that lightgbm-3.3.5-.../lib_lightgbm.so looks like it's the problem.

Also, since both of these are in paths like ${HOME}/anaconda3/lib, I suspect you're operating only in the base conda environment, not a dedicated one.

Try this:

  1. Uninstall lightgbm using BOTH conda and pip
conda uninstall --yes lightgbm
pip uninstall --yes lightgbm

  2a. Run the find command again.

# this should now return 0 results
find "${HOME}/anaconda3" -name 'lib_lightgbm.so'

  2b. If that find command is still matching lightgbm-3.3.5-py310h313beb8_0/lib/python3.10/site-packages/lightgbm/lib_lightgbm.so, manually delete that file with rm

rm /Users/sumen/anaconda3/pkgs/lightgbm-3.3.5-py310h313beb8_0/lib/python3.10/site-packages/lightgbm/lib_lightgbm.so

  3. Re-install lightgbm, using conda
conda install -c conda-forge --yes lightgbm

  4. Restart your Jupyter kernel and re-run the code

Please let me know how that goes.
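
After reinstalling, it may help to confirm which copy of lightgbm Python would pick up. The sketch below is one way to do that, using only the standard library; `describe_package` is a hypothetical helper name. It looks up the module location and the installed distribution version without importing the package, so it should not load the native library.

```python
import importlib.util
from importlib import metadata


def describe_package(name):
    """Return where `name` would be imported from and its installed version.

    Returns None when the module cannot be found at all.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this exact name.
        version = None
    return {"location": spec.origin, "version": version}


print(describe_package("lightgbm"))
```

If the reported version differs from the one you expect, or the location points somewhere other than the environment you intended, a stale copy is probably still on the path.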

Realsumen commented 1 year ago

Hi, I tried reinstalling and got a dependency error.

conda install -c conda-forge --yes lightgbm
Collecting package metadata (current_repodata.json): done
Solving environment: \ 
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - defaults/noarch::conda-pack==0.6.0=pyhd3eb1b0_0
  - defaults/noarch::tifffile==2021.7.2=pyhd3eb1b0_2
  - defaults/osx-arm64::zope.interface==5.4.0=py310h1a28f6b_0
  - defaults/osx-arm64::astropy==5.1=py310h96f19d2_0
  - defaults/osx-arm64::patsy==0.5.3=py310hca03da5_0
  - defaults/osx-arm64::panel==0.14.3=py310hca03da5_0
  - defaults/osx-arm64::bottleneck==1.3.5=py310h96f19d2_0
  - defaults/osx-arm64::scipy==1.10.0=py310h20cbe94_1
  - defaults/osx-arm64::anaconda-project==0.11.1=py310hca03da5_0
  - defaults/osx-arm64::conda-repo-cli==1.0.41=py310hca03da5_0
  - defaults/osx-arm64::qtconsole==5.4.0=py310hca03da5_0
  - defaults/osx-arm64::python-lsp-black==1.2.1=py310hca03da5_0
  - defaults/osx-arm64::nbclassic==0.5.2=py310hca03da5_0
  - defaults/osx-arm64::datashader==0.14.4=py310hca03da5_0
  - defaults/osx-arm64::scikit-learn==1.2.1=py310h313beb8_0
  - defaults/osx-arm64::pywavelets==1.4.1=py310h80987f9_0
  - defaults/osx-arm64::xarray==2022.11.0=py310hca03da5_0
  - defaults/noarch::backports.functools_lru_cache==1.6.4=pyhd3eb1b0_0
  - defaults/osx-arm64::python-lsp-server==1.7.1=py310hca03da5_0
  - defaults/osx-arm64::pytables==3.7.0=py310ha5d4e50_1
  - defaults/osx-arm64::hvplot==0.8.2=py310hca03da5_0
  - defaults/osx-arm64::gensim==4.3.0=py310h46d7db6_0
  - defaults/noarch::pyls-spyder==0.4.0=pyhd3eb1b0_0
  - defaults/osx-arm64::statsmodels==0.13.5=py310hbda83bc_1
  - defaults/osx-arm64::bokeh==2.4.3=py310hca03da5_0
  - defaults/osx-arm64::jupyterlab==3.5.3=py310hca03da5_0
  - defaults/osx-arm64::contourpy==1.0.5=py310h525c30c_0
  - defaults/osx-arm64::anaconda-client==1.11.2=py310hca03da5_0
  - defaults/osx-arm64::imageio==2.26.0=py310hca03da5_0
  - defaults/osx-arm64::numpy==1.23.5=py310hb93e574_0
  - defaults/osx-arm64::pyerfa==2.0.0=py310h1a28f6b_0
  - defaults/osx-arm64::matplotlib==3.7.0=py310hca03da5_0
  - defaults/osx-arm64::h5py==3.7.0=py310h181c318_0
  - defaults/osx-arm64::dask==2022.7.0=py310hca03da5_0
  - defaults/osx-arm64::datashape==0.5.4=py310hca03da5_1
  - defaults/osx-arm64::ipykernel==6.19.2=py310h33ce5c2_0
  - defaults/osx-arm64::anaconda-navigator==2.4.2=py310hca03da5_0
  - defaults/osx-arm64::scrapy==2.8.0=py310hca03da5_0
  - defaults/noarch::conda-verify==3.4.2=py_1
  - defaults/osx-arm64::twisted==22.2.0=py310h1a28f6b_1
  - defaults/osx-arm64::imbalanced-learn==0.10.1=py310hca03da5_0
  - defaults/osx-arm64::numba==0.56.4=py310h46d7db6_0
  - defaults/osx-arm64::conda-build==3.24.0=py310hca03da5_0
  - defaults/osx-arm64::imagecodecs==2021.8.26=py310h48bc37f_2
  - defaults/osx-arm64::scikit-image==0.19.3=py310h313beb8_1
  - defaults/osx-arm64::bcrypt==3.2.0=py310h1a28f6b_1
  - defaults/osx-arm64::holoviews==1.15.4=py310hca03da5_0
  - defaults/osx-arm64::conda==23.3.1=py310hca03da5_0
  - defaults/osx-arm64::transformers==4.24.0=py310hca03da5_0
  - defaults/osx-arm64::spyder==5.4.1=py310hca03da5_0
  - defaults/osx-arm64::navigator-updater==0.3.0=py310hca03da5_0
  - defaults/osx-arm64::notebook==6.5.2=py310hca03da5_0
  - defaults/osx-arm64::intake==0.6.7=py310hca03da5_0
  - defaults/osx-arm64::numexpr==2.8.4=py310hecc3335_0
  - defaults/osx-arm64::distributed==2022.7.0=py310hca03da5_0
  - defaults/osx-arm64::seaborn==0.12.2=py310hca03da5_0
  - defaults/noarch::conda-token==0.4.0=pyhd3eb1b0_0
  - defaults/osx-arm64::pandas==1.5.3=py310h46d7db6_0
  - defaults/osx-arm64::clyent==1.2.2=py310hca03da5_1
  - defaults/osx-arm64::spyder-kernels==2.4.1=py310hca03da5_0
  - defaults/osx-arm64::matplotlib-base==3.7.0=py310h46d7db6_0
  - defaults/osx-arm64::pytorch==1.12.1=cpu_py310h8370978_1
failed with initial frozen solve. Retrying with flexible solve.
Solving environment: done
Collecting package metadata (repodata.json): done
Solving environment: \ 
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - defaults/noarch::conda-pack==0.6.0=pyhd3eb1b0_0
  - defaults/noarch::tifffile==2021.7.2=pyhd3eb1b0_2
  - defaults/osx-arm64::zope.interface==5.4.0=py310h1a28f6b_0
  - defaults/osx-arm64::astropy==5.1=py310h96f19d2_0
  - defaults/osx-arm64::patsy==0.5.3=py310hca03da5_0
  - defaults/osx-arm64::panel==0.14.3=py310hca03da5_0
  - defaults/osx-arm64::bottleneck==1.3.5=py310h96f19d2_0
  - defaults/osx-arm64::scipy==1.10.0=py310h20cbe94_1
  - defaults/osx-arm64::anaconda-project==0.11.1=py310hca03da5_0
  - defaults/osx-arm64::conda-repo-cli==1.0.41=py310hca03da5_0
  - defaults/osx-arm64::qtconsole==5.4.0=py310hca03da5_0
  - defaults/osx-arm64::python-lsp-black==1.2.1=py310hca03da5_0
  - defaults/osx-arm64::nbclassic==0.5.2=py310hca03da5_0
  - defaults/osx-arm64::datashader==0.14.4=py310hca03da5_0
  - defaults/osx-arm64::scikit-learn==1.2.1=py310h313beb8_0
  - defaults/osx-arm64::pywavelets==1.4.1=py310h80987f9_0
  - defaults/osx-arm64::xarray==2022.11.0=py310hca03da5_0
  - defaults/noarch::backports.functools_lru_cache==1.6.4=pyhd3eb1b0_0
  - defaults/osx-arm64::python-lsp-server==1.7.1=py310hca03da5_0
  - defaults/osx-arm64::pytables==3.7.0=py310ha5d4e50_1
  - defaults/osx-arm64::hvplot==0.8.2=py310hca03da5_0
  - defaults/osx-arm64::gensim==4.3.0=py310h46d7db6_0
  - defaults/noarch::pyls-spyder==0.4.0=pyhd3eb1b0_0
  - defaults/osx-arm64::statsmodels==0.13.5=py310hbda83bc_1
  - defaults/osx-arm64::bokeh==2.4.3=py310hca03da5_0
  - defaults/osx-arm64::jupyterlab==3.5.3=py310hca03da5_0
  - defaults/osx-arm64::contourpy==1.0.5=py310h525c30c_0
  - defaults/osx-arm64::anaconda-client==1.11.2=py310hca03da5_0
  - defaults/osx-arm64::imageio==2.26.0=py310hca03da5_0
  - defaults/osx-arm64::numpy==1.23.5=py310hb93e574_0
  - defaults/osx-arm64::pyerfa==2.0.0=py310h1a28f6b_0
  - defaults/osx-arm64::matplotlib==3.7.0=py310hca03da5_0
  - defaults/osx-arm64::h5py==3.7.0=py310h181c318_0
  - defaults/osx-arm64::dask==2022.7.0=py310hca03da5_0
  - defaults/osx-arm64::datashape==0.5.4=py310hca03da5_1
  - defaults/osx-arm64::ipykernel==6.19.2=py310h33ce5c2_0
  - defaults/osx-arm64::anaconda-navigator==2.4.2=py310hca03da5_0
  - defaults/osx-arm64::scrapy==2.8.0=py310hca03da5_0
  - defaults/noarch::conda-verify==3.4.2=py_1
  - defaults/osx-arm64::twisted==22.2.0=py310h1a28f6b_1
  - defaults/osx-arm64::imbalanced-learn==0.10.1=py310hca03da5_0
  - defaults/osx-arm64::numba==0.56.4=py310h46d7db6_0
  - defaults/osx-arm64::conda-build==3.24.0=py310hca03da5_0
  - defaults/osx-arm64::imagecodecs==2021.8.26=py310h48bc37f_2
  - defaults/osx-arm64::scikit-image==0.19.3=py310h313beb8_1
  - defaults/osx-arm64::bcrypt==3.2.0=py310h1a28f6b_1
  - defaults/osx-arm64::holoviews==1.15.4=py310hca03da5_0
  - defaults/osx-arm64::conda==23.3.1=py310hca03da5_0
  - defaults/osx-arm64::transformers==4.24.0=py310hca03da5_0
  - defaults/osx-arm64::spyder==5.4.1=py310hca03da5_0
  - defaults/osx-arm64::navigator-updater==0.3.0=py310hca03da5_0
  - defaults/osx-arm64::notebook==6.5.2=py310hca03da5_0
  - defaults/osx-arm64::intake==0.6.7=py310hca03da5_0
  - defaults/osx-arm64::numexpr==2.8.4=py310hecc3335_0
  - defaults/osx-arm64::distributed==2022.7.0=py310hca03da5_0
  - defaults/osx-arm64::seaborn==0.12.2=py310hca03da5_0
  - defaults/noarch::conda-token==0.4.0=pyhd3eb1b0_0
  - defaults/osx-arm64::pandas==1.5.3=py310h46d7db6_0
  - defaults/osx-arm64::clyent==1.2.2=py310hca03da5_1
  - defaults/osx-arm64::spyder-kernels==2.4.1=py310hca03da5_0
  - defaults/osx-arm64::matplotlib-base==3.7.0=py310h46d7db6_0
  - defaults/osx-arm64::pytorch==1.12.1=cpu_py310h8370978_1
done

==> WARNING: A newer version of conda exists. <==
  current version: 23.3.1
  latest version: 23.7.2

Please update conda by running

conda update -n base -c defaults conda

Or to minimize the number of packages updated during conda update use

conda install conda=23.7.2

## Package Plan ##

  environment location: /Users/sumen/anaconda3

  added / updated specs:
    - lightgbm

The following NEW packages will be INSTALLED:

  comm               conda-forge/noarch::comm-0.1.4-pyhd8ed1ab_0 
  lightgbm           pkgs/main/osx-arm64::lightgbm-3.3.5-py310h313beb8_0 
  numpy-base         pkgs/main/osx-arm64::numpy-base-1.23.5-py310haf87e8b_0 
  pip                conda-forge/noarch::pip-23.2.1-pyhd8ed1ab_0 
  setuptools         conda-forge/noarch::setuptools-68.0.0-pyhd8ed1ab_0 
  wheel              conda-forge/noarch::wheel-0.41.1-pyhd8ed1ab_0 

The following packages will be UPDATED:

  ca-certificates    pkgs/main::ca-certificates-2023.05.30~ --> conda-forge::ca-certificates-2023.7.22-hf0a4a13_0 
  certifi            pkgs/main/osx-arm64::certifi-2023.5.7~ --> conda-forge/noarch::certifi-2023.7.22-pyhd8ed1ab_0 
  openssl              pkgs/main::openssl-1.1.1u-h1a28f6b_0 --> conda-forge::openssl-1.1.1v-h53f4e23_0 

Downloading and Extracting Packages

Preparing transaction: done
Verifying transaction: failed

RemoveError: 'setuptools' is a dependency of conda and cannot be removed from
conda's operating environment.
jameslamb commented 1 year ago

I've reformatted your post to make the difference between your own words, commands you ran, and logs from those commands clearer. If you're not familiar with how to do that, please read https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax.


Based on your most recent message, it seems to me that your conda environment is broken. I suspect that came from using a mix of conda install and pip install, or maybe from mixing packages from different conda channels.

That is not a lightgbm-specific problem... I suspect if you ran something like conda install -c conda-forge --yes xgboost, you'd see the same errors.
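
One way to see how mixed an environment is would be to group its packages by channel. The sketch below assumes the JSON printed by `conda list --json` is a list of objects with `name` and `channel` fields, and that pip-installed packages are reported under the `pypi` channel. `packages_by_channel` is a hypothetical helper, and the sample input is illustrative rather than real output.

```python
import json
from collections import defaultdict


def packages_by_channel(conda_list_json):
    """Group package names by the channel recorded for each one."""
    grouped = defaultdict(list)
    for pkg in json.loads(conda_list_json):
        grouped[pkg.get("channel", "unknown")].append(pkg["name"])
    return {channel: sorted(names) for channel, names in grouped.items()}


# Illustrative input only, shaped like the assumed `conda list --json` output.
# In practice, capture the real output of that command instead.
sample = json.dumps([
    {"name": "numpy", "version": "1.23.5", "channel": "pkgs/main"},
    {"name": "certifi", "version": "2023.7.22", "channel": "conda-forge"},
    {"name": "lightgbm", "version": "4.0.0", "channel": "pypi"},
])
print(packages_by_channel(sample))
```

An environment whose packages are spread across `pkgs/main`, `conda-forge`, and `pypi` is more likely to hit the kind of inconsistency shown in the log.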

The risk of this happening is why I personally try to avoid using the base conda environment, and always create dedicated environments for different projects. Try the following.

  1. Completely uninstall conda and all packages you've installed with it
rm -r /Users/sumen/anaconda3
  2. Re-install conda: https://docs.conda.io/projects/conda/en/latest/user-guide/install/macos.html
  3. Create a new environment for use with lightgbm + jupyterlab
conda create \
    --name ml-dev \
    -c conda-forge \
    --yes \
        ipykernel \
        jupyterlab \
        'lightgbm>=4.0.0' \
        numpy \
        'python=3.11' \
        scikit-learn \
        scipy
  4. Activate that conda environment and start JupyterLab
source activate ml-dev
jupyter lab
  5. In JupyterLab, run lightgbm
import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1_000)
dtrain = lgb.Dataset(X, label=y)
bst = lgb.train(
    train_set=dtrain,
    params={
        "objective": "regression",
        "num_iterations": 10
    }
)
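
To confirm that the notebook kernel is running from the dedicated environment rather than base, something like the following can be run in a cell. It is a small sketch that relies on the `CONDA_DEFAULT_ENV` environment variable, which conda sets on activation; the helper names are hypothetical, and the variable may be unset if the kernel was not started from an activated environment.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the active conda environment name, or None if unset."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def is_base_env(environ=None):
    """True when the active conda environment is named 'base'."""
    return active_conda_env(environ) == "base"


print("interpreter prefix:", sys.prefix)
print("conda environment: ", active_conda_env())
print("running in base:   ", is_base_env())
```

With the environment created above, the expected environment name would be ml-dev.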
Realsumen commented 1 year ago

Hello, I followed your advice to set up a new environment and it works. Your guidance has been very helpful, and I appreciate your assistance with this issue.

jameslamb commented 1 year ago

Great, glad it helped! Thanks for using LightGBM, come back any time 👋🏻

github-actions[bot] commented 1 year ago

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.