Nexosis / alpine-xgboost

Dockerfile for alpine-xgboost image
Apache License 2.0

`Segmentation Fault / Killed` #2

Open aekasitt opened 6 years ago

aekasitt commented 6 years ago

This is a copy of a post I made directly on the dmlc/xgboost issues page. You can find the original post here: Segmentation Fault / Killed when run on Python inside Docker Container (Alpine) #3649

I built my project based on your Dockerfile and would like to know whether you have experienced the same issue.


Within a Docker container (based on Alpine Linux) with Python 2.7 installed, I receive a segmentation fault that kills the entire process, after which the container exits.

This DOES NOT happen during fitting; it only happens when I use the loaded model to predict. My current script uses XGBoost alongside scikit-learn's GridSearchCV. The gsv.fit method runs perfectly fine, and I then save gsv.best_estimator_ as a pickle to be used later. When I load the model from the pickle and call the predict method, the container hangs for a while and then exits with a Segmentation Fault error message (or sometimes the Python script is killed by the container).

Build Specifications

Based on this repo's Dockerfile. When I run `pip list`, here is the result for xgboost:

xgboost                       0.80 

Here is the full list of the other Python libraries, just in case:

Package                       Version
----------------------------- -------
arrow                         0.12.1 
backports.functools-lru-cache 1.5    
numpy                         1.14.5 
pandas                        0.23.4 
pip                           18.0   
PyMySQL                       0.8.0  
python-dateutil               2.7.3  
pytz                          2018.5 
scikit-learn                  0.19.2 
scipy                         1.1.0  
setuptools                    39.0.1 
six                           1.11.0 
SQLAlchemy                    1.2.11 
thrift                        0.11.0 
wheel                         0.31.0 
xgboost                       0.80

Steps to Reproduce

  1. Create Training Model with GridSearchCV
  2. Save Training Model into a pickle file ...
  3. Load model from pickle file
  4. Use model.predict(dataframe)
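
For reference, here is a rough sketch of those steps; the data, parameter grid, and file name are just placeholders, not my actual project code:

```python
# Minimal sketch of the reproduction steps above (placeholder data and params).
import pickle

import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

# 1. Create training model with GridSearchCV
X = pd.DataFrame(np.random.rand(100, 4), columns=["a", "b", "c", "d"])
y = np.random.rand(100)
gsv = GridSearchCV(XGBRegressor(), param_grid={"max_depth": [3, 5]}, cv=3)
gsv.fit(X, y)  # fitting completes without error

# 2. Save the best estimator into a pickle file
with open("model.pkl", "wb") as fh:
    pickle.dump(gsv.best_estimator_, fh)

# 3. Load the model from the pickle file
with open("model.pkl", "rb") as fh:
    model = pickle.load(fh)

# 4. Use model.predict(dataframe) -- this is the point where the
#    Alpine-based container segfaults or the process gets killed
print(model.predict(X))
```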

What happens next

Either the container kills the Python process, or I get a Segmentation fault error and the container dies.