Treelite version mismatch in both r21.10 and r21.11 #161

jiahong-liu commented 2 years ago

A Treelite model created by following the instructions runs into an error when served in Triton:

E1124 17:34:25.326361 1] failed to load 'fil_classification' version 1: Unknown: Model failed to load into Treelite with error: Cannot deserialize model from a different version of Treelite

The conda environment I created:

Name Version Build Channel
cudf 21.10.01 cuda_11.0_py38_ga1d2d13a14_0 rapidsai
cuml 21.10.02 cuda11.0_py38_gcd9251271_0 rapidsai
1.1.0 pyhd8ed1ab_0 conda-forge
jpeg 9d h36c2ea0_0 conda-forge
krb5 1.19.2 hcc1bbae_3 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge
lerc 3.0 h9c3ff4c_0 conda-forge
libblas 3.9.0 12_linux64_openblas conda-forge
libbrotlicommon 1.0.9 h7f98852_6 conda-forge
libbrotlidec 1.0.9 h7f98852_6 conda-forge
libbrotlienc 1.0.9 h7f98852_6 conda-forge
libcblas 3.9.0 12_linux64_openblas conda-forge
libcudf 21.10.01 cuda11.0_ga1d2d13a14_0 rapidsai
libcuml 21.10.02 cuda11.0_gcd9251271_0 rapidsai
libcumlprims 21.10.00 cuda11.0_g167dc59_0 nvidia
libcurl 7.80.0 h2574ce0_0 conda-forge
libdeflate 1.8 h7f98852_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libevent 2.1.10 h9b69904_4 conda-forge
libfaiss 1.7.0 cuda110h8045045_8_cuda conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 11.2.0 h1d223b6_11 conda-forge
libgfortran-ng 11.2.0 h69a702a_11 conda-forge
libgfortran5 11.2.0 h5c6108e_11 conda-forge
libgomp 11.2.0 h1d223b6_11 conda-forge
libhwloc 2.3.0 h5e5b7d1_1 conda-forge
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 12_linux64_openblas conda-forge
libllvm10 10.0.1 he513fc3_3 conda-forge
libllvm8 8.0.1 hc9558a2_0 conda-forge
libnghttp2 1.43.0 h812cca2_1 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.18 pthreads_h8fe5266_0 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libprotobuf 3.18.1 h780b84a_0 conda-forge
librmm 21.10.01 h2809392_0 conda-forge
libssh2 1.10.0 ha56f1ee_2 conda-forge
libstdcxx-ng 11.2.0 he4da1e4_11 conda-forge
libthrift 0.15.0 he6d91bd_1 conda-forge
libtiff 4.3.0 h6f004c6_2 conda-forge
libutf8proc 2.6.1 h7f98852_0 conda-forge
libwebp-base 1.2.1 h7f98852_0 conda-forge
libxgboost 1.4.2dev.rapidsai21.10 cuda11.0_0 rapidsai
libxml2 2.9.12 h885dcf4_1 conda-forge
libzlib 1.2.11 h36c2ea0_1013 conda-forge
lightgbm 3.3.1 py38h709712a_1 conda-forge
llvmlite 0.36.0 py38h4630a5e_0 conda-forge
locket 0.2.0 py_2 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
markupsafe 2.0.1 py38h497a2fe_1 conda-forge
mccabe 0.6.1 py_1 conda-forge
msgpack-python 1.0.2 py38h1fd1430_2 conda-forge
nccl h96e36e3_0 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
numba 0.53.1 py38h8b71fd7_1 conda-forge
numpy 1.21.4 py38he2449b9_0 conda-forge
nvtx 0.2.3 py38h497a2fe_1 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openssl 1.1.1l h7f98852_0 conda-forge
orc 1.7.0 h68e2c4e_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
pandas 1.3.4 py38h43a58ef_1 conda-forge
parquet-cpp 1.5.1 2 conda-forge
partd 1.2.0 pyhd8ed1ab_0 conda-forge
pillow 8.4.0 py38h8e6f84c_0 conda-forge
pip 21.3.1 pyhd8ed1ab_0 conda-forge
protobuf 3.18.1 py38h709712a_0 conda-forge
psutil 5.8.0 py38h497a2fe_2 conda-forge
py-xgboost 1.4.2dev.rapidsai21.10 cuda11.0py38_0 rapidsai
pyarrow 5.0.0 py38hed47224_8_cuda conda-forge
pycodestyle 2.8.0 pyhd8ed1ab_0 conda-forge
pyflakes 2.4.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.6 pyhd8ed1ab_0 conda-forge
python 3.8.12 hb7a2778_2_cpython conda-forge
treelite 2.1.0 py38hdd725b4_0 conda-forge
treelite-runtime 2.1.0 pypi_0 pypi
tritonclient 2.16.0 pypi_0 pypi
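
Since the error is about a Treelite version mismatch, the treelite and treelite-runtime entries are the relevant part of the listing above. As a rough sketch, the versions visible to the Python interpreter that generated the model could be printed like this. The helper name `installed_versions` is made up for illustration, and it relies on the packages exposing standard distribution metadata, which may not hold for every conda build. The Treelite version that the Triton container expects is not stated in this thread, so this only reports the client side.

```python
from importlib import metadata


def installed_versions(packages=("treelite", "treelite-runtime")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not visible to this interpreter.
            versions[name] = None
    return versions


# Print one line per package for pasting into a bug report.
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```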

I was using the following command to create the model:

python qa/L0_e2e/ --type cuml --format pickle --name fil_classification --repo /home/ubuntu/models/ --task classification --predict_proba

This generates the model with the name model.pkl, but Triton by default seems to be expecting

Run model serving with:

docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/ubuntu/models:/models tritonserver --model-repository=/models

This also fails for tritonserver:21.11-py3
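
To show exactly which files ended up in the mounted repository, something like the following could be run on the host before starting the container. This is only a sketch: `list_model_files` is a hypothetical helper, and it reports whatever is on disk without checking it against what Triton or the FIL backend expects.

```python
from pathlib import Path


def list_model_files(repo_root):
    """Return {model name: sorted file paths relative to that model's directory}."""
    root = Path(repo_root)
    layout = {}
    for model_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Collect every regular file below the model directory, at any depth.
        files = sorted(
            f.relative_to(model_dir).as_posix()
            for f in model_dir.rglob("*")
            if f.is_file()
        )
        layout[model_dir.name] = files
    return layout


# Example call, using the host path from the docker command above:
# for model, files in list_model_files("/home/ubuntu/models").items():
#     print(model, files)
```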

wphicks commented 2 years ago

Thanks for the report @Jiahong-Nvidia! Was the conda environment you posted the one you used for the script? Could you add your invocation of that script as well?

jiahong-liu commented 2 years ago

I was actually using the qa/L0_e2e/ directly to generate a new example model instead of the conversion path.

wphicks commented 2 years ago

Please take a look at the documentation for cuML and Scikit-Learn models here. These models require a conversion step since the FIL backend does not make use of a Python interpreter. We will update the documentation to further emphasize this.
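
As a minimal sketch of what such a conversion step might look like for a cuML random forest: the snippet below assumes the unpickled model exposes `convert_to_treelite_model()` and that the returned object has `to_treelite_checkpoint(path)`, as cuML's random forest estimators do to my knowledge. The helper name `export_checkpoint` and the output file name `checkpoint.tl` are assumptions, not something confirmed in this thread; the conversion script referenced in the documentation is the authoritative route.

```python
from pathlib import Path


def export_checkpoint(model, version_dir, filename="checkpoint.tl"):
    """Write a Treelite checkpoint for `model` into `version_dir`; return its path."""
    out_dir = Path(version_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    # Assumed cuML API: the forest converts to a Treelite model handle,
    # which can then be serialized without needing Python at serving time.
    model.convert_to_treelite_model().to_treelite_checkpoint(str(out_path))
    return out_path


# Example usage (requires cuML and a GPU; paths are taken from this thread):
# import pickle
# with open("/home/ubuntu/models/fil_classification/1/model.pkl", "rb") as f:
#     forest = pickle.load(f)
# export_checkpoint(forest, "/home/ubuntu/models/fil_classification/1")
```

The function takes an already-loaded model rather than a pickle path so that unpickling, which needs the same cuML environment that produced the file, stays an explicit step for the caller.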