neptune-ai / neptune-sklearn

Experiment tracking for scikit-learn. 🧩 Log, organize, visualize and compare model metrics, parameters, dataset versions, and more.
https://docs.neptune.ai/integrations/sklearn/
Apache License 2.0

BUG: create_feature_importance_chart has side effect on the regressor argument when the regressor has one-dimensional coef_ #29

Closed pc-pallon closed 4 months ago

pc-pallon commented 4 months ago

Describe the bug

create_feature_importance_chart has a side effect on the regressor argument when the regressor has a one-dimensional coef_. This is the case when the regressor is fitted with a pd.Series rather than a pd.DataFrame. It happens even with the simplest LinearRegression example given in the sklearn documentation.

Reproduction

This is the minimal code that exhibits the unintended behavior:

import numpy as np
from sklearn.linear_model import LinearRegression
import neptune.integrations.sklearn as npt_utils

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3

reg = LinearRegression().fit(X, y)

print(reg.coef_)
# Out: [1. 2.]

npt_utils.create_feature_importance_chart(reg, X, y)

print(reg.coef_)
# Out: [ 50. 100.]

Attached is a screenshot of equivalent code run in Jupyter:

[Screenshot 2024-07-03 at 15 35 08]

Expected behavior

The two printed reg.coef_ values should be identical, since creating the chart should not modify the regressor.
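The observed output `[ 50. 100.]` equals each coefficient divided by the largest absolute coefficient and multiplied by 100. That is consistent with a relative-importance scaling applied in place to an array that shares memory with `reg.coef_`, although this thread does not confirm the cause. The following is a minimal sketch of that mechanism using NumPy only; `scale_to_percent_inplace` and `scale_to_percent_copy` are hypothetical helpers written for illustration, not functions from neptune-sklearn or any other library.

```python
import numpy as np


def scale_to_percent_inplace(importances):
    # Hypothetical helper, not library code: the in-place operators write
    # into the caller's buffer, so every alias of that array changes too.
    importances /= np.abs(importances).max()
    importances *= 100.0
    return importances


def scale_to_percent_copy(importances):
    # Same arithmetic on a private float copy; the caller's array is untouched.
    scaled = np.array(importances, dtype=float)
    scaled /= np.abs(scaled).max()
    scaled *= 100.0
    return scaled


coef = np.array([1.0, 2.0])        # stands in for a one-dimensional reg.coef_
safe = scale_to_percent_copy(coef)
coef_after_copy = coef.copy()      # snapshot taken before the in-place call
scale_to_percent_inplace(coef)     # coef itself is rescaled by this call
```

Under this assumption, a two-dimensional coef_ would be less likely to show the problem if it is first reduced to a new array (for example by averaging over rows), because the scaling would then act on that new array instead of the regressor's own attribute.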

Traceback

N/A

Environment

The output of pip list:

Related packages:

neptune                   1.10.4
neptune-sklearn           2.1.3
scikit-learn              1.5.0
scikit-plot               0.3.7
scipy                     1.11.4

All packages

```
agate 1.9.1
annotated-types 0.7.0
appdirs 1.4.4
arrow 1.3.0
attrs 23.2.0
Babel 2.15.0
boto3 1.34.121
botocore 1.34.121
bravado 11.0.3
bravado-core 6.1.1
cachetools 5.3.3
certifi 2024.2.2
cfgv 3.4.0
chardet 5.2.0
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
contourpy 1.2.1
cycler 0.12.1
daff 1.3.46
db-dtypes 1.2.0
dbt-adapters 1.3.2
dbt-bigquery 1.8.2
dbt-common 1.5.0
dbt-core 1.8.3
dbt-extractor 0.5.1
dbt-metabase 1.3.1
dbt-semantic-interfaces 0.5.1
deepdiff 7.0.1
diff_cover 9.1.0
distlib 0.3.8
exceptiongroup 1.2.1
filelock 3.15.4
fonttools 4.53.0
fqdn 1.5.1
future 1.0.0
gitdb 4.0.11
GitPython 3.1.43
google-api-core 2.19.0
google-auth 2.29.0
google-cloud-bigquery 3.23.1
google-cloud-core 2.4.1
google-cloud-dataproc 5.10.0
google-cloud-storage 2.17.0
google-crc32c 1.5.0
google-resumable-media 2.7.0
googleapis-common-protos 1.63.0
grpc-google-iam-v1 0.13.1
grpcio 1.64.0
grpcio-status 1.62.2
identify 2.5.36
idna 3.7
importlib-metadata 6.11.0
iniconfig 2.0.0
isodate 0.6.1
isoduration 20.11.0
Jinja2 3.1.4
jinja2-simple-tags 0.6.1
jmespath 1.0.1
joblib 1.4.2
jsonpointer 2.4
jsonref 1.1.0
jsonschema 4.22.0
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
leather 0.4.0
Logbook 1.5.3
markdown-it-py 3.0.0
MarkupSafe 2.1.5
mashumaro 3.13.1
matplotlib 3.9.0
mdurl 0.1.2
minimal-snowplow-tracker 0.0.2
monotonic 1.6
more-itertools 10.3.0
msgpack 1.0.8
neptune 1.10.4
neptune-sklearn 2.1.3
networkx 3.3
nodeenv 1.8.0
numpy 1.26.4
oauthlib 3.2.2
ordered-set 4.1.0
packaging 24.0
pandas 2.2.2
pandas-stubs 2.2.2.240514
parsedatetime 2.6
pathspec 0.12.1
pillow 10.3.0
pip 23.2.1
platformdirs 4.2.2
pluggy 1.5.0
pre-commit 3.2.2
proto-plus 1.23.0
protobuf 4.25.3
psutil 5.9.8
pyarrow 16.1.0
pyasn1 0.6.0
pyasn1_modules 0.4.0
pydantic 2.8.0
pydantic_core 2.20.0
Pygments 2.18.0
PyJWT 2.8.0
pyparsing 3.1.2
pyright 1.1.364
pytest 8.2.2
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-slugify 8.0.4
pytimeparse 1.1.8
pytz 2024.1
PyYAML 6.0.1
referencing 0.35.1
regex 2024.5.15
requests 2.32.2
requests-oauthlib 2.0.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 13.7.1
rpds-py 0.18.1
rsa 4.9
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
s3transfer 0.10.1
scikit-learn 1.5.0
scikit-plot 0.3.7
scipy 1.11.4
setuptools 70.0.0
simplejson 3.19.2
six 1.16.0
smmap 5.0.1
sqlfluff 2.3.5
sqlfluff-templater-dbt 2.3.5
sqlparse 0.5.0
swagger-spec-validator 3.0.3
tblib 3.0.0
text-unidecode 1.3
threadpoolctl 3.5.0
toml 0.10.2
tomli 2.0.1
tqdm 4.66.4
types-python-dateutil 2.9.0.20240316
types-pytz 2024.1.0.20240417
typing_extensions 4.12.1
tzdata 2024.1
uri-template 1.3.0
urllib3 2.2.1
virtualenv 20.26.3
webcolors 24.6.0
websocket-client 1.8.0
wheel 0.41.0
yellowbrick 1.5
zipp 3.19.2
```

The operating system you're using: MacOS 14.5

The output of python --version: Python 3.10.12

Additional context

Not applicable

SiddhantSadangi commented 4 months ago

Hey @pc-pallon 👋 Thanks for bringing this to our attention.

Looks like an easy fix. I'll let you know once the fix is released ✅

SiddhantSadangi commented 4 months ago

Hey @pc-pallon 👋

This should be fixed in neptune-sklearn 2.1.4 🚀

Can you please check and let me know?
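One way to run such a check is to snapshot `coef_` before the call and compare it afterwards. The helper below is a small sketch under that idea; `coef_is_unchanged` is a hypothetical name introduced here, and it only assumes that the object passed in exposes a NumPy array as `coef_`.

```python
import copy

import numpy as np


def coef_is_unchanged(estimator, func):
    """Return True if calling func(estimator) leaves estimator.coef_ intact."""
    # Deep-copy the coefficients first so the snapshot cannot alias the original.
    before = copy.deepcopy(estimator.coef_)
    func(estimator)
    return np.array_equal(before, estimator.coef_)
```

With the reproduction code from the report, the call would look like `coef_is_unchanged(reg, lambda est: npt_utils.create_feature_importance_chart(est, X, y))`, which should return True once the side effect is gone.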

pc-pallon commented 4 months ago

Hi @SiddhantSadangi, looks like it works. Thanks for the quick fix!