Professor-G / MicroLIA

Gravitational microlensing classification engine using machine learning
GNU General Public License v3.0

Issue with ensemble_model #17

Closed: ebachelet closed this issue 1 year ago

ebachelet commented 1 year ago

I tried to follow the readthedocs example, i.e.

    from MicroLIA import ensemble_models

    model = models.Classifier(data_x, data_y)
    model.create()

Note first the typo in ensemble_models. More problematic is the actual import error:

    from MicroLIA import ensemble_model
    *** ImportError: load_boston has been removed from scikit-learn since version 1.2.

The Boston housing prices dataset has an ethical problem: as investigated in [1], the authors of this dataset engineered a non-invertible variable "B" assuming that racial self-segregation had a positive impact on house prices [2]. Furthermore the goal of the research that led to the creation of this dataset was to study the impact of air quality but it did not give adequate demonstration of the validity of this assumption.

So this comes from the latest version of scikit-learn (>=1.2). I believe the problem comes from loading MicroLIA.optimization, but I am not sure.

This is quite severe, as I cannot use the code at all.
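The report above is unsure which module pulls in load_boston. One way to check is to scan a module's source for the offending import. The following is a minimal sketch using only the standard library; the helper name find_imports_of and the sample source string are hypothetical and are not taken from MicroLIA.

```python
import ast


def find_imports_of(source: str, name: str = "load_boston"):
    """Return (line number, module) for each `from module import name` in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only `from X import Y` statements can import a single function by name.
        if isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name == name:
                    hits.append((node.lineno, node.module))
    return hits


# Hypothetical module source, used only to illustrate the scan.
sample = (
    "import numpy as np\n"
    "from sklearn.datasets import load_boston\n"
)
print(find_imports_of(sample))
```

To apply this to an installed package, one could read each .py file under the package directory and pass its text to the helper. Which file would be reported for MicroLIA is not stated in this thread.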

rachel3834 commented 1 year ago

This function should just load a test dataset as an example. I believe that the up-to-date scikit-learn package has other example datasets available, so it should be straightforward to fix by replacing this load call with one of the others.

(Sent by email in reply to ebachelet's report of Thu, Jun 22, 2023, 4:06 PM.)
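The suggestion of swapping the removed loader for another bundled dataset could look roughly like the sketch below. It assumes scikit-learn is installed and picks load_diabetes purely as an illustration, because it ships with the library and needs no download. The thread does not say which replacement MicroLIA actually adopted.

```python
# Assumption: load_diabetes is used here only as a stand-in example dataset;
# the thread does not state which loader replaced load_boston in MicroLIA.
from sklearn.datasets import load_diabetes

# return_X_y=True returns the feature matrix and the target vector directly.
data_x, data_y = load_diabetes(return_X_y=True)

print(data_x.shape, data_y.shape)
```

Like the Boston data, this is a regression dataset, so it would only serve as a smoke test for code that needs some example arrays, not as microlensing training data.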

Professor-G commented 1 year ago

Yes, I noticed this some time ago and updated the imports! I am now designing unit tests to validate all functionality.

Professor-G commented 1 year ago

Source code is updated.