rodrigo-arenas / Sklearn-genetic-opt

ML hyperparameters tuning and features selection, using evolutionary algorithms.
https://sklearn-genetic-opt.readthedocs.io
MIT License

[FEATURE] Save an evolved_estimator? #83

Closed siran1996 closed 2 years ago

siran1996 commented 2 years ago

How can I save an evolved_estimator into a file and read it?

rodrigo-arenas commented 2 years ago

Hi, you can use any Python serialization method, for example joblib.dump. It should look something like this:

from joblib import dump

dump(evolved_estimator, 'evolved_estimator.pkl')
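
To read the file back, joblib.load is the counterpart of joblib.dump. The round trip below is a minimal sketch of the same save-then-load pattern. It uses the standard-library pickle module so that it runs without extra dependencies. The helper names save_object and load_object and the stand-in dictionary are illustrative assumptions, not part of sklearn-genetic-opt.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(obj, path):
    """Serialize obj to path (hypothetical helper)."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)


def load_object(path):
    """Read back an object written by save_object (hypothetical helper)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Stand-in for a fitted estimator; a real one would be passed the same way.
stand_in = {"support_": [True, False, True]}

with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "evolved_estimator.pkl"
    save_object(stand_in, target)
    restored = load_object(target)
```

With joblib the equivalent calls would be dump(evolved_estimator, 'evolved_estimator.pkl') followed by load('evolved_estimator.pkl'). Only load pickle files from sources you trust.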

imxtx commented 10 months ago

It's not working:

Traceback (most recent call last):
  File "/home/txxie/code/AnxietyPred/feature_selection.py", line 65, in <module>
    main()
  File "/home/txxie/code/AnxietyPred/feature_selection.py", line 50, in main
    dump(evolved_estimator, "evolved_estimator.test.pkl")
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/site-packages/joblib/numpy_pickle.py", line 553, in dump
    NumpyPickler(f, protocol=protocol).dump(value)
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/pickle.py", line 487, in dump
    self.save(obj)
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/site-packages/joblib/numpy_pickle.py", line 355, in save
    return Pickler.save(self, obj)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/pickle.py", line 717, in save_reduce
    save(state)
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/site-packages/joblib/numpy_pickle.py", line 355, in save
    return Pickler.save(self, obj)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
    ^^^^^^^^^^^^
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/pickle.py", line 972, in save_dict
    self._batch_setitems(obj.items())
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/pickle.py", line 998, in _batch_setitems
    save(v)
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/site-packages/joblib/numpy_pickle.py", line 355, in save
    return Pickler.save(self, obj)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/txxie/software/anaconda3/envs/anxiety/lib/python3.11/pickle.py", line 578, in save
    rv = reduce(self.proto)
         ^^^^^^^^^^^^^^^^^^
TypeError: cannot pickle 'module' object
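
The last frames of the traceback show pickle walking the object's state dictionary (save_reduce, then save_dict) and failing on one of its values, which means some attribute of the object holds a Python module. One rough way to find out which one is to try pickling each attribute separately. The helper below is a sketch with a hypothetical name, find_unpicklable_attributes, and it assumes the object exposes its state through __dict__:

```python
import pickle
import types


def find_unpicklable_attributes(obj):
    """Return {attribute_name: error_message} for attributes pickle rejects."""
    problems = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle can raise several exception types
            problems[name] = f"{type(exc).__name__}: {exc}"
    return problems


# Demo object: one ordinary attribute and one that holds a module.
demo = types.SimpleNamespace(population_size=100, helper_module=pickle)
report = find_unpicklable_attributes(demo)
for attribute, message in report.items():
    print(attribute, "->", message)
```

Calling find_unpicklable_attributes(evolved_estimator) right before dump should list the offending attribute names, which would narrow down whether the module reference comes from the estimator itself or from something attached to it.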

Here is my code, modified from the feature selection sample:

    X, y = load_data("Data", frame_wise=True)  # (n_videos*n_frames, 1)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.1, random_state=0
    )
    X_train = X_train[:1000, :]
    y_train = y_train[:1000]
    X_test = X_test[:100, :]
    y_test = y_test[:100]

    print(f"[X train] shape: {X_train.shape}, [y train] shape: {y_train.shape}")
    print(f"[X test] shape: {X_test.shape}, [y test] shape: {y_test.shape}")

    # pipline = Pipeline(
    #     (("scaler", StandardScaler()), ("linear_svc", LinearSVC(loss="hinge")))
    # )
    # pipline = Pipeline((("scaler", StandardScaler()), ("svc", SVC(gamma="auto"))))
    model = SVC(gamma="auto")

    mutation_scheduler = ExponentialAdapter(0.8, 0.2, 0.01)
    crossover_scheduler = ExponentialAdapter(0.2, 0.8, 0.01)

    evolved_estimator = GAFeatureSelectionCV(
        estimator=model,
        scoring="accuracy",
        population_size=100,
        generations=200,
        mutation_probability=mutation_scheduler,
        crossover_probability=crossover_scheduler,
        n_jobs=-1,
    )

    # Train and select the features
    callbacks = [TensorBoard(log_dir="./logs")]
    # evolved_estimator.fit(X_train, y_train, callbacks=callbacks)

    dump(evolved_estimator, "evolved_estimator.test.pkl")

    # Features selected by the algorithm
    features = evolved_estimator.support_
    print(features)

    # Predict only with the subset of selected features
    y_predict_ga = evolved_estimator.predict(X_test)
    print(accuracy_score(y_test, y_predict_ga))
    print(confusion_matrix(y_test, y_predict_ga))
    # Transform the original data to the selected features
    X_reduced = evolved_estimator.transform(X_test)
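
If the full search object cannot be pickled, one possible workaround (an untested assumption, not a confirmed fix from the maintainers) is to persist only the feature mask from support_ and reapply it later. The sketch below assumes support_ is a boolean mask over the columns and uses JSON with plain lists. The names save_mask, load_mask and select_columns are hypothetical helpers:

```python
import json
import tempfile
from pathlib import Path


def save_mask(support, path):
    """Write a boolean feature mask (e.g. evolved_estimator.support_) as JSON."""
    mask = [bool(flag) for flag in support]  # also converts NumPy booleans
    Path(path).write_text(json.dumps(mask))


def load_mask(path):
    """Read the mask back as a list of Python booleans."""
    return json.loads(Path(path).read_text())


def select_columns(rows, mask):
    """Keep only the columns whose mask entry is True."""
    return [[value for value, keep in zip(row, mask) if keep] for row in rows]


with tempfile.TemporaryDirectory() as tmp_dir:
    mask_file = Path(tmp_dir) / "support_mask.json"
    save_mask([True, False, True], mask_file)
    mask = load_mask(mask_file)

# Apply the restored mask to two example rows with three features each.
reduced = select_columns([[1, 2, 3], [4, 5, 6]], mask)
```

This keeps the selected features but not the fitted model, so the final estimator would need to be saved separately or refit on the reduced data.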