ydoherty / CoastSat.PlanetScope

Batch shoreline extraction toolkit for PlanetScope Dove satellite imagery
GNU General Public License v3.0

Do you have the unpickled version of the model available? #17

Open 2320sharon opened 4 months ago

2320sharon commented 4 months ago

Hello,

First of all, thank you for all your great work on CoastSat.PlanetScope. It's a great tool, and I appreciate that your code has comments explaining the workflow. I was wondering if you happen to have the unpickled version of the classifier you use on the Planet imagery? The current pickle file will only work in the provided environment, and I was hoping to port the model over to work in Python 3.10 and above.

Again thank you for all your hard work.

Sharon

ydoherty commented 4 months ago

Hi Sharon,

I was chatting with @kvos yesterday (author of CoastSat) and this was also an issue for him. It's likely to do with the scikit-learn pkl process. Trained models pickled with older versions of scikit-learn can't be loaded in versions newer than 0.20. His workaround was to load the pkl file using an old scikit-learn version, save it to an intermediate file, then import that file and save the pkl file with a new version of scikit-learn.

Unfortunately I don't have an unpickled version of the classifier handy, but if you have a working environment I would try importing and re-exporting the classifier. Alternatively, the process of training a new classifier is fairly straightforward, and it is the best way to get good results at sites other than Narrabeen (Aus)/Duck (USA), which is where I trained the default classifier.
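
The "intermediate file" step of that workaround could be sketched roughly as follows. This is a minimal sketch, not the code @kvos used: it assumes the intermediate file is plain JSON and that the values worth carrying across are the public attributes whose names end in an underscore, which is how scikit-learn names attributes learned during fitting. The helper names `to_plain` and `export_fitted_attributes` are hypothetical.

```python
import json


def to_plain(value):
    """Convert arrays and containers into JSON-friendly Python objects."""
    if hasattr(value, "tolist"):          # NumPy arrays and NumPy scalars
        return value.tolist()
    if isinstance(value, (list, tuple)):  # e.g. a list of weight matrices
        return [to_plain(item) for item in value]
    if isinstance(value, dict):
        return {str(key): to_plain(item) for key, item in value.items()}
    return value


def export_fitted_attributes(model, path):
    """Write every public attribute ending in "_" to a JSON file.

    Returns the sorted list of attribute names that were written.
    """
    exported = {}
    for name, value in vars(model).items():
        # Assumption: learned values follow the trailing-underscore naming
        # convention and private attributes (leading "_") are skipped.
        if name.endswith("_") and not name.startswith("_"):
            exported[name] = to_plain(value)
    with open(path, "w") as handle:
        json.dump(exported, handle)
    return sorted(exported)
```

Run in the old environment on the loaded classifier, this would leave a file that does not depend on any pickle format. The constructor settings are not covered by the sketch; they can be read separately with the estimator's `get_params()` method.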

Hopefully that helps.

Regards, Yarran

2320sharon commented 4 months ago

Hi @ydoherty

Thank you very much for getting back to me. It's unfortunate, but understandable, that you no longer have the unpickled version of the model around. Thank you for suggesting a potential solution. I attempted to load, then save, then re-load the model as you mentioned, but I'm a bit stuck. I was able to recreate the original environment and load the pickled model in that environment just fine. I only got stuck after I used a new version of joblib to save the model: when I attempted to load the new model using the script below, it didn't work.

By any chance do you or @kvos have the code you used to convert the models to the new format? I would greatly appreciate it.

Best,

Sharon

## Code to Load & Save the model

    import os

    # `jb` (joblib from the original environment) and `new_jb` (latest joblib)
    # are imported elsewhere; the post does not show those imports.
    settings = {
        'classifier': 'NN_4classes_PS_NARRA.pkl',
    }

    class_path = os.path.join(os.getcwd(), 'coastsat_ps', 'classifier', 'models', settings['classifier'])
    model = jb.load(class_path)
    print(model)

    # save the model using the latest joblib (modern pickle)
    new_class_path = os.path.join(r"C:\development\coastseg-planet\CoastSeg-Planet", 'NN_4classes_PS_NARRA_new.pkl')
    new_jb.dump(model, new_class_path)


## Code to Load the New Pickled Model
- This code currently fails with the error message below. The same kind of error message with a different binary number occurs if you try to load the original pickled model as well
- Error message: 

    Module: sklearn.neural_network.multilayer_perceptron, Name: MLPClassifier
    Module: sklearn.preprocessing.label, Name: LabelBinarizer
    Module: sklearn.externals.joblib.numpy_pickle, Name: NumpyArrayWrapper
    Module: numpy, Name: ndarray
    Module: numpy, Name: dtype
    Error occurred during unpickling: invalid load key, '\x00'.

### Code

    import joblib
    import pickle
    import os

    # Define a custom Unpickler to handle module renaming
    class CustomUnpickler(pickle.Unpickler):
        def find_class(self, module, name):
            print(f"Module: {module}, Name: {name}")
            # Map old module names to new module names
            module_replacements = {
                'sklearn.neural_network.multilayer_perceptron': 'sklearn.neural_network',
                'sklearn.preprocessing.label': 'sklearn.preprocessing',
                'sklearn.externals.joblib': 'joblib',
                'sklearn.externals.joblib.numpy_pickle': 'joblib.numpy_pickle',  # Handle the specific numpy_pickle case
                'numpy': 'numpy'  # Add numpy mappings
            }
            if module in module_replacements:
                module = module_replacements[module]
            return super().find_class(module, name)

    # Load the model using the custom Unpickler
    class_path = r"C:\development\coastseg-planet\CoastSeg-Planet\NN_4classes_PS_NARRA.pkl"
    try:
        with open(class_path, 'rb') as file:
            loaded_model = CustomUnpickler(file).load()
        print(f"loaded_model: {loaded_model}")
    except (pickle.UnpicklingError, EOFError) as e:
        print(f"Error occurred during unpickling: {e}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
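
A possible reading of the error, offered as an assumption rather than a diagnosis: files written by joblib can hold raw array bytes alongside the pickle records, and the standard `pickle.Unpickler` has no way to read them, which would be consistent with the failure appearing right after `NumpyArrayWrapper`, `ndarray` and `dtype` are resolved. If that is the cause, the renaming would need to happen while `joblib.load` does the reading. One way to sketch that is to alias the old module names in `sys.modules` instead of subclassing the unpickler. The helper `alias_modules` and the module path `collections.legacy` are hypothetical, and the demonstration uses a hand-written standard-library pickle as a stand-in for the classifier file.

```python
import importlib
import pickle
import sys


def alias_modules(aliases):
    """Make old dotted module names resolve to modules that exist today."""
    for old_name, new_name in aliases.items():
        sys.modules[old_name] = importlib.import_module(new_name)


# Stand-in demonstration: this hand-written pickle refers to a module path
# that does not exist ("collections.legacy"), playing the role of a renamed
# sklearn module. It builds an empty OrderedDict when loaded.
alias_modules({"collections.legacy": "collections"})
legacy_bytes = b"ccollections.legacy\nOrderedDict\n)R."
restored = pickle.loads(legacy_bytes)
print(type(restored).__name__)

# For the classifier in this thread the same idea might look like this
# (untested assumption, reusing the mapping from the script above):
#   import joblib
#   alias_modules({
#       "sklearn.neural_network.multilayer_perceptron": "sklearn.neural_network",
#       "sklearn.preprocessing.label": "sklearn.preprocessing",
#       "sklearn.externals.joblib": "joblib",
#       "sklearn.externals.joblib.numpy_pickle": "joblib.numpy_pickle",
#   })
#   model = joblib.load(class_path)
```

Even if every name resolves this way, the restored object would still carry the attribute layout of the old scikit-learn version, so it may fail later, for example when predicting.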

kvos commented 4 months ago

@2320sharon I remember that when that joblib issue arose a few years ago, I asked Chris Leaman @chrisleaman (he was the king of Python envs and conda) and he made me a new set of pickle files. I don't know how he did it, though. You may want to ask him directly: https://github.com/chrisleaman.

chrisleaman commented 4 months ago

@2320sharon, I think the problem (it's been a while since I did this) is that the sklearn models change between versions. So you need to load the sklearn model (which you've already done), then translate the model parameters to a new model that uses the new sklearn version. If you're using an IDE such as PyCharm or Spyder, you should be able to see the model parameters using autocomplete (i.e. typing model.<press tab>). Then figure out where the relevant parameters and data are stored, and copy or move them to the new sklearn version of the model. This is needed because the new sklearn version will have slightly changed, added, or renamed the parameters. Good luck!
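
As an alternative to browsing with autocomplete, the inspection and copy steps could be sketched in code. This is only an illustration of the idea, not the procedure used for the CoastSat models: it assumes the learned values are the public attributes ending in an underscore, and the helper names `fitted_attribute_names` and `restore_fitted_attributes` are hypothetical.

```python
def fitted_attribute_names(model):
    """List attributes that follow the trailing-underscore naming."""
    return sorted(
        name for name in vars(model)
        if name.endswith("_") and not name.startswith("_")
    )


def restore_fitted_attributes(model, params, converters=None):
    """Set each exported value on a fresh model and return the names set.

    `params` maps attribute names to values carried over from the old
    environment. `converters` optionally maps an attribute name to a
    callable applied to its value first, for example one that turns
    nested lists back into arrays.
    """
    converters = converters or {}
    for name, value in params.items():
        convert = converters.get(name)
        setattr(model, name, convert(value) if convert else value)
    return sorted(params)
```

For an `MLPClassifier`, the weights live in `coefs_` and `intercepts_`. A new-version model probably also needs its label binarizer rebuilt from `classes_` before it can predict; that part is an assumption to verify by comparing predictions from the old and new environments on the same pixels.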

2320sharon commented 4 months ago

Thank you @kvos and @chrisleaman for the help, I appreciate it.

I'll try translating the model parameters from the old version of the model to the new version, then try copying the model data as well. Since you were able to transfer the CoastSat models to the new sklearn models without having to retrain them it must be possible to transfer all the model weights as well. I'll try using Spyder to transfer the model data instead of VSCode to see if I get some better results.

Thanks again for the help!