apple / tensorflow_macos

TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.

Save/Load Model have different model evaluation #234

Open Arfius opened 3 years ago

Arfius commented 3 years ago

Hello. I get an anomaly when saving and loading models. I trained and evaluated a model on the fashion_mnist dataset, all in the same Jupyter notebook. I ran it on two machines: an M1 and an Intel i5. The notebook follows these steps:

    • load dataset
    • train the model
    • evaluate the model
    • save the model in h5 file
    • load the model from the file above
    • evaluate the model

On the Intel i5, steps 3 and 6 give the same result; on the M1 they differ. The attached PDFs show the results step by step.
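For readers without the PDFs, the six steps can be sketched roughly as follows. This is a minimal sketch, not the notebook from the issue: the layer sizes, optimizer, epoch count, file name `model_roundtrip.h5`, and the helper names `metrics_match` and `roundtrip_check` are all assumptions made for illustration.

```python
import math


def metrics_match(before, after, rel_tol=1e-5, abs_tol=1e-6):
    """Return True if two lists of evaluation metrics agree within tolerance."""
    return len(before) == len(after) and all(
        math.isclose(b, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for b, a in zip(before, after)
    )


def roundtrip_check(path="model_roundtrip.h5", epochs=1):
    """Train, evaluate, save to HDF5, reload, and evaluate again."""
    # Imported lazily so metrics_match stays usable without TensorFlow.
    import tensorflow as tf

    # Step 1: load the dataset and scale pixel values to [0, 1].
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    # Step 2: train a small classifier (hypothetical architecture).
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    model.fit(x_train, y_train, epochs=epochs, verbose=0)

    before = model.evaluate(x_test, y_test, verbose=0)    # step 3
    model.save(path)                                       # step 4
    restored = tf.keras.models.load_model(path)            # step 5
    after = restored.evaluate(x_test, y_test, verbose=0)   # step 6
    return before, after, metrics_match(before, after)
```

Calling `roundtrip_check()` on each machine returns the two `[loss, accuracy]` lists plus a flag saying whether they agree, which makes the Intel and M1 runs easier to compare than reading the numbers by eye.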

Thanks for your support

Training - Intel CORE i5 - 7th - Jupyter Notebook.pdf
Training MacBookPro M1 - Jupyter Notebook.pdf

ghost commented 3 years ago

I have a similar issue. I work on reinforcement learning with TensorFlow and Keras on my MacBook M1. I train a NN on a game and it solves it properly; I even run a stability check, verifying 10 times that the NN gives the same result, and it is good every time. I save it using save_model. When I load it back using load_model, sometimes it gives the good result, sometimes not. Very strange. I think there is an issue with the load/save functions in the M1 version of TensorFlow.

ghost commented 3 years ago

I spent a lot of time testing different approaches. It seems that using the GPU is causing the issue. I put the following code at the beginning of my notebook and it seems to work fine now, even much faster than with the GPU:

    from tensorflow.python.framework.ops import disable_eager_execution
    disable_eager_execution()
    from tensorflow.python.compiler.mlcompute import mlcompute
    mlcompute.set_mlc_device(device_name='cpu')

I know that there was a buggy version of TensorFlow with load/save when using the GPU in the past; maybe that is the one Apple forked to create this mlcompute library.
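Since the original notebook runs on both an M1 and an Intel machine, the device selection from this workaround could be guarded so the same cell works on either. A minimal sketch, assuming the `mlcompute` module is present only in the Apple fork; the function name `force_mlcompute_cpu` is hypothetical, and the `disable_eager_execution()` call from the comment is left out here and would have to be added separately if needed.

```python
def force_mlcompute_cpu():
    """Select the CPU for ML Compute if this TensorFlow build provides it.

    Returns True when the device was set, and False when the mlcompute
    module cannot be imported (for example with stock TensorFlow).
    """
    try:
        from tensorflow.python.compiler.mlcompute import mlcompute
    except ImportError:
        # Not the Apple fork (or TensorFlow is not installed): nothing to do.
        return False
    mlcompute.set_mlc_device(device_name="cpu")
    return True
```

Calling it at the top of the notebook, before any model is built, keeps the placement the comment describes.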

Arfius commented 3 years ago

Thanks, I will try it for sure.

ghost commented 3 years ago

Also, when saving your model, please try:

    open(modelfile+'.json', 'w').write(model.to_json())
    model.save_weights(modelfile+'.h5', overwrite=True)

and when loading:

    model = model_from_json(open(modelfile+'.json').read())
    model.load_weights(modelfile+'.h5')
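The same idea, saving the architecture as JSON and the weights as a separate HDF5 file, can be wrapped in two small helpers. This is a sketch only: the names `save_split` and `load_split` are hypothetical, and the import path `tensorflow.keras.models` is an assumption, since the comment does not say where `model_from_json` comes from.

```python
def save_split(model, modelfile):
    """Write the architecture to <modelfile>.json and the weights to <modelfile>.h5."""
    with open(modelfile + ".json", "w") as f:
        f.write(model.to_json())
    model.save_weights(modelfile + ".h5", overwrite=True)


def load_split(modelfile):
    """Rebuild the model from its JSON architecture, then load the weights."""
    # Imported lazily; assumes the Keras bundled with TensorFlow.
    from tensorflow.keras.models import model_from_json

    with open(modelfile + ".json") as f:
        model = model_from_json(f.read())
    model.load_weights(modelfile + ".h5")
    return model
```

A model rebuilt this way is not compiled, so `model.compile(...)` has to be called again with the original loss and metrics before `evaluate` can be used for a comparison.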

ghost commented 3 years ago

I had been going around in circles for weeks before I found that :) The fact that the CPU runs faster than the GPU was both a good and a bad surprise (good because it speeds up my training cycles a lot, bad because it is not supposed to work this way :) ).

Arfius commented 3 years ago

It doesn't work on my side. Something must be messed up on my machine for sure.