I am encountering an issue with training a Keras neural network (TensorFlow backend).
Training the same Keras model on the same input data on two different machines (one AWS instance and one local laptop) produces two entirely different models. The models are used for regression. After training on the failing machine (the AWS instance), the model predicts completely flat outputs, even though the training data contains no such flat targets.
I was able to reproduce the issue on AWS t2.medium and c5.large instances. Training ran correctly on c5.xlarge.
Python version: Python 2.7.15rc1
The specs of the two machines are:
Machine A specs (model works fine):
Model: Lenovo ThinkPad E570
CPU: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
OS: Ubuntu 18.04.1 LTS
RAM: 16 GB
Machine B specs (model trains incorrectly): AWS t2.medium and c5.large (using the Amazon Linux 2 AMI)
Please note that the same code was also tried on an AWS c5.xlarge instance and ran correctly there.
Code that reproduces the problem: https://github.com/daroodar/minimum_reproducible_example_keras_issue. This code works fine on the local machine and produces the problem on the AWS instances.
The library versions are the same on both machines and are attached as libraries_versions.txt.
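Since the library versions match, it may help to compare the rest of the environment as well. Below is a minimal sketch, using only the Python standard library, that prints a few machine and interpreter facts (including the CPU count) so the output from each machine can be diffed. The function name `environment_fingerprint` and the chosen fields are my own illustration, not part of the linked repository.

```python
import json
import multiprocessing
import platform
import sys


def environment_fingerprint():
    """Collect basic facts about the machine and the Python interpreter."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        # Number of logical CPUs visible to Python on this machine.
        "cpu_count": multiprocessing.cpu_count(),
        "byteorder": sys.byteorder,
    }


if __name__ == "__main__":
    # Sorted keys keep the output stable, so two machines can be diffed line by line.
    print(json.dumps(environment_fingerprint(), indent=2, sort_keys=True))
```

Running this on each machine and diffing the two outputs would show whether anything other than the library versions differs.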
The log from Machine A (where the model trains correctly) is attached as correct_model.log.
The log from Machine B (where the model trains incorrectly) is attached as incorrect_model.log.
The input data is the same for both runs and is included in the GitHub repository.
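To confirm that the input file really is byte-identical on both machines, its checksum can be compared. This is a minimal sketch using `hashlib` from the standard library. The helper name `sha256_of_file` and the path in the usage comment are hypothetical placeholders, not names from the repository.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical usage: replace the path with the actual data file from the repository.
# print(sha256_of_file("path/to/input_data.csv"))
```

If the digest printed on the laptop matches the one printed on the AWS instance, the data file itself can be ruled out as the source of the difference.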
correct_model.log incorrect_model.log libraries_versions.txt