Hi folks, I'm training lots of models and trying to optimize certain hyperparameters. After running the code several times, I noticed that I get different results (RMSE accuracy) when using the Adam or RMSprop optimizer. The RMSE usually lies between 0 and 1. The model is a deep autoencoder that fills in missing values, which are represented by zeros. I've mapped 20% of my data to zero, so the model's job is to reconstruct that 20%.
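For concreteness, here is a minimal NumPy sketch of that corruption scheme (the array shapes, seed, and the `masked_rmse` helper are my own illustration, not the actual training code):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.uniform(0.1, 1.0, size=(100, 20))  # toy data, kept nonzero so 0 can mean "missing"

mask = rng.random(X.shape) < 0.2           # hide roughly 20% of the entries
X_corrupted = np.where(mask, 0.0, X)       # zeros mark the missing values

def masked_rmse(X_true, X_pred, mask):
    # RMSE evaluated only on the hidden entries the autoencoder must reconstruct
    diff = (X_true - X_pred)[mask]
    return np.sqrt(np.mean(diff ** 2))
```

The autoencoder is trained on `X_corrupted` and scored with `masked_rmse` against the original `X`.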
System information
Windows 10 Microsoft Windows [Version 10.0.18362.418]
TensorFlow backend (yes / no): yes
TensorFlow version: 2.1
Keras version:
-- keras-applications 1.0.8 py_0
-- keras-preprocessing 1.1.0 py_1
Python version: 3.7.6
CUDA/cuDNN version: Cuda compilation tools, release 10.0, V10.0.130
GPU model and memory: NVIDIA 2060 Super 8GB
Describe the current behavior
When I train a model with either the Adam or RMSprop optimizer, I get different results with each run, even though the random seed is set immediately before creating the model. Other optimizers work flawlessly.
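For reference, the seeding setup looks roughly like this (a sketch of the usual TF 2.1 calls; note that in TF 2.1 many GPU kernels remain nondeterministic even with all seeds set, and op-level determinism has to be requested separately via an environment variable):

```python
import os
import random

import numpy as np
import tensorflow as tf

# Seed every RNG the training run touches, before building the model.
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)

# In TF 2.1, GPU reductions and several other kernels are nondeterministic
# regardless of seeding; this opt-in flag requests deterministic op
# implementations where they exist. It must be set before any ops run.
os.environ['TF_DETERMINISTIC_OPS'] = '1'
```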
What I've tested so far:
Cast the input data to float64
Set tf.keras.mixed_precision.experimental.Policy('float64')
Increased tf.keras.backend.set_epsilon() up to 1e-3
I've also set the optimizer's epsilon parameter to 1, 10, or even 50, and this often seemed to resolve the issue, but I don't understand why. The parameter helps avoid division by (near) zero. Does this mean my gradients are really close to zero because most of the data is zero?
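That intuition can be checked with a tiny NumPy walk-through of a single Adam update (the standard update rule with toy numbers of my own; not the poster's model). When the gradient is near zero, the running second moment `v` is also near zero, so a small epsilon lets the `m_hat / sqrt(v_hat)` ratio blow tiny, noisy gradients up to near-learning-rate-sized steps, while a large epsilon keeps the step proportional to the gradient:

```python
import numpy as np

def adam_step(g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    # One Adam update for a single scalar parameter, with bias correction.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    step = lr * m_hat / (np.sqrt(v_hat) + eps)
    return step, m, v

g = 1e-8  # a near-zero gradient, as when most inputs are zero

# With the default-sized epsilon, sqrt(v_hat) ~ |g|, so the step is
# close to lr in magnitude: the tiny gradient's noise is amplified.
small_eps_step, _, _ = adam_step(g, 0.0, 0.0, t=1, eps=1e-7)

# With a large epsilon, the denominator is dominated by eps, so the
# step stays proportional to the tiny gradient and the noise is damped.
large_eps_step, _, _ = adam_step(g, 0.0, 0.0, t=1, eps=1.0)
```

With these numbers the small-epsilon step comes out millions of times larger than the large-epsilon step, which would explain why raising epsilon stabilizes runs whose gradients are dominated by zeros.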