When we increased the CPU count from 1 to 2, this run did not produce the memory warning message.
Got the error again. So maybe once the server has started and executed a run, subsequent runs do not have enough memory left. Note that I always had to kill the previous processes for various other reasons. I suspect that some Chrome remnants are being left behind in memory.
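One way to check the leftover-Chrome suspicion is to list processes whose name contains "chrome" before starting the next run. The following is only a rough sketch: it assumes a Linux host with a /proc filesystem, and the helper names (parse_status, leftover_processes) and the "chrome" name match are illustrative guesses, not part of this project.

```python
import os


def parse_status(text):
    """Extract the process name and resident memory (kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    name = fields.get("Name", "")
    # Kernel threads have no VmRSS line, so fall back to 0 kB.
    rss_kb = int(fields.get("VmRSS", "0 kB").split()[0])
    return name, rss_kb


def leftover_processes(needle="chrome", proc_root="/proc"):
    """Return (pid, name, rss_kb) for processes whose name contains `needle`."""
    found = []
    if not os.path.isdir(proc_root):
        # Assumption violated (not Linux, or no /proc): report nothing.
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss_kb = parse_status(fh.read())
        except OSError:
            # The process exited, or we lack permission to read it.
            continue
        if needle in name.lower():
            found.append((int(entry), name, rss_kb))
    return found


if __name__ == "__main__":
    for pid, name, rss_kb in leftover_processes():
        print(f"pid={pid} name={name} rss={rss_kb} kB")
```

If this prints entries after a run has finished, those processes are candidates for the memory that the next run is missing.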
At the beginning of execution, TensorFlow throws some warning messages about system memory. The exact warning message is:
tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 222499584 exceeds 10% of free system memory.
Example Logs: https://app.wandb.ai/hakanonal/geodashml/runs/1547rh10/logs
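The warning quoted above also gives a rough bound on how much memory was free at that moment. Taking the message at face value (an allocation larger than 10% of free system memory triggers it), an allocation of 222499584 bytes (about 212 MiB) implies that free memory was below ten times that, roughly 2.07 GiB. A small sketch of that arithmetic follows; the function name and the 10% default are assumptions read off the message text, not TensorFlow's API.

```python
ALLOCATION_BYTES = 222_499_584  # value taken from the warning message


def exceeds_fraction(allocation_bytes, free_bytes, fraction=0.10):
    """True if the allocation is larger than `fraction` of the free memory."""
    return allocation_bytes > fraction * free_bytes


# If the warning fired, free memory must have been below allocation / 0.10.
implied_free_upper_bound = ALLOCATION_BYTES * 10

if __name__ == "__main__":
    print(f"allocation: {ALLOCATION_BYTES / 2**20:.1f} MiB")
    print(f"free memory was below: {implied_free_upper_bound / 2**30:.2f} GiB")
```

This would be consistent with memory being held by something else when a later run starts, although the warning alone does not say what is holding it.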