Closed FanhaiLu1 closed 1 month ago
A recent xla2 change calls jax.devices() during state initialization, so the head process claims all the TPUs. As a result, every worker throws the error below:
RuntimeError: Unable to initialize backend 'tpu': ABORTED: The TPU is already in use by process with pid ..
I submitted https://github.com/pytorch/xla/pull/7769 to fix the xla2 initialization issue. This PR applies that xla2 fix and updates the README.
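
For context, here is a minimal sketch of the general pattern that avoids this kind of failure: defer the device query until the devices are actually needed, instead of calling `jax.devices()` at init time. This is not the actual change in the linked PR. `LazyDevices` and `_query_devices` are hypothetical names used only for illustration.

```python
def _query_devices():
    # Hypothetical helper. The import and the jax.devices() call are deferred
    # to first use, so merely constructing state does not initialize the TPU
    # backend in the calling process.
    import jax
    return jax.devices()


class LazyDevices:
    """Hypothetical holder that resolves devices on first access only."""

    def __init__(self, query=_query_devices):
        # Store the query function; do NOT call it here.
        self._query = query
        self._devices = None

    def get(self):
        # The first call triggers the backend query; later calls reuse the result.
        if self._devices is None:
            self._devices = self._query()
        return self._devices
```

With this shape, a head process can construct the object without claiming the TPU, and only a process that calls `get()` initializes the backend.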