pytorch / xla

Enabling PyTorch on XLA Devices (e.g. Google TPU)
https://pytorch.org/xla

nightly version/ kaggle tpu #5039

Open dina-fahim103 opened 1 year ago

dina-fahim103 commented 1 year ago

❓ Questions and Help

Hi, I installed the PyTorch/XLA nightly build in a Kaggle TPU notebook. It was working fine, but for about a week it has kept failing with this error: [FileNotFoundError: [Errno 2] No such file or directory: 'gsutil']


JackCaoG commented 1 year ago

It seems Kaggle somehow got rid of the default gsutil installation. We use gsutil in this script: https://github.com/pytorch/xla/blob/93ff6aa4e35866e70b764bd3385801ec80743075/contrib/scripts/env-setup.py#L131. That said, @will-cromar, do you know whether all of Kaggle has moved to TPU VM? The above script should only work in the TPU Node context.
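
For anyone who wants to confirm this diagnosis in their own notebook, here is a minimal sketch that checks whether gsutil is on the PATH before running the setup script. The helper name `find_tool` is made up for illustration and is not part of env-setup.py; it only uses the standard library's `shutil.which`.

```python
import shutil


def find_tool(name):
    """Return the absolute path of an executable on PATH, or None if missing.

    Hypothetical helper for illustration; not part of env-setup.py.
    """
    return shutil.which(name)


gsutil_path = find_tool("gsutil")
if gsutil_path is None:
    # A script that shells out to gsutil would fail with
    # FileNotFoundError: [Errno 2] No such file or directory: 'gsutil'
    print("gsutil is not on PATH")
else:
    print(f"gsutil found at {gsutil_path}")
```

If this prints that gsutil is missing, the FileNotFoundError above is the expected result of calling it from a subprocess.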

will-cromar commented 1 year ago

I recommend using TPU VM with Kaggle instead of TPU Node. We have some examples here: https://github.com/pytorch/xla/tree/master/contrib/kaggle

https://www.kaggle.com/product-feedback/369338

pantheraleo-7 commented 1 month ago

We have some examples here: https://github.com/pytorch/xla/tree/master/contrib/kaggle

Hey, is this still up to date?

Because if we follow the code in this example notebook exactly, we get an error and degraded performance.

Specifically: os.environ.pop('CLOUD_TPU_TASK_ID') raises a KeyError because the environment variable CLOUD_TPU_TASK_ID is now unset. I also see noticeable performance degradation when using the notebook's TPU-specific code compared to just using vanilla torch with xm.xla_device() and xm.mark_step(), but that might be specific to my setup.
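
A possible workaround for the KeyError, as a minimal sketch: pass a default to `pop` so the call becomes a no-op when the variable is unset. This assumes the notebook only needs the variable to be absent and does not use its previous value; whether removing it is still necessary on current Kaggle TPU VMs is not something this thread confirms.

```python
import os

# With a default, pop() returns None instead of raising KeyError
# when CLOUD_TPU_TASK_ID is not set.
previous = os.environ.pop("CLOUD_TPU_TASK_ID", None)

if previous is None:
    print("CLOUD_TPU_TASK_ID was already unset")
else:
    print(f"Removed CLOUD_TPU_TASK_ID (was {previous!r})")
```

Either way, the variable is guaranteed to be absent afterwards, which matches what the original `pop` call was trying to do.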