-
### What happened?
We are trying to upgrade Kuma from 2.5.2 to 2.8.2. We see an issue with external service connectivity post-upgrade. We have 500+ external services in our current setup.
Kuma con…
-
Trainer fails when you define a **custom_metric**, even for small models and small batch sizes. I tried different metrics as well, but it always fails. If you remove the custom_metric everything works…
-
[Lab 1, Step 2](https://hashicorp.github.io/service-mesh-training/exercises/lab-01/02-install-consul/#step-2-install-consul) fails within instruqt:
```shell
root@kubernetes:~/service-mesh-training…
-
All components have to follow a certain template
```
#Release#
- component_name|release-branch|release-notes
- component_name|release-branch|release-notes
```
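A block in this template could be checked mechanically before it is submitted. The following is only a minimal sketch: the function name `parse_release_block` is hypothetical, and the rule that every field must be non-empty and must not contain `|` is an assumption, not something the template states.

```python
import re

# Hypothetical validator for the "#Release#" template shown above.
# Assumption: each entry has exactly three non-empty, "|"-separated fields.
ENTRY = re.compile(r"^- ([^|\s][^|]*)\|([^|]+)\|([^|]+)$")


def parse_release_block(text):
    """Return a list of (component, branch, notes) tuples, or raise ValueError."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or lines[0] != "#Release#":
        raise ValueError("block must start with '#Release#'")
    entries = []
    for ln in lines[1:]:
        match = ENTRY.match(ln)
        if match is None:
            raise ValueError(f"malformed entry: {ln!r}")
        entries.append(tuple(part.strip() for part in match.groups()))
    return entries
```

Calling `parse_release_block` on a block with two well-formed entries returns two tuples; an entry with a missing field raises `ValueError`.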
please use the following component …
-
## ❓ Questions and Help
Hi, this might be a basic question, but how do I increase the timeout of `xm.rendezvous()`? I'm training a large model and, due to the system we're training on, saving can take…
-
# 🚀 Feature & Motivation
PyTorch/XLA recently launched PyTorch/XLA SPMD ([RFC](https://github.com/pytorch/xla/issues/3871), [blog](https://pytorch.org/blog/pytorch-xla-spmd/), [docs/spmd.md](https:…
-
**Describe the bug**
I am trying to fine-tune mT5 on a custom dataset on a TPU on GCP. I am carefully following the process described in this repository; however, I have a tensorflow-relate…
-
1. result:
![image](https://github.com/Elsaam2y/DINet_optimized/assets/48466610/4b326bf3-020e-467f-903d-8f3b04be6523)
3. src:
![image](https://github.com/Elsaam2y/DINet_optimized/assets/48466610/…
-
Hi, thanks for sharing your work with everybody:
I have an issue with the generation of the ".ply" and "refined_ply" directories at the end of training (on a Windows machine):
Command => python t…
-
Hi Meetshah, I have finished training, but during testing I got some errors. Please go through the errors:
(Gan_tf_Ajith) gpu@gpu:~/Desktop/Ajith Balakrishnan/3D Gan/tf-3dgan-master/src$ python 3dgan_…