-
Currently, we use region heartbeats for scheduling, with one heartbeat stream per region. gRPC starts a goroutine for every heartbeat stream. This works fine on small clusters; however, it may cause too mu…
-
### What's wrong?
As the title says, metrics scraped by Grafana Agent sometimes have gigantic values. This issue affects various metrics coming from various components, implemented mostly in Go and Erla…
-
Is the utils.py file right?
Your source code is as follows:
method = 'cluster_weighted_diounms'
#method = 'merge'
batched = 'batch' in method # run once per image, all classes simul…
-
This log was observed on a production cluster (on 23.2.6-rc):
```
E240604 18:00:01.675762 364555157 jobs/adopt.go:482 ⋮ [T1,Vsystem,n910] 41520 job 974695957267448729: adoption completed with err…
-
### What you would like to be added?
Currently, the TrainJob reconciler does UPSERT operations to create or update objects.
https://github.com/kubeflow/training-operator/blob/9e04bdd74920cbe12ecc7…
-
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
### Search before asking
- [X] I had searched in the [issues](https…
-
```
What steps will reproduce the problem?
cd "sofia-ml-read-only"
./sofia-kmeans --k 100 --init_type random --opt_type mini_batch_kmeans
--mini_batch_size 100 --iterations 1000 --cluster_mapping_ty…