-
Why does training according to the README, with only the batch size changed to 256, result in abnormally low performance? Is the parameter configuration incorrect? I did not make any modifications to the rem…
-
I use external decoding, and it always gets stuck. Why is that? My Nextcloud's memory usage also keeps growing until it is full. My decoding VM is set up with model 3.
```
⚡ root@nextcloud ~ sudo -u www-da…
-
Create a new command to batch run kubectl commands across a number of clusters using the aliases setup.
Per the discussion, we'd use a file to configure the commands to run and the aliases:
```yaml
…
-
I am running an Azure ML pipeline for Machine Learning training on a low priority compute cluster. So, occasionally, the VM will be preempted and restarted at a later time. In this case, I want to res…
-
## In what area(s)?
/area placement
## Ask your question here
Dapr uses a 3-phase lock to sync actor state. IMO it introduces a single point of failure, and it's also costly, especially in …
-
First, apologies if this is less than coherent as I've gone slightly insane (and pulled out half my beard) debugging this.
I'm trying to batch-create 3 servers, each in a different AWS subnet, using …
-
Currently, adding a replica to an empty cluster (i.e., one without replicas) rehydrates the new replica with all intermediate state accumulated since the last replica was dropped. This causes compute …
-
### How to use it?
- [X] kwok
- [ ] kwokctl --runtime=docker (default runtime)
- [ ] kwokctl --runtime=binary
- [ ] kwokctl --runtime=nerdctl
- [ ] kwokctl --runtime=kind
### What happened?
…
-
In our workflow, an HTTP server receives a request containing a payload for processing, starts a Dask cluster, splits the payload into batches and then submits each batch via Dask to the cluster (via …
-
Install file: "scripts/release_sanity.py" as "build/scripts/release_sanity.py"
/mnt/batch/tasks/shared/LS_root/mounts/clusters/carolinehu2022/code/Users/carolinehu2022/geophysics/esys-escript.github.…