-
**Describe the bug**
I'm trying to use the Llama2 model saved with `--use-dist-ckpt` after SFT (Supervised Fine-Tuning) to train a reward model. The reward model does not require the original checkpo…
-
RocksDB Java, Version 6.0.1
### Expected behavior
The test looks like:
```scala
property("record rewriting") {
  withStore { store =>
    val key = byteString("A")
    val valA = by…
```
-
### 💡 Your Question
I am trying to recreate a minimal example for training a segmentation network (is YOLO-NAS going to support this in the future?) with the COCO dataset.
I am successfully loading th…
-
### 🐛 Describe the bug
When running a notebook from [Quickstart](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html) using ROCm with a Radeon RX 6900 XT on Ubuntu Server 22.04, I get…
-
## Bug report
When running in an Azure region where the virtual machine size is out of capacity, Nextflow dies immediately.
### Expected behavior and actual behavior
It should submit the autos…
-
Hi,
I have completed all the steps, but in the end it returns this error:
Running forward
+ CUDA_VISIBLE_DEVICES=0 python demo.py --batch_size_v 80 --num_workers 4 --forward_save_path demo/…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
### 🐛 Describe the bug
I'm training a VQGAN model, and there is a forward operation that performs an allreduce across the batch to estimate the data distribution. It successfully ran for hours and han…
-
### What happened + What you expected to happen
I have performance issues when running Flower's simulation, which uses Ray under the hood (https://github.com/adap/flower). This is a machine learning …
-
### What happened + What you expected to happen
I’m trying out Ray to see what it offers so that, if it suits our case, we can deploy a Ray cluster on Kubernetes. Currently, I’m using a docker-compose to t…