-
**Describe the bug**
I'm trying to train an RT-DETR model on a custom dataset.
This is the error I'm getting:
Traceback (most recent call last):
File "/RT-DETR/rtdetrv2_pytorch/tools/train.py", line …
-
This golf produces inconsistent results for the number of matches seen, because `$/` is shared between threads.
Before https://github.com/rakudo/rakudo/commit/29a032138c this would actually crash
…
-
My loss curve looks something like this.
model_name: "flux-dev"
data_config:
  train_batch_size: 1
  num_workers: 4
  img_size: 512
  img_dir: xxx
report_to: wandb
train_batch_size: 1
out…
-
PBS apparently has the ability to create directories and install files on the job host, before executing a job, and to push files back after the job.
On systems with this capability Cylc could pre…
-
First of all, I wanna say sorry for asking so many questions. I'm sure I've overwhelmed you, and I apologize for it; it won't happen again. These should be most of the remaining questions I have.
…
-
Hi,
I have used the captioning nodes and they worked fine, but when I try to run the LoRA node, I get the issue below. There seems to be a problem with getting it to recognise the checkpoint. From …
-
Hello,
The number of processors to use is either hardcoded (4, 8) or set using `multiprocessing.cpu_count()`.
The problem is that `multiprocessing.cpu_count()` returns the number of available CPUs, …
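
For illustration, here is a minimal sketch of one possible way to pick the worker count, assuming the goal is to respect the CPUs the current process is actually allowed to run on (for example, an affinity mask set by `taskset` or a batch scheduler). The helper name `usable_cpu_count` is hypothetical and not part of this project; `os.sched_getaffinity` is only available on some platforms (notably Linux), so the sketch falls back to `os.cpu_count()` elsewhere.

```python
import os


def usable_cpu_count() -> int:
    """Return how many CPUs the current process may run on (hypothetical helper)."""
    if hasattr(os, "sched_getaffinity"):
        # Where available (e.g. Linux), this is the set of CPUs in the
        # process's affinity mask, which can be smaller than the machine total.
        return len(os.sched_getaffinity(0))
    # Fallback: total CPUs reported by the OS; may be None, so default to 1.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("machine total:", os.cpu_count())
    print("usable by this process:", usable_cpu_count())
```

A value like this could then be passed wherever `multiprocessing.cpu_count()` or a hardcoded 4 or 8 is used today, ideally with a user-facing option to override it.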
-
### Bug description
## Description
I'm training a model based on a number of iterations instead of a number of epochs. The same model trains on datasets of different sizes, hence one epoch differ…
-
When I finetune llama7b:
```
# alpaca
torchrun --nproc_per_node=8 --master_port=29000 train.py \
--model_name_or_path .cache/hub/models--meta-llama--Llama-2-7b-hf/snapshots/01c7f73d771dfac7d…
-
Hi, I used the image and code from the paper but cannot reproduce the results. Here are the training and inference details:
```
accelerate launch train_dreambooth_b-lora_sdxl.py \
--pretrained_model_name…