-
I'm trying to use the DeepSpeed-Chat stage-2 scripts to do RLHF with the Qwen1.8b-chat model. I changed some parts of dschat and main.py to load my model; the most significant difference is:
```
if 'Qwen' in model_nam…
```
-
Hi, I ran the following LoRA training script:
```
deepspeed fastchat/train/train_lora.py \
--deepspeed configs/deepspeed_zero3.json \
--lora_r 8 \
--lora_alpha 16 \
--lora_drop…
```
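For context on the flags above: `--lora_r` and `--lora_alpha` parameterize the low-rank update `W_eff = W + (alpha / r) * B @ A`. A dependency-free sketch of that arithmetic (tiny shapes and init values are illustrative, not FastChat's implementation):

```python
# Sketch of the LoRA delta: W_eff = W + (alpha / r) * B @ A, where
# A is r x k and B is d x r. With r=8 and alpha=16 (the flags above),
# the scaling factor alpha / r is 2.0. Pure-Python matmul for clarity.
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

r, alpha = 8, 16
scale = alpha / r  # 2.0

# Tiny illustrative shapes: d = k = 2, so B is 2x8 and A is 8x2.
A = [[0.01] * 2 for _ in range(r)]  # r x k, normally Gaussian-initialized
B = [[0.0] * r for _ in range(2)]   # d x r, initialized to zero
W = [[1.0, 0.0], [0.0, 1.0]]        # frozen base weight

delta = matmul(B, A)                # d x k
W_eff = [[w + scale * d_ for w, d_ in zip(wr, dr)]
         for wr, dr in zip(W, delta)]
# Because B starts at zero, W_eff equals W, so training begins
# exactly at the base model and only A, B receive gradients.
```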
-
As we are maturing the outward-facing components of Loki, we have been contemplating a slight repository reorganisation to better encapsulate the different layers of the API and organise the re-use wi…
-
Is there a way to enable ZeRO-3 offload for LLaMA-VID?
I'm trying to integrate an LLM with higher GPU RAM usage into LLaMA-VID, which means I can't run it without offloading to CPU RAM, even at batch_size=…
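For what it's worth, ZeRO-3 offload is normally switched on in the DeepSpeed JSON config rather than in model code. A minimal sketch using DeepSpeed's documented keys (values are illustrative, not tuned for LLaMA-VID):

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true }
  },
  "train_micro_batch_size_per_gpu": 1,
  "bf16": { "enabled": true }
}
```

Whether the surrounding training script honors an arbitrary DeepSpeed config is a separate question, so this only helps if LLaMA-VID's launcher passes the config through.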
-
### Describe the issue
Issue:
Getting an error when trying to finetune the LLaVA-v1.6-34b
Command:
```
#!/bin/bash
deepspeed LLaVA/llava/train/train_mem.py \
…
```
-
Hi, I am getting the following error. Is there a limit on protein length?
n_input: 985
opening seq.aln
cuda:0
batch_size: 10
sigma: 22.5
alpha: 0.5
seq.aln opened with object id 138143134556656…
-
Hello there!
I am trying to train this model by running `!sh train_CTW1500.sh` in Google Colab, but I get this error at epoch zero:
load the vgg16 weight from ./cache
Start tra…
-
We would like to limit the running time of the CLI runner, similar to how the `action_scheduler_queue_runner_time_limit` filter works for the default runner, but we could not find a way to do thi…
-
While tuning, I am getting the following error:
AssertionError: No inf checks were recorded for this optimizer.
Can anyone help me with this?
Here are my training arguments:
per_device_train_batc…
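As a general note (an assumption about the setup, not a diagnosis of this run): `torch.cuda.amp.GradScaler` raises "No inf checks were recorded for this optimizer" when `step()` is called for an optimizer whose gradients were never produced through `scaler.scale(loss)`. A minimal sketch of the expected call order:

```python
import torch
from torch import nn

# Minimal AMP step sketch (assumed toy setup; not the asker's code).
# The "No inf checks were recorded" assertion usually fires when
# scaler.step(optimizer) runs but the backward pass never went through
# scaler.scale(loss), so the scaler recorded no inf/NaN checks.
use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"
model = nn.Linear(4, 1).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)  # no-op on CPU

x = torch.randn(8, 4, device=device)
y = torch.randn(8, 1, device=device)
with torch.cuda.amp.autocast(enabled=use_cuda):
    loss = nn.functional.mse_loss(model(x), y)

opt.zero_grad()
scaler.scale(loss).backward()  # scale BEFORE backward so checks get recorded
scaler.step(opt)               # step() now finds the recorded checks
scaler.update()
```

If a wrapper (e.g. a Trainer) already scales the loss internally, calling `scaler.step()` again yourself produces the same assertion.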
-
I used your code with AMP FP16 from PyTorch 1.6. I achieved good accuracy on the validation set, but the reported training accuracy is wrong. Do you have any suggestions to fix it? @xsacha @cavalleria . Th…
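One common bookkeeping pitfall, offered as a guess rather than a diagnosis of this repo's code: averaging per-batch accuracies instead of counting correct predictions over the whole epoch, which skews the reported number when the final batch is smaller (and is easy to trip over when restructuring a loop for AMP). A dependency-free sketch of count-then-divide:

```python
# Sketch: robust training-accuracy bookkeeping (hypothetical loop).
# Count correct predictions and samples, then divide once at the end;
# averaging per-batch accuracies over-weights a small final batch.
def batch_correct(logits, labels):
    """Count correct predictions in one batch (argmax over class scores)."""
    preds = [row.index(max(row)) for row in logits]
    return sum(p == y for p, y in zip(preds, labels))

batches = [
    ([[2.0, 1.0], [0.1, 3.0]], [0, 1]),
    ([[1.0, 2.0]], [0]),  # smaller final batch
]
correct = sum(batch_correct(lg, lb) for lg, lb in batches)
total = sum(len(lb) for _, lb in batches)
train_acc = correct / total  # 2 correct out of 3 samples
```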