gradient-accumulation Search Results

1000+ results
for gradient-accumulation

karpathy/nanoGPT #182

What MFU score is to be expected?

Hello, The training outputs the model flops utilization (MFU), which is quite low on my card (like 7-8%). Does anyone know what score is to be expected? I don't have an A100 readily available to t…

yohan-pg updated 5 months ago
6
bmaltais/kohya_ss #2887

[Bug] `subprocess.CalledProcessError` occurred while running…

### Issue Content: **Description:** It usually happens during lora training for some time. I encountered a `subprocess.CalledProcessError` when running the `train_network.py` script using the …

baicai99 updated 4 weeks ago
1
hustvl/YOLOS #27

CUDA Out of Memory Errors w Batch Size of 1 on 16GB V100

Using the default FeatureExtractor settings for the HuggingFace port of YOLOS, I am consistently running into CUDA OOM errors on a 16GB V100 (even with a training batch size of 1). I would like to …

jordanparker6 updated 2 years ago
4
north-road/qgis-processing-saga-nextgen #21

Flow Accumulation(one step) analysis

QGIS version: 3.22.5-Białowieża QGIS code revision: c2723178 Qt version: 5.15.2 Python version: 3.9.5 GDAL version: 3.4.1 GEOS version: 3.10.2-CAPI-1.16.0 PROJ version: Rel. 8.2.1, January 1st, …

ARTCON2020 updated 2 years ago
1
redotvideo/haven #82

Llamatune fails with your example code from its home page

steps to reproduce 1) start a runpod container with the pytorch 2.01 template and lots of disk space 2) run your sample command on a properly formatted dataset: python -m llamatune.train \ --m…

IridiumMaster updated 1 year ago
2
thunlp/OpenMatch #46

Problem reproducing Roberta-Large and ELECTRA-Large

My environment is pytorch 1.4.0 and transformers 2.8.0. Following the training command in the documentation at https://github.com/thunlp/OpenMatch/blob/master/docs/experiments-msmarco.md ``` CUDA_VISIBLE_DEVICES=0 \ python train.py\ -task r…

yiyaxiaozhi updated 2 years ago
1
ludwig-ai/ludwig #3787

"Encounted `nan` values in tensor. Will be removed.", UserWa…

Hi team, I was fine tuning an LLM with Ludwig on a **NVIDIA A 100** instance. I get the error message - **Encounted `nan` values in tensor. Will be removed.", UserWarning)** My loss and perplexi…

msmmpts updated 2 weeks ago
3
tumurzakov/AnimateDiff #7

training issues

``` 07/31/2023 19:11:36 - INFO - __main__ - Distributed environment: NO Num processes: 1 Process index: 0 Local process index: 0 Device: cuda Mixed precision type: fp16 {'prediction_type', '…

Cubey42 updated 1 year ago
54
yu-rp/Distribution-Shift-Iverson #3

Problem about cuda-out-of-memory

Hi, I want to reproduce your results via your provided codes. But I was stuck in the fine-tuning section. No matter how I reduce the batch size and input image size, it still says cuda out of memory.…

Chelsea-abab updated 11 months ago
3
microsoft/DeepSpeed #3938

[BUG] model.load_checkpoint out of memory

### System Info ```Shell accelerate 0.20.3 python 3.10 numpy 1.24.3 torch 2.0.1 accelerate config: compute_environment: LOCAL_MACHINE deepspeed_config: deepspeed_multinode_launcher: stand…

jiangix-paper updated 5 months ago
7
