-
Hello, I'm currently trying to reproduce NVIDIA's Llama2 70B results on DGX H100. I applied the fixes from https://github.com/mlcommons/training_results_v4.0/issues/5, but I'm hitting the following CUDA issue:
```
Failed:…
```
-
Tracker issue for adding [LayerSkip](https://arxiv.org/abs/2404.16710) to AO.
This is a training and inference optimization that is similar to layer-wise pruning. It's particularly interesting for…
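For readers skimming this tracker, here is a minimal sketch of the training-time half of LayerSkip (layer dropout / stochastic depth, with skip rates that grow with layer index). The module, layer count, and probabilities below are illustrative assumptions, not AO's planned API:
```python
# Minimal LayerSkip-style layer dropout sketch (training-time only).
# All names and hyperparameters here are illustrative, not torchao's API.
import torch
import torch.nn as nn


class LayerSkipStack(nn.Module):
    def __init__(self, d_model: int = 256, n_layers: int = 8, p_max: float = 0.2):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_layers)
        )
        # Skip rates grow linearly with depth: early layers are almost always
        # executed, later layers are dropped more often during training.
        self.skip_probs = [p_max * i / max(n_layers - 1, 1) for i in range(n_layers)]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer, p in zip(self.layers, self.skip_probs):
            if self.training and torch.rand(()).item() < p:
                continue  # stochastically skip this layer for the whole batch
            x = layer(x)
        return x


if __name__ == "__main__":
    model = LayerSkipStack().train()
    out = model(torch.randn(2, 16, 256))  # (batch, seq_len, d_model)
    print(out.shape)
```
The paper pairs this with an early-exit loss during training and self-speculative decoding at inference, which is what distinguishes it from plain stochastic depth.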
-
Hello! I'm a student at Harbin Institute of Technology, and I've recently been trying to reproduce your work on CoGenesis. I've run into some tricky problems and would appreciate your help.
The problem is as follows: under the "sketch-based method", the following error is raised when loading the fine-tuned small model: **ValueError: Trying to set a tensor of shape torch.Size([311164928]) in "weight" (which has shape t…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
(MindSpore) [root@fd428729b7cb46b089e3705e66eecb16-task0-0 LLaMA-Factory]# llamafactory-cli train example…
-
### Your current environment
```text
The output of `python collect_env.py`
```
### 🐛 Describe the bug
Hi there,
> vllm version: 0.4.1
I fine-tuned the mistral-7b-v0.2 model using the tr…
-
The GUI is not working.
(gpt1) ashu@MSI:/mnt/c/Users/genco/Documents/gpt$ make run
poetry run python -m private_gpt
11:47:15.734 [INFO ] private_gpt.settings.settings_loader - Starting appli…
-
### What happened?
I am running on ROCm with 4 x Instinct MI100.
Only when using `--split-mode row` do I get an "Address boundary error".
llama.cpp was working when I had an XGMI GPU Bridge working w…
-
### What is the issue?
No issues with any model that fits into a single 3090, but it seems to run out of memory when trying to distribute to the second 3090.
```
INFO [wmain] starting c++ runner | ti…
```
-
Hi, I found several errors in the pre-training script (run.sh) and the corresponding code. I have mentioned one of them in the pull request. Furthermore, it seems that we should use $PATH_T…
-
This issue concerns the dataset description at https://github.com/awslabs/open-data-registry/blob/main/datasets/software-heritage.yaml
Due to the recent surge in demand for data for LLM training, we…