llm-training Search Results

1000+ results for llm-training

PaddlePaddle/PaddleNLP #9386

[Bug]: KeyError: 'eval_accuracy' when running SFT fine-tuning of llama3-8b

### Software environment ```Markdown - paddlepaddle-gpu: 0.0.0.post120 - paddlenlp: 3.0.0b2 ``` ### Duplicate issues - [X] I have searched the existing issues ### Error description ```Markdown When running SFT fine-tuning of llama3-8b, an error is raised: Traceback (most r…

hjx620 updated 2 weeks ago
1
stanford-futuredata/ARES #72

Difference between Answer Relevance and Answer Faithfulness

Hello, I was wondering whether there is any difference between Answer Relevance and Answer Faithfulness. Conceptually there is of course, but the code for training LLM judges and actually judging s…

WJ44 updated 1 month ago
1
EleutherAI/gpt-neox #1321

Can `preprocess_data.py` support Huggingface Dataset?

Since there are many datasets in the format of Huggingface datasets, it would be convenient if `preprocess_data.py` can directly preprocess and tokenize from HF datasets.

cafeii updated 2 days ago
1
hiyouga/LLaMA-Factory #5303

adam-mini is not compatible with deepspeed

### Reminder - [X] I have read the README and searched the existing issues. ### System Info When I add just one line to `examples/extras/adam_mini/qwen2_full_sft.yaml`, I get the error below. ```…

muziyongshixin updated 1 month ago
3
mozilla/translations #766

Integrate datasets used for LLM training as monolingual data…

For example https://huggingface.co/datasets/ontocord/CulturaY.

marco-c updated 2 months ago
2
modelscope/ms-swift #2122

Inference and fine-tuning support for GOT-OCR2.

**Inference:**
```bash
CUDA_VISIBLE_DEVICES=0 swift infer --model_type got-ocr2 --model_id_or_path stepfun-ai/GOT-OCR2_0
```

Jintao-Huang updated 2 days ago
32
tenstorrent/tt-metal #13835

[Feature Request] Data parallel training support

**Is your feature request related to a problem? Please describe.** I need to use CCL to send neural network weights from one device to another without using host. Also we need to have all_reduce sup…

dmakoviichuk-tt updated 3 weeks ago
1
microsoft/DeepSpeed #4776

Deepspeed fails with frozen weights (e.g. only train llama2 …

**Describe the bug** This bug is similar to #4055; I provide a repro here. **To Reproduce** Please put these three files in the same directory (remember to change the first two `.txt -> .py` and…

rucnyz updated 1 month ago
2
OpenGVLab/InternVL #612

KeyError: 'architectures'

### Checklist - [ ] 1. I have searched related issues but cannot get the expected help. - [ ] 2. The bug has not been fixed in the latest version. - [ ] 3. Please note that if the bug-related issue y…

CachCheng updated 1 month ago
5
AkihikoWatanabe/paper_notes #1484

Beyond Accuracy: Evaluating the Reasoning Behavior of Large …

# URL - https://arxiv.org/abs/2404.01869 # Authors - Philipp Mondorf - Barbara Plank # Abstract - Large language models (LLMs) have recently shown impressive performance on tasks involving reaso…

AkihikoWatanabe updated 2 weeks ago
2
