llm-training Search Results

1000+ results for llm-training

autogluon/autogluon #4082

[AutoMM] Enhancing Multi-GPU Support in Multimodal Training …

## Description: In AutoGluon's multimodal framework, Distributed Data Parallel (DDP) is the primary strategy employed for leveraging multiple GPUs across most problem types. A known limitation of D…

FANGAreNotGnu updated 7 months ago
1
X-PLUG/mPLUG-DocOwl #72

Finetuning Tinychart

Hello, I would like to fine-tune or train TinyChart to improve its summarization skills. I have the impression that it doesn't capture all the data during summarization, whereas it does during data co…

ViCtOr-dev13 updated 5 months ago
3
crewAIInc/crewAI #884

converting training_data to trained_agents_data error

Hi, I'm studying CrewAI. I tried to create a crew to write docs for some code, but when I try to use the new training feature I get this error: Traceback (most recent call last): File "/home/well…

WellyngtonF updated 1 week ago
8
Lightning-AI/litgpt #1309

On the relationship with `torchtune`

Will it be possible in the future for you to coordinate with the [`torchtune`](https://pytorch.org/blog/torchtune-fine-tune-llms/) project so that we are able to use A for xyz and B for ikj? We've …

jwkirchenbauer updated 7 months ago
1
THUDM/VisualGLM-6B #139

AssertionError: fp32 param and grad have different shape tor…

Using ZeRO-2, during the gradient update the gradient parameter count is around 9B, far larger than the model size of 7B ``` ╭───────────────────── Traceback (most recent call last) ──────────────────────╮ │ /export/App/training_platform/PinoModel/applications/VisualGLM/vis…

zhangyuanscall updated 4 months ago
1
huggingface/trl #2338

RuntimeError: chunk expects at least a 1-dimensional tensor

### System Info Name: trl Version: 0.13.0.dev0 Name: transformers Version: 4.46.2 Python 3.11.10 ### Information - [X] The official example scripts - [X] My own modified scripts ### Tasks - […

imrankh46 updated 2 days ago
15
hpcaitech/ColossalAI #3566

[FEATURE]: Graphic card ram friendly PPO training for big mo…

### Describe the feature PPO training needs to keep 4 models in memory at the same time. The original implementation keeps the reward/actor critic/initial model in video RAM at the same time. …

yynil updated 1 year ago
1
irthomasthomas/undecidability #662

StarCoder2 and The Stack v2 from BigCode

- [ ] [blog/starcoder2.md at main · huggingface/blog](https://github.com/huggingface/blog/blob/main/starcoder2.md?plain=1) # blog/starcoder2.md at main · huggingface/blog --- ## StarCoder…

irthomasthomas updated 9 months ago
1
pyg-team/pytorch_geometric #9784

Empty Loss Tensor in G-Retriever Code Example

### 🐛 Describe the bug Hello, I tried to test the example related to the new G-Retriever model in colab: https://github.com/pyg-team/pytorch_geometric/blob/master/examples/llm/g_retriever.py. …

giuseppefutia updated 2 weeks ago
2
leon-ai/leon #529

Would GPT4All integration provide a performance improvement?

In the demos I’ve seen of Leon AI, it appeared rather slow. I have no idea if this was a limitation of the hardware or there were inefficiencies that might be improved upon. [GPT4All](https://github.c…

loren-osborn updated 5 months ago
2
