-
[DeepSpeed](https://github.com/microsoft/DeepSpeed) is an excellent framework for training LLMs on a large scale, while the mpi-operator is the ideal tool to facilitate this within the Kubernetes ecos…
-
**Motivation:**
Currently, when using the Transformers library in combination with DeepSpeed to train large language models (LLMs), checkpoints (e.g. `bf16_zero_pp_rank_0_mp_rank_00_optim_stat…
-
With the proliferation of models and model variants, it becomes more important to track assessment dates and model versions.
So far we've been able to treat model families as one, because it rarely …
-
### Ticket Contents
## Description
Many people have musical ideas but struggle to articulate them. Generative AI shows promise in helping people find a way to transcribe their musical ideas. The goal …
-
Hello,
I've been trying to run qwen2 0.5B and tinyclip using the repository, but I'm running into CUDA OOM issues on the dense2dense distillation step. I'm running on 4 80GB A100s, and I was wondering if I …
-
The inspiration for this is:
https://github.com/irthomasthomas/undecidability/issues/934
https://github.com/AnswerDotAI/rerankers
-
I am currently fine-tuning an LLM (LLaMA) and would like to retrieve the gradients of each weight (parameter) after every gradient update. However, I notice that weights are (auto-)wrapped into stuff l…
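For an unwrapped model, per-parameter gradients can be snapshotted between `backward()` and `optimizer.step()`. This is a minimal sketch of that pattern, not the original poster's setup: the tiny model, data, and `grad_log` name are stand-ins, and note that under sharding wrappers such as FSDP the parameters are flattened, so the same loop would see the wrapper's flat parameters rather than the original named weights.

```python
# Hedged sketch: logging per-parameter gradients after every update.
# The model and data below are placeholders for the actual LLaMA
# fine-tuning setup described in the issue.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x = torch.randn(4, 8)
y = torch.randn(4, 1)

grad_log = []  # one {param_name: grad tensor} dict per update
for step in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    # Snapshot gradients before the optimizer consumes them.
    grads = {name: p.grad.detach().clone()
             for name, p in model.named_parameters()
             if p.grad is not None}
    grad_log.append(grads)
    optimizer.step()

print(len(grad_log), sorted(grad_log[0]))
```

With a wrapped model, the gradients would have to be read through the wrapper's own API instead of `named_parameters()` directly.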
-
I deployed Qwen2.5-14B-Instruct on my local server and started the LLM correctly using vLLM.
But when I executed the sample code,
```
from paperqa import Settings, ask
local_llm_config = dict(
…
-
Hello,
A tensor assertion error is raised when you try to train the model. It starts with the following:
```bash
0%| | 0/10 [00:00
-
# URL
- https://arxiv.org/abs/2306.09782
# Affiliations
- Kai Lv, N/A
- Yuqing Yang, N/A
- Tengxiao Liu, N/A
- Qinghui Gao, N/A
- Qipeng Guo, N/A
- Xipeng Qiu, N/A
# Abstract
- Large Lan…