-
### Model description
Here is the model description:
> gte-Qwen1.5-7B-instruct is the latest addition to the gte embedding family. This model has been engineered starting from the [Qwen1.5-7B](https:…
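For context, a minimal retrieval-style usage sketch with sentence-transformers. The Hub ID `Alibaba-NLP/gte-Qwen1.5-7B-instruct`, the `prompt_name="query"` instruction template, and a sentence-transformers version ≥ 3.0 (for `model.similarity`) are assumptions here, not confirmed by the excerpt:

```python
# Minimal sketch: encode queries and documents, then score by cosine similarity.
# Note a 7B embedding model needs roughly 14 GB of memory in fp16.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "Alibaba-NLP/gte-Qwen1.5-7B-instruct",  # assumed Hub ID
    trust_remote_code=True,                 # gte-Qwen models ship custom modeling code
)

queries = ["how should I pick an embedding model?"]
docs = ["gte-Qwen1.5-7B-instruct maps text to dense vectors for retrieval."]

# prompt_name="query" applies the model's instruction template to queries only;
# documents are encoded without a prompt.
q_emb = model.encode(queries, prompt_name="query")
d_emb = model.encode(docs)

print(model.similarity(q_emb, d_emb))  # cosine-similarity matrix
```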
-
Specs: RTX 3060 Ti (8 GB VRAM), Ryzen 7 5700X, 32 GB RAM
`main` says:
`main: build = 2769 (8843a98c)`
`main: built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu`
`make` says:
`GNU Make 4.3`
`Built…`
-
Hi Team,
It is an amazing handbook. In the continued pre-training script (`run_cpt.py`), I saw that the "mlm" (masked language modeling) parameter is not used in the training process. I thought that the …
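For what it's worth, a likely reason, shown as a minimal sketch rather than the handbook's actual code: continued pre-training of a decoder-only model is plain causal language modeling, so the Hugging Face collator is used with `mlm=False`; `gpt2` below is just a stand-in model:

```python
# Sketch of why a causal-LM CPT script has no use for masked-LM masking:
# DataCollatorForLanguageModeling only inserts [MASK] targets when mlm=True
# (BERT-style); decoder-only models train on next-token prediction instead.
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer
tokenizer.pad_token = tokenizer.eos_token          # gpt2 has no pad token

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
batch = collator([tokenizer("continued pre-training example text")])

# With mlm=False the labels are a copy of input_ids (the model shifts them
# internally), i.e. plain causal LM -- no masking parameter is needed.
print(batch["labels"][0])
```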
-
### 🚀 The feature, motivation and pitch
There's a new DP sharding strategy that is more flexible and general; for details see https://arxiv.org/abs/2311.00257, "AMSP: Reducing Communication Overhead o…
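To make the pitch concrete, an illustrative sketch of the design space (my own framing, assuming an initialized `torch.distributed` process group): PyTorch FSDP today exposes only a few fixed sharding points, whereas AMSP argues for choosing the sharding degree per model state (parameters, gradients, optimizer states) independently:

```python
# Illustrative only: FSDP's built-in strategies roughly match ZeRO stages;
# AMSP generalizes this to independent sharding degrees per model state.
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, ShardingStrategy

def wrap(model: nn.Module) -> FSDP:
    return FSDP(
        model,
        # FULL_SHARD    ~ ZeRO-3: shard params + grads + optimizer states
        # SHARD_GRAD_OP ~ ZeRO-2: shard grads + optimizer states only
        # HYBRID_SHARD  : shard within a node, replicate across nodes
        sharding_strategy=ShardingStrategy.SHARD_GRAD_OP,
    )
```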
-
![image](https://github.com/paperswithlove/papers-we-read/assets/12858045/20087322-d388-45db-b0ed-2daab0ea5baf)
[https://arxiv.org/abs/2403.09611](https://arxiv.org/abs/2403.09611)
- Wow, Apple released an MLL…
-
Dear authors,
Thanks for your work! I am interested in applying it in my study. I wonder whether you could provide the fine-tuned WizardCoder model file, ready for use. Or could you pleas…
-
Hello,
I have a question regarding GPU memory consumption during inference.
Before finetuning a model with QLoRA, the `torchtune.LoRALinear` modules convert the original LLM weights to nf4, a…
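A back-of-the-envelope calculation for the memory question (my own rough numbers, not from torchtune's docs): nf4 stores about 4 bits per base weight versus 16 bits for bf16, and the base weights dominate the inference footprint:

```python
# Approximate weight-memory math: nf4 ~ 4 bits/param, bf16 = 16 bits/param.
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    return n_params * bits_per_param / 8 / 1024**3

n = 7e9  # e.g. a 7B-parameter base model
print(f"bf16 base weights: {weight_memory_gib(n, 16):.1f} GiB")  # ~13.0 GiB
print(f"nf4  base weights: {weight_memory_gib(n, 4):.1f} GiB")   # ~3.3 GiB
# At inference time the LoRA adapters add only rank * (in_dim + out_dim)
# parameters per adapted layer, so whether the base weights stay in nf4 or
# are dequantized back to bf16 is what decides the GPU memory footprint.
```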
-
In the demos I’ve seen of Leon AI, it appeared rather slow. I have no idea whether this was a hardware limitation or whether there were inefficiencies that could be improved upon. [GPT4All](https://github.c…
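For reference, local CPU generation with the `gpt4all` Python bindings is only a few lines (a sketch; the model filename is an example from the GPT4All catalog and is downloaded on first use):

```python
# Minimal local CPU inference sketch with the gpt4all Python bindings.
from gpt4all import GPT4All

# Example model file; any GGUF model from the GPT4All catalog works.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
with model.chat_session():
    print(model.generate("Name three uses for a local assistant.", max_tokens=128))
```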
-
We discussed here: https://github.com/kubeflow/website/pull/3718#issuecomment-2096619898 that [our LLM Trainer](https://github.com/kubeflow/training-operator/blob/bb8bba00ff0b48de922c523b0d3051f8b2d4e…