-
### The quantization format
Hi all,
We have recently designed and open-sourced a new method for Vector Quantization called Vector Post-Training Quantization (VPTQ). Our work is available at [VPTQ…
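For readers unfamiliar with this family of methods, here is a minimal sketch of plain vector quantization (illustrative only, not the actual VPTQ algorithm): weights are grouped into short vectors, and each vector is replaced by the index of its nearest centroid in a learned codebook.

```python
import numpy as np

def build_codebook(vectors: np.ndarray, k: int, iters: int = 20) -> np.ndarray:
    """Plain k-means over the weight vectors; returns a (k, dim) codebook."""
    rng = np.random.default_rng(0)
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each vector to its nearest centroid (squared Euclidean distance).
        dist = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(k):
            members = vectors[assign == j]
            if len(members):
                centroids[j] = members.mean(0)
    return centroids

def vq_quantize(weights: np.ndarray, dim: int = 4, k: int = 256):
    """Split a weight matrix into dim-sized vectors; store one uint8 index each."""
    vecs = weights.reshape(-1, dim)
    codebook = build_codebook(vecs, k)
    dist = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    indices = dist.argmin(1).astype(np.uint8)  # 8 bits per 4 weights ~= 2 bits/weight
    return codebook, indices

W = np.random.randn(128, 128).astype(np.float32)
codebook, indices = vq_quantize(W)
W_hat = codebook[indices].reshape(W.shape)  # dequantize by codebook lookup
print("reconstruction MSE:", float(((W - W_hat) ** 2).mean()))
```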
-
Dear Authors,
Firstly, thank you for your great work, "Making Text Embedders Few-Shot Learners". It was very interesting to see how you improved the performance of text embedding by leveraging the …
-
Hi, thanks for your great work.
May I ask what the training time and memory usage are when using the 7B-parameter LLM?
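For reference, here is the rough back-of-envelope estimate I have in mind for full fine-tuning (assuming bf16 weights and gradients with fp32 Adam moments; your actual setup with LoRA, activation checkpointing, or ZeRO sharding would look very different, which is why I am asking):

```python
# Back-of-envelope memory estimate for full fine-tuning of a 7B-parameter model.
# Assumptions (not from the paper): bf16 weights/gradients, fp32 Adam moments.
params = 7e9
weights   = params * 2      # bf16 weights: 2 bytes/param
gradients = params * 2      # bf16 gradients: 2 bytes/param
adam      = params * 4 * 2  # fp32 first + second moments: 8 bytes/param
total_gib = (weights + gradients + adam) / 1024**3
print(f"~{total_gib:.0f} GiB before activations")  # ~78 GiB
```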
Looking forward to your reply.
-
Hi,
I am trying to use the LoRA weights I obtained from PEFT using the NeMo Framework container (with tp=pp=2, 4 GPUs) with the TensorRT-LLM ModelRunner (the [TensorRT-LLM/examples/run.py](https://github.com/NVIDIA…
-
# Key Observation
- LLMs exhibit a significant decline in reasoning abilities when subjected to strict format restrictions (illustrated in the sketch after this list).
- The stricter the format, the greater the performance degradation in rea…
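As a concrete illustration of what "strict format restriction" means here (hypothetical prompts, not the exact templates used in the study), compare a free-form prompt with one that forces a bare JSON answer:

```python
# Hypothetical prompts illustrating free-form vs. strictly formatted answers.
question = "If a train travels 60 km in 45 minutes, what is its speed in km/h?"

free_form = (
    f"{question}\n"
    "Think step by step, then state the answer."
)

strict_json = (
    f"{question}\n"
    'Respond ONLY with a JSON object of the form {"answer": <number>}, '
    "nothing else."
)

# The observation above: models answering under `strict_json` tend to lose
# accuracy on reasoning tasks, since the format leaves no room for the
# intermediate steps that `free_form` allows.
print(free_form, strict_json, sep="\n\n")
```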
-
Related: https://github.com/kubeflow/training-operator/issues/2170
Once we implement storage initializers, trainers, and controllers, we should add the LLM training runtimes.
We can start with run…
-
Would you consider releasing the training code?
-
### Describe the feature
https://github.com/linkedin/Liger-Kernel
Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU train…
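As a sketch of what the integration could look like on the user side (usage taken from the Liger-Kernel README at the time of writing; names may change between versions, and the model below is only an example):

```python
# Drop-in usage per the Liger-Kernel README: AutoLigerKernelForCausalLM patches
# supported architectures (RMSNorm, RoPE, SwiGLU, fused cross-entropy, ...)
# with Liger's Triton kernels before loading the weights.
from liger_kernel.transformers import AutoLigerKernelForCausalLM

# Hypothetical example model; any supported HF causal LM checkpoint works.
model = AutoLigerKernelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
```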
-
Hi, @haesleinhuepf recommended I post this here.
Because you guys are making this evaluation benchmark public, couldn't LLMs pick it up as training data and therefore overfit to it?
So, it will be h…
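One standard way to at least detect this (a hypothetical sketch, not something the benchmark does today) is an n-gram overlap check between benchmark items and a model's training corpus, similar to the 13-gram deduplication used for GPT-3:

```python
# Hypothetical contamination check: the fraction of a benchmark item's 13-grams
# that also appear in a training document (13 follows the GPT-3 dedup setup).
def ngrams(text: str, n: int = 13) -> set:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def overlap_fraction(benchmark_item: str, corpus_doc: str, n: int = 13) -> float:
    b = ngrams(benchmark_item, n)
    return len(b & ngrams(corpus_doc, n)) / max(len(b), 1)
```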
-
Is it possible to add the code copy widget that you already have on https://nvidia.github.io/TensorRT-Model-Optimizer/ to https://nvidia.github.io/TensorRT-LLM/?
For example, if you go to https://nvidi…