llm-training Search Results

1000+ results for llm-training

AkihikoWatanabe/paper_notes #679

Do LLMs Understand User Preferences? Evaluating LLMs On User…

# URL - https://arxiv.org/abs/2305.06474 # Affiliations - Wang-Cheng Kang, N/A - Jianmo Ni, N/A - Nikhil Mehta, N/A - Maheswaran Sathiamoorthy, N/A - Lichan Hong, N/A - Ed Chi, N/A - Der…

AkihikoWatanabe updated 1 year ago
1
NVIDIA/TransformerEngine #1132

RMSNorm precision different from HF implementation

We noticed there's a tiny implementation difference that makes `transformer_engine.pytorch.module.rmsnorm` and also `TELayerNormColumnParallelLinear` generate results that differ from the HF implementation. And th…

void-main updated 2 months ago
5
microsoft/LLM2CLIP #12

Some issues about the reproduction

Hello! I am very interested in your work, and I encountered some issues during the reproduction process. - How can I replace the original text encoder with the tuned Llama 3 model? I checked the co…

forg77 updated 1 week ago
5
XpressAI/xai-llm-server #2

Feature Request: Add support for Llama-3.2-11B-vision/

### Problem We want to add support for this new model that unlike the previous ones also supports vision. The readme for the model is described below: --- language: - en - de - fr - it - pt…

wmeddie updated 1 month ago
3
huggingface/transformers #26706

Add an option to decide whether to store the checkpoint and …

**Motivation:** Currently, when using the Transformers library in combination with DeepSpeed for training large language models (LLMs), checkpoints (e.g. `bf16_zero_pp_rank_0_mp_rank_00_optim_stat…

timturing updated 1 year ago
7
junhwi/next-gen-ai #32

24/07/07

https://www.microsoft.com/en-us/research/blog/graphrag-new-tool-for-complex-data-discovery-now-on-github/ Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems https://arxiv.org/…

junhwi updated 4 months ago
2
meta-introspector/meta-meme #200

Drunken Walk Generator and Lattice for Deep GNN Training

# Drunken Walk Generator and Lattice for Deep GNN Training ## Overview This page documents the development of a generator function and a lattice structure designed to encompass a broad range of inte…

jmikedupont2 updated 3 months ago
5
INTERSECT-training/software-licensing #7

Add discussion of licensing concerns with LLMs

Based on discussion with David, came up with 3 phases where licensing has important consequences with using LLMs (e.g. copilot) with software development: - What are the consequences of ingesting cod…

troycomi updated 7 months ago
1
unslothai/unsloth #36

Multigpu

Is there multigpu support? Don't know how to set up without running a script

drewskidang updated 1 month ago
10
confident-ai/deepeval #262

Add JudgeLM as a way to evaluate and compare historical test…

Currently, metrics are computed based on test cases that run during evaluation. However, there's currently no way to compare historical test runs' performances except for comparing metric scores for e…

penguine-ip updated 1 year ago
3
