-
- [x] [Jeremy Twitter thread](https://twitter.com/jeremyphoward/status/1688673397138690048?s=46&t=aOEVGBVv9ICQLUYL4fQHlQ)
- Training set: 200 science multiple-choice questions, autogenerated using GPT …
-
The idea of this paper is really great and much easier to understand than PPO.
However, if there are six candidate responses, then the batch size should be at least 6 when computing the loss in a single step.…
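The point about batch size can be made concrete with a pairwise ranking loss over candidates. The sketch below is illustrative only (not the paper's actual implementation, and `ranking_loss` is a hypothetical name): because every candidate's score is compared against every other's, all k candidates must be scored together before the loss can be computed once.

```python
# Hypothetical sketch: a pairwise ranking loss over k candidate responses.
# All k candidates must be in the same batch so their scores can be
# compared in a single loss computation.
def ranking_loss(logprobs, rewards):
    """logprobs, rewards: length-k sequences, one entry per candidate."""
    loss = 0.0
    k = len(rewards)
    for i in range(k):
        for j in range(k):
            if rewards[i] > rewards[j]:
                # Penalize whenever a lower-reward candidate out-scores
                # a higher-reward one (hinge on the log-prob gap).
                loss += max(0.0, logprobs[j] - logprobs[i])
    return loss

# Six candidates, as in the question above: the loss touches every pair,
# so a batch smaller than 6 cannot compute it in one pass.
six_scores = [-1.0, -2.5, -3.0, -3.2, -4.0, -5.0]
six_rewards = [6, 5, 4, 3, 2, 1]
```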
-
### Feature request
Extend `tokenizer.apply_chat_template` with functionality for training/finetuning, returning `attention_masks` and (optionally) `labels` (for ignoring "System" and "User" messages d…
siddk updated 2 months ago
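A minimal sketch of what such a training-oriented template call might return, assuming a toy whitespace tokenizer and the hypothetical helper name `apply_chat_template_for_training` (neither is part of the actual `transformers` API): labels for system/user tokens are set to the ignore index so the loss is computed only on assistant tokens.

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch's CrossEntropyLoss

_vocab = {}
def encode(text):
    # Toy whitespace tokenizer standing in for a real one.
    return [_vocab.setdefault(tok, len(_vocab)) for tok in text.split()]

def apply_chat_template_for_training(messages):
    # Hypothetical helper: like apply_chat_template, but also returns
    # attention_mask and labels that mask out system/user tokens.
    input_ids, labels = [], []
    for msg in messages:
        ids = encode(f"<|{msg['role']}|> {msg['content']}")
        input_ids.extend(ids)
        if msg["role"] == "assistant":
            labels.extend(ids)  # compute loss on assistant tokens only
        else:
            labels.extend([IGNORE_INDEX] * len(ids))
    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": labels,
    }
```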
-
With the rapid development of large language models, related benchmarks keep emerging one after another. This issue collects such work to encourage discussion and possible follow-ups.
**1. BIG-bench (Google)**
The PaLM paper from Jeff Dean's team evaluated on BIG-Bench, a benchmark suite designed specifically for large models, comparing against other methods across many tasks.
**2. HELM (Stanford)**
- Paper link: https:/…
-
# URL
- https://arxiv.org/abs/2405.05904
# Affiliations
- Zorik Gekhman, N/A
- Gal Yona, N/A
- Roee Aharoni, N/A
- Matan Eyal, N/A
- Amir Feder, N/A
- Roi Reichart, N/A
- Jonathan Herzig…
-
Could Gemma.cpp be integrated into KataGo to help it explain the meaning of each move? Is that possible?
Gemma: https://github.com/google/gemma.cpp
-
The Torch FP8 data type may be released in version 2.1, and JAX FP8 support has already been released.
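For intuition on why FP8 matters for precision, the E4M3 variant keeps only 3 mantissa bits, so values round coarsely. A toy pure-Python simulation (illustrative only; not the Torch or JAX implementation, and saturation, subnormals, and NaN encoding are deliberately ignored):

```python
import math

def quantize_e4m3(x, mantissa_bits=3):
    # Toy simulation of FP8 E4M3 rounding: keep `mantissa_bits` of mantissa.
    # Saturation, subnormals, and NaN encoding are deliberately ignored.
    if x == 0.0:
        return 0.0
    e = math.floor(math.log2(abs(x)))   # exponent of the leading bit
    step = 2.0 ** (e - mantissa_bits)   # spacing of representable values
    return round(x / step) * step
```

With 3 mantissa bits, 3.3 snaps to the nearest representable value 3.25, showing the coarse grid a model's activations land on under FP8.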
-
https://www.sarvam.ai/blog/announcing-openhathi-series - Bilingual LLMs frugally
> The OpenHathi series of work at Sarvam AI aims to contribute to the ecosystem with open models and datasets to…
-
Which configuration file reproduces the 54.x result reported in the paper?
-
### Summary
# Motivation
WasmEdge is a lightweight inference runtime for AI and LLM applications. The goal is to build specialized, finetuned models for the WasmEdge community. The model should be supported by Wa…