young-geng / EasyLM
Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.
Apache License 2.0 · 2.38k stars · 254 forks
Issues
#115 · Too small initializer variance · LeoXinhaoLee · opened 1 month ago · 0 comments
#114 · Unusual Initializer Range default · evanatyourservice · closed 1 month ago · 3 comments
#113 · Llama3 dev · young-geng · closed 4 months ago · 0 comments
#112 · Dataset for the paper: The False Promise of Imitating Proprietary LLMs (https://arxiv.org/pdf/2305.15717) · manandey · opened 4 months ago · 0 comments
#111 · [Bug] Error in Evaluation · LeoXinhaoLee · opened 6 months ago · 0 comments
#110 · Llama 7b Pretraining Dtype · LeoXinhaoLee · opened 6 months ago · 0 comments
#109 · ERROR: Accessing retired flag 'jax_enable_async_collective_offload' · LeoXinhaoLee · opened 6 months ago · 1 comment
#108 · How does `accumulate_gradient_steps` work? · VictorSanh · opened 7 months ago · 0 comments
#107 · Multi host TPU training · kathir-ks · closed 3 months ago · 0 comments
#106 · Fixed a lot of grammar/spelling mistakes in all the documentation files. · starblasters8 · closed 4 months ago · 0 comments
#105 · fix the option to support 'streaming=false' · gystar · opened 9 months ago · 0 comments
#104 · Mistral · peregilk · opened 10 months ago · 5 comments
#103 · Serving errors: deprecated dependencies and structure error · sjw8793 · opened 11 months ago · 2 comments
#102 · TPU specific flags produce errors · sjw8793 · closed 11 months ago · 2 comments
#101 · why 'LLaMATokenizer' object has no attribute 'sp_model'? · zepen · opened 11 months ago · 3 comments
#100 · Conflicting dependencies for jax[cuda11-pip]==0.4.14 · gbacon · opened 11 months ago · 0 comments
#99 · Does EasyLM work for LLama-2? · Eichhof · opened 11 months ago · 0 comments
#98 · OOM trying to pretrain llama 7b on v4-256 · redbrain · opened 12 months ago · 7 comments
#97 · Missing config for Open LLaMA 3B · jcole75 · opened 1 year ago · 1 comment
#96 · fix: fix bug not found config · pphuc25 · closed 2 months ago · 1 comment
#95 · fix bug on config convert_checkpoint · pphuc25 · closed 1 year ago · 1 comment
#94 · What is the full batch size if mesh_dim is set to 1,1,-1, on TPUv3-8? · TPFRL · opened 1 year ago · 3 comments
#93 · Are you guys planning to implement GQA? · Taekyoon · closed 1 year ago · 1 comment
#92 · pip conflict when run `conda env create -f scripts/gpu_environment.yml` · 0x00-pl · closed 9 months ago · 0 comments
#91 · Added back optimizer · forhaoliu · closed 1 year ago · 0 comments
#90 · added BPT for training very long sequence · forhaoliu · closed 1 year ago · 1 comment
#89 · Converting the Koala Weights to HF Transformers · xiujiesong · closed 1 year ago · 1 comment
#88 · FIx convert script · superMDguy · closed 1 year ago · 0 comments
#87 · how to train llama2-7b in a100 80G gpu? · brewswang · opened 1 year ago · 3 comments
#86 · Questions about inference mode · Tengfei09 · opened 1 year ago · 2 comments
#85 · Fix missing backslash typo in pretrain_llama_7b.sh · akhilkedia · closed 1 year ago · 0 comments
#84 · TPU Installation broken because of change in Orbax · akhilkedia · opened 1 year ago · 1 comment
#83 · Is there a plan to support training with fp8? · joytianya · closed 1 year ago · 2 comments
#82 · Fix omitted dtype params in efficient memory attention function. · Taekyoon · closed 1 year ago · 1 comment
#81 · Flash attention does not make memory efficient · Taekyoon · closed 1 year ago · 13 comments
#80 · Feature request: Use Orbax for checkpointing. · OhadRubin · opened 1 year ago · 2 comments
#79 · LlaMa Pretraining in A100 80G · mohammadaminabbasi · opened 1 year ago · 1 comment
#78 · Use EasyLM to pre-train llama-7B using Nvidia GPU · zhpacer · opened 1 year ago · 2 comments
#77 · LLaMA 2 support for pre-training · philschmid · opened 1 year ago · 6 comments
#76 · Pinned fastapi version · dcerisano · opened 1 year ago · 1 comment
#73 · Config file to train openllama 7B v2 · yhcc · closed 1 year ago · 3 comments
#72 · use streaming_train_state to convert_easylm_to_hf find opt_state.1.0.count · xzqxnet0990 · closed 1 year ago · 4 comments
#71 · What precision strategy is used in pre-training OpenLlama? · haozhouamzn · closed 1 year ago · 3 comments
#70 · example fine-tune alpaca · ehartford · opened 1 year ago · 1 comment
#69 · Anyone tries to train with gpt-j? · Taekyoon · closed 1 year ago · 2 comments
#68 · HF->EasyLM checkpoint conversion · syzymon · closed 1 year ago · 1 comment
#67 · Citation · syzymon · closed 1 year ago · 1 comment
#66 · Bug in llamda training? · Dali660 · closed 1 year ago · 1 comment
#65 · What is the logic behind the partitions? · gianlucadetommaso · closed 1 year ago · 4 comments
#64 · fix typos in README · gianlucadetommaso · closed 1 year ago · 0 comments