young-geng / EasyLM
Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.
Apache License 2.0 · 2.38k stars · 254 forks
Issues
#115 · Too small initializer variance · LeoXinhaoLee · opened 1 month ago · 0 comments
#114 · Unusual Initializer Range default · evanatyourservice · closed 1 month ago · 3 comments
#113 · Llama3 dev · young-geng · closed 4 months ago · 0 comments
#112 · Dataset for the paper: The False Promise of Imitating Proprietary LLMs (https://arxiv.org/pdf/2305.15717) · manandey · opened 4 months ago · 0 comments
#111 · [Bug] Error in Evaluation · LeoXinhaoLee · opened 6 months ago · 0 comments
#110 · Llama 7b Pretraining Dtype · LeoXinhaoLee · opened 6 months ago · 0 comments
#109 · ERROR: Accessing retired flag 'jax_enable_async_collective_offload' · LeoXinhaoLee · opened 6 months ago · 1 comment
#108 · How does `accumulate_gradient_steps` work? · VictorSanh · opened 7 months ago · 0 comments
#107 · Multi host TPU training · kathir-ks · closed 3 months ago · 0 comments
#106 · Fixed a lot of grammar/spelling mistakes in all the documentation files. · starblasters8 · closed 4 months ago · 0 comments
#105 · fix the option to support 'streaming=false' · gystar · opened 9 months ago · 0 comments
#104 · Mistral · peregilk · opened 10 months ago · 5 comments
#103 · Serving errors: deprecated dependencies and structure error · sjw8793 · opened 11 months ago · 2 comments
#102 · TPU specific flags produce errors · sjw8793 · closed 11 months ago · 2 comments
#101 · why 'LLaMATokenizer' object has no attribute 'sp_model'? · zepen · opened 11 months ago · 3 comments
#100 · Conflicting dependencies for jax[cuda11-pip]==0.4.14 · gbacon · opened 11 months ago · 0 comments
#99 · Does EasyLM work for LLama-2? · Eichhof · opened 11 months ago · 0 comments
#98 · OOM trying to pretrain llama 7b on v4-256 · redbrain · opened 12 months ago · 7 comments
#97 · Missing config for Open LLaMA 3B · jcole75 · opened 1 year ago · 1 comment
#96 · fix: fix bug not found config · pphuc25 · closed 2 months ago · 1 comment
#95 · fix bug on config convert_checkpoint · pphuc25 · closed 1 year ago · 1 comment
#94 · What is the full batch size if mesh_dim is set to 1,1,-1, on TPUv3-8? · TPFRL · opened 1 year ago · 3 comments
#93 · Are you guys planning to implement GQA? · Taekyoon · closed 1 year ago · 1 comment
#92 · pip conflict when run `conda env create -f scripts/gpu_environment.yml` · 0x00-pl · closed 9 months ago · 0 comments
#91 · Added back optimizer · forhaoliu · closed 1 year ago · 0 comments
#90 · added BPT for training very long sequence · forhaoliu · closed 1 year ago · 1 comment
#89 · Converting the Koala Weights to HF Transformers · xiujiesong · closed 1 year ago · 1 comment
#88 · FIx convert script · superMDguy · closed 1 year ago · 0 comments
#87 · how to train llama2-7b in a100 80G gpu? · brewswang · opened 1 year ago · 3 comments
#86 · Questions about inference mode · Tengfei09 · opened 1 year ago · 2 comments
#85 · Fix missing backslash typo in pretrain_llama_7b.sh · akhilkedia · closed 1 year ago · 0 comments
#84 · TPU Installation broken because of change in Orbax · akhilkedia · opened 1 year ago · 1 comment
#83 · Is there a plan to support training with fp8? · joytianya · closed 1 year ago · 2 comments
#82 · Fix omitted dtype params in efficient memory attention function. · Taekyoon · closed 1 year ago · 1 comment
#81 · Flash attention does not make memory efficient · Taekyoon · closed 1 year ago · 13 comments
#80 · Feature request: Use Orbax for checkpointing. · OhadRubin · opened 1 year ago · 2 comments
#79 · LlaMa Pretraining in A100 80G · mohammadaminabbasi · opened 1 year ago · 1 comment
#78 · Use EasyLM to pre-train llama-7B using Nvidia GPU · zhpacer · opened 1 year ago · 2 comments
#77 · LLaMA 2 support for pre-training · philschmid · opened 1 year ago · 6 comments
#76 · Pinned fastapi version · dcerisano · opened 1 year ago · 1 comment
#73 · Config file to train openllama 7B v2 · yhcc · closed 1 year ago · 3 comments
#72 · use streaming_train_state to convert_easylm_to_hf find opt_state.1.0.count · xzqxnet0990 · closed 1 year ago · 4 comments
#71 · What precision strategy is used in pre-training OpenLlama? · haozhouamzn · closed 1 year ago · 3 comments
#70 · example fine-tune alpaca · ehartford · opened 1 year ago · 1 comment
#69 · Anyone tries to train with gpt-j? · Taekyoon · closed 1 year ago · 2 comments
#68 · HF->EasyLM checkpoint conversion · syzymon · closed 1 year ago · 1 comment
#67 · Citation · syzymon · closed 1 year ago · 1 comment
#66 · Bug in llamda training? · Dali660 · closed 1 year ago · 1 comment
#65 · What is the logic behind the partitions? · gianlucadetommaso · closed 1 year ago · 4 comments
#64 · fix typos in README · gianlucadetommaso · closed 1 year ago · 0 comments