young-geng / EasyLM
Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.
Apache License 2.0 · 2.38k stars · 254 forks
Issues
#63 · Need help regarding data format for fine-tuning · RahulSundkar · closed 1 year ago · 2 comments
#62 · transformers version doesn't support LLaMA conversion to Hugging Face format · ruiqi-zhong · closed 1 year ago · 1 comment
#61 · Advice/expectations on throughput · float-trip · closed 1 year ago · 2 comments
#60 · When I train a 30-billion-parameter LLaMA model using v3-256, what configuration would be appropriate? I've tried '1, 64, 4', '1, 128, 2', and '1, 32, 8', but none of them worked. · joytianya · closed 1 year ago · 4 comments
#59 · [Doc error]: Outdated doc for LLaMA · llCurious · opened 1 year ago · 1 comment
#58 · Add LoRA support? · Kimiko-AI · closed 1 year ago · 1 comment
#57 · Is there a plan to support Falcon? · joytianya · closed 1 year ago · 1 comment
#56 · Supporting Falcon-40b-instruct · innerop · closed 1 year ago · 0 comments
#55 · Recommended setup given a v4-512? · OhadRubin · closed 1 year ago · 1 comment
#53 · Fix a typo in HuggingfaceDataset · syzymon · closed 1 year ago · 0 comments
#52 · Where is the vocab file? · lucasjinreal · closed 1 year ago · 1 comment
#51 · May I ask about the pre-training configs? For example, did you use dropout? · joytianya · closed 1 year ago · 2 comments
#50 · Fix unstable RMSNorm · ZYHowell · closed 1 year ago · 2 comments
#49 · Model serving example · congyingxia · closed 1 year ago · 1 comment
#48 · HF tokenizer taking too long to load · basujindal · closed 1 year ago · 1 comment
#47 · How do you handle the attention mask for dataset chunks? · waterhorse1 · closed 1 year ago · 2 comments
#46 · Do you provide an API? · prog-amateur2 · closed 1 year ago · 1 comment
#45 · Install is slow; is there a Discord channel or group for communication? · joostshao · closed 1 year ago · 1 comment
#44 · Is it hard to support BLOOM? · tiendung · closed 1 year ago · 1 comment
#43 · A detailed question on the LLaMA training script · zhangzx-uiuc · closed 1 year ago · 1 comment
#42 · FSDP vs Model Parallelism · jeromeku · closed 1 year ago · 7 comments
#41 · During training, the code adds a start token like <s> in the first position; should we also add this token during inference to maintain consistency with training? · joytianya · closed 1 year ago · 4 comments
#40 · If it is pre-training, can we just omit the [] directly? · joytianya · closed 1 year ago · 1 comment
#39 · Avoid OOM in Llama train example and align with serve example · juliensalinas · closed 1 year ago · 2 comments
#38 · Fix mesh_dim param in doc · juliensalinas · closed 1 year ago · 0 comments
#37 · Does wandb_dir support GCP paths? · joytianya · closed 1 year ago · 1 comment
#36 · Will changes to the optimizer.accumulate_gradient_steps configuration increase graphics memory usage? · joytianya · closed 1 year ago · 1 comment
#35 · For the 30B LLaMA model, can serving be supported by configuring mesh_dims on TPU v3-8 (128 GB)? I tried 8,1 and 4,1 but they don't seem to work. · joytianya · opened 1 year ago · 2 comments
#34 · Training OPT with the Koala dataset · Linohong · closed 1 year ago · 3 comments
#33 · When using fsdp=true, the error message is the same. Does this not have any effect? · joytianya · closed 1 year ago · 13 comments
#32 · Koala hyperparameters + running on a TPU pod? · hamishivi · closed 1 year ago · 2 comments
#31 · Code for fine-tuning? · milsun · closed 1 year ago · 2 comments
#30 · When I increase accumulate_gradient_steps, can the batch_size also be increased accordingly? · joytianya · closed 1 year ago · 1 comment
#29 · Truncated text? · shikida · closed 1 year ago · 0 comments
#28 · Support for multi-host GPU training · szxiangjn · closed 1 year ago · 1 comment
#27 · Support for Cerebras-GPT models? · MasterScrat · opened 1 year ago · 0 comments
#26 · When I configured batch size 4 on v3-8, it was normal, but batch size 128 on v3-256 reported OOM. What is the reason? · joytianya · closed 1 year ago · 3 comments
#25 · Can save_checkpoint support writing to a GCS path? · joytianya · closed 1 year ago · 2 comments
#24 · Corrected a mistake in a shell command; improved readability · wthoutanymmries · closed 1 year ago · 0 comments
#23 · Error converting to HF · JMHAVANCE · closed 1 year ago · 2 comments
#22 · Is it normal for the learning rate to reach the peak_value and not decrease, but instead slowly rise? · joytianya · closed 1 year ago · 8 comments
#21 · Checksum for recovered models? · koreyou · closed 1 year ago · 1 comment
#20 · Conda install slow · abdulfatir · closed 1 year ago · 1 comment
#19 · RAM requirements · MichaelMartinez · opened 1 year ago · 6 comments
#18 · Error when recovering the Koala model weights · zhangsanfeng86 · closed 1 year ago · 4 comments
#17 · Difference between 13B v1 and v2 weight diffs + GPU requirements · jpollard-cs · closed 1 year ago · 1 comment
#16 · Uploaded the weights to Hugging Face Hub · Logophoman · closed 1 year ago · 1 comment
#15 · Can the load_checkpoint parameter support reading files from GCP? · joytianya · closed 1 year ago · 2 comments
#14 · When using 13B with the following configuration, a memory error occurs on v3-8; may I ask what the reason is? · joytianya · closed 1 year ago · 13 comments
#13 · Fix typos in load_pickle · Yulv-git · closed 1 year ago · 0 comments