# epfLLM/Megatron-LLM

Distributed trainer for LLMs.
529 stars · 76 forks
## Issues
| # | Title | Author | Status | Comments |
|---|-------|--------|--------|----------|
| #57 | Better documentation | AleHD | closed 1 year ago | 1 |
| #56 | Llama v1 import from HF support | AleHD | closed 1 year ago | 3 |
| #55 | Metrics support | AleHD | closed 1 year ago | 1 |
| #54 | Prepend bos token | panx27 | closed 12 months ago | 1 |
| #53 | Make llama2 vocab size divisible by 128 by default | AleHD | closed 1 year ago | 1 |
| #52 | dose 8 A100 80g enough to finetune 70b llama2 ? | james2v | closed 1 year ago | 5 |
| #51 | Add CodeLlama support | andreaskoepf | closed 1 year ago | 6 |
| #50 | llama2 & vocabulary padding (making embedding layer sizes divisible by 128) | andreaskoepf | closed 1 year ago | 1 |
| #49 | convert huggingface model to megatron. "Only llama v2 available using huggingface" | uygnef | closed 1 year ago | 1 |
| #48 | Update megatron2hf.py | AleHD | closed 1 year ago | 0 |
| #47 | Set max_position_embeddings to args.seq_length in LlamaConfig | andreaskoepf | closed 1 year ago | 0 |
| #46 | added support to override special tokens when converting to huggingface | AleHD | closed 1 year ago | 0 |
| #45 | Fix GQA handling in convert_wqkv | andreaskoepf | closed 1 year ago | 0 |
| #44 | Fix merge order in merge_meta_llama() | andreaskoepf | closed 1 year ago | 0 |
| #43 | Convert LLama-30B to Megatron Error | dumpmemory | closed 1 year ago | 1 |
| #42 | Update weights2megatron.py | dumpmemory | closed 1 year ago | 3 |
| #41 | llama2 70B weights2megatron OOM fix | andreaskoepf | closed 1 year ago | 1 |
| #40 | Instruction tuning | AleHD | closed 1 year ago | 0 |
| #39 | Add update_to_hub docs | AleHD | closed 1 year ago | 0 |
| #38 | Fix minor typos in push_to_hub.py | andreaskoepf | closed 1 year ago | 0 |
| #37 | Add model export utility push_to_hub.py | andreaskoepf | closed 1 year ago | 0 |
| #36 | Add Megatron to Huggingface conversion for Falcon models | andreaskoepf | closed 1 year ago | 0 |
| #35 | Fixing invalid name in Falcon's megatron weights | malteos | closed 1 year ago | 4 |
| #34 | Improve NaN detection by checking `grad_norm` | andreaskoepf | closed 1 year ago | 0 |
| #33 | NaN detection possibly ineffective | andreaskoepf | closed 1 year ago | 0 |
| #32 | how to convert baichuan-13b into megatron weights? | wwngh1233 | closed 1 year ago | 3 |
| #31 | OpenAssistant training changes [not intended for merging] | andreaskoepf | closed 1 year ago | 2 |
| #30 | Nice-to-have training features | andreaskoepf | opened 1 year ago | 0 |
| #29 | more appropriate --chunk_size for tools/preprocess_data.py | panx27 | closed 1 year ago | 1 |
| #28 | Add falcon support in megatron2hf.py | AleHD | closed 1 year ago | 4 |
| #27 | Weight conversion testing and other features | AleHD | closed 1 year ago | 1 |
| #26 | Add linear RoPE scaling & arbitary position_ids | andreaskoepf | closed 1 year ago | 1 |
| #25 | Update convert_llama2hf.py with latest version from HF transformers | andreaskoepf | closed 1 year ago | 0 |
| #24 | add GQA(MQA) support in megatron2hf conversion | Olivia-fsm | closed 1 year ago | 0 |
| #23 | Passed position_ids are ignored for `PositionEmbeddingType.rotary` | andreaskoepf | closed 1 year ago | 1 |
| #22 | iteration-time increases linearly (for TP=2, PP=1 & TP=1, PP=2) | andreaskoepf | closed 1 year ago | 8 |
| #21 | Add LIMA dropout | andreaskoepf | closed 1 year ago | 0 |
| #20 | Add llama2 to usage help string of weights2megatron.sh | andreaskoepf | closed 1 year ago | 0 |
| #19 | Generate HuggingFace tokenizer configuration as part of megatron2hf.py (weight conversion) | andreaskoepf | closed 1 year ago | 2 |
| #18 | cuda misaligned address in pretrain llama2 7B | pwq1989 | closed 1 year ago | 2 |
| #17 | convert_llama2hf.py should be replaced with newer version | andreaskoepf | closed 1 year ago | 3 |
| #16 | Fix wandb logging of validation metrics | andreaskoepf | closed 1 year ago | 1 |
| #15 | Add llama2 to usage help string | andreaskoepf | closed 1 year ago | 2 |
| #14 | Error during merge of sharded checkpoint: 'TransformerLanguageModel' object has no attribute 'lm_head' | andreaskoepf | closed 1 year ago | 1 |
| #13 | Documentation | AleHD | closed 1 year ago | 0 |
| #12 | Documentation | AleHD | closed 1 year ago | 0 |
| #11 | The training speed is two times slower than the Megatron-LM and Megatron-Deepspeed | zhao1iang | closed 1 year ago | 5 |
| #10 | HF LLaMA -> megatron weight | dumpmemory | closed 1 year ago | 5 |
| #9 | Validation metrics are not logged to wandb | andreaskoepf | closed 1 year ago | 1 |
| #8 | Can you send me the complete parameters related to training llama2 using finetune.py? | brewswang | closed 1 year ago | 1 |
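A recurring topic above (#50, #53) is padding the vocabulary so the embedding layer size is divisible by 128, which keeps the embedding and output-projection shapes friendly to GPU kernels and tensor-parallel sharding. A minimal sketch of that rounding, assuming a Megatron-style divisor of 128 scaled by the tensor-parallel size; the helper name is hypothetical, not the repository's actual code:

```python
def pad_vocab_size(vocab_size: int, divisible_by: int = 128,
                   tensor_parallel_size: int = 1) -> int:
    """Round vocab_size up to a multiple of divisible_by * tensor_parallel_size.

    Hypothetical helper mirroring Megatron-style vocabulary padding;
    not taken from Megatron-LLM's source.
    """
    multiple = divisible_by * tensor_parallel_size
    return ((vocab_size + multiple - 1) // multiple) * multiple

# Llama-2's 32000-token vocab is already a multiple of 128, but a
# tokenizer extended to, say, 32007 tokens would be padded to 32128.
assert pad_vocab_size(32000) == 32000
assert pad_vocab_size(32007) == 32128
```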