foundation-model-stack/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.
https://pytorch.org/docs/stable/fsdp.html
Apache License 2.0
114 stars
18 forks
issues
Enhance wandb support | #46 | lchu-ibm | closed | 3 months ago | 0
Enable torch.compile support | #45 | lchu-ibm | closed | 3 months ago | 1
Add wandb support | #44 | lchu-ibm | closed | 3 months ago | 0
Enforce dataset new file msg verbosity | #43 | daviswer | closed | 3 months ago | 0
add wandb | #42 | lchu-ibm | closed | 3 months ago | 1
add no_shard option | #41 | lchu-ibm | closed | 3 months ago | 0
add 1.4b variant config | #40 | lchu-ibm | closed | 3 months ago | 0
add 1.4B config | #39 | lchu-ibm | closed | 3 months ago | 1
add comment pointing out EFA configuration | #38 | nairbv | closed | 3 months ago | 1
optimize profiler trace generation | #37 | lchu-ibm | closed | 4 months ago | 0
Fix weight handling for tuple case | #36 | daviswer | closed | 4 months ago | 0
[speculator training] Speculator training | #35 | daviswer | opened | 4 months ago | 3
extend data args parsing | #34 | lchu-ibm | closed | 4 months ago | 0
fixing minor note | #33 | raghukiran1224 | closed | 4 months ago | 0
Update perf numbers in top section of README | #32 | lchu-ibm | closed | 4 months ago | 0
Faulty type handling for 'weight' kwarg | #31 | daviswer | closed | 4 months ago | 6
Updating docs | #30 | raghukiran1224 | closed | 4 months ago | 0
add FLOP counter | #29 | lchu-ibm | opened | 4 months ago | 0
Dataset documentation and further cleanup | #28 | daviswer | closed | 4 months ago | 1
fix dummy dataloader | #27 | lchu-ibm | closed | 4 months ago | 1
align with FMS config | #26 | lchu-ibm | closed | 4 months ago | 2
fix dummy dataloader for a larger simulated vocab size | #25 | lchu-ibm | closed | 4 months ago | 0
align current llama config with FMS | #24 | lchu-ibm | closed | 4 months ago | 0
add example scripts | #23 | lchu-ibm | closed | 4 months ago | 0
create example scripts | #22 | lchu-ibm | closed | 4 months ago | 0
Add doc string to the apis in the repo | #21 | lchu-ibm | closed | 4 months ago | 0
increase nccl timeout | #20 | lchu-ibm | closed | 4 months ago | 2
Nccl timeout | #19 | lchu-ibm | closed | 4 months ago | 0
revert old low_cpu_mode implementation | #18 | lchu-ibm | closed | 4 months ago | 0
setup.py to install so we can add tests and move scripts | #17 | nairbv | closed | 4 months ago | 0
Fix recursive type definitions | #16 | afrittoli | closed | 4 months ago | 3
RuntimeError: CUDA driver error: an illegal memory access was encountered | #15 | lchu-ibm | closed | 3 months ago | 9
Clean up training configs | #14 | lchu-ibm | closed | 4 months ago | 1
Change package name from pretraining to fms_fsdp | #13 | lchu-ibm | closed | 4 months ago | 0
Pull from main | #12 | daviswer | closed | 4 months ago | 0
add Doc | #11 | lchu-ibm | closed | 4 months ago | 0
Fix the ibm-fms install for mypy | #10 | afrittoli | closed | 4 months ago | 0
[Consolidated] Add Documentation | #9 | lchu-ibm | closed | 4 months ago | 0
change package name | #8 | lchu-ibm | closed | 4 months ago | 0
Clean up training configs | #7 | lchu-ibm | closed | 4 months ago | 0
Revert low_cpu_fsdp implementation | #6 | lchu-ibm | closed | 4 months ago | 2
Dataset cleanup, gnorm clipping/reporting | #5 | daviswer | closed | 4 months ago | 11
Ci | #4 | afrittoli | closed | 4 months ago | 0
Add lint and mypy CI jobs | #3 | afrittoli | closed | 4 months ago | 8
move scripts to scripts folder | #2 | lchu-ibm | closed | 4 months ago | 2
black and isort | #1 | nairbv | closed | 4 months ago | 0