huggingface/nanotron
Minimalistic large language model 3D-parallelism training
Apache License 2.0
1.14k stars, 107 forks
issues
Update `src/nanotron/config/config.py`
#33
saforem2
closed
8 months ago
1
Question concerning Megatron-style sequence parallel support plans.
#32
veritas9872
closed
8 months ago
1
test new CI
#31
glegendre01
closed
8 months ago
0
Question: Roadmap / Feature Scope
#30
Algomancer
closed
8 months ago
1
Some sanity fix for "PR [Feature] Topology-agnostic optimizer states loading"
#29
xrsrke
closed
8 months ago
0
Some fixes for `Refactors and fixes #25`
#28
3outeille
closed
8 months ago
1
Removing slow models
#27
thomwolf
closed
8 months ago
0
Add document
#26
xrsrke
closed
8 months ago
3
Refactors and fixes
#25
NouamaneTazi
closed
8 months ago
0
Quick fixes of error messages
#24
NouamaneTazi
closed
8 months ago
0
[Feature] Topology-agnostic optimizer states loading
#23
xrsrke
closed
8 months ago
1
Fix the issue of some missing replacements related to 'dpg' and 'para…
#22
xrsrke
closed
8 months ago
0
Save checkpoint before terminating the training run
#21
xrsrke
closed
8 months ago
0
[Refactor] DistributedOptimizer and FP32GradAccum
#20
NouamaneTazi
opened
8 months ago
0
[Refactor] Add support to resume training using optimizer states with different topology
#19
NouamaneTazi
closed
8 months ago
0
[Refactor] Refactor module names / module ids usage
#18
NouamaneTazi
opened
8 months ago
0
[Refactor] Remove legacy code from serialization
#17
NouamaneTazi
closed
8 months ago
1
[Refactor] Add minimal ParallelContext
#16
xrsrke
closed
8 months ago
0
Helping making brrr depend on nanotron
#15
thomwolf
closed
8 months ago
0
Add check for model gradient in DDP
#14
NouamaneTazi
closed
8 months ago
0
[Bug] Can't do inference from a nanotron-generated checkpoint
#13
xrsrke
closed
8 months ago
1
Remove Apex dependency
#12
NouamaneTazi
closed
8 months ago
1
[Refactor] Add ParallelContext to nanotron
#11
xrsrke
closed
8 months ago
0
New APIs
#10
xrsrke
opened
8 months ago
0
Update fused-rotary-embedding branch with main
#9
3outeille
closed
9 months ago
0
Fixing testsuite
#8
3outeille
closed
9 months ago
0
[LARGE] Bring all recent updates from brrr – reducing dependencies
#7
thomwolf
closed
9 months ago
2
Fused rotary embedding
#6
3outeille
closed
8 months ago
0
Fused Layer Norm
#5
xrsrke
closed
8 months ago
2
Add time recorder
#4
xrsrke
closed
8 months ago
2
More refactoring
#3
NouamaneTazi
closed
5 months ago
0
Useful scripts
#2
NouamaneTazi
closed
9 months ago
1
Renaming 🎊
#1
NouamaneTazi
closed
12 months ago
0