graphcore-research/unit-scaling
A library for unit scaling in PyTorch
https://graphcore-research.github.io/unit-scaling/
Apache License 2.0
102 stars
7 forks
issues
fix sgd bug and add demo notebook
#74
thecharlieblake
closed
6 days ago
0
Adam and SGD update learning rate scaling.
#73
norpadon
opened
2 weeks ago
10
Remove the constraint argument in the container modules uu.{MLP, MHSA}
#72
DouglasOrr
closed
1 month ago
0
Import Conv1d into top-level unit_scaling (fixes #69)
#71
DouglasOrr
closed
1 month ago
0
`MLP` scaling w.r.t. `expansion_factor`
#70
EIFY
opened
1 month ago
1
`Conv1d` isn't made visible
#69
EIFY
closed
1 month ago
1
[In Progress] Sharing Unit-Mamba Implementation
#68
norikazu99
opened
2 months ago
1
Conv1d
#67
thecharlieblake
closed
2 months ago
0
Sharing Unit-Conv1d Implementation
#66
norikazu99
closed
2 months ago
3
Add initial how_to_scale_op notebook
#65
DouglasOrr
closed
2 months ago
0
Migrate blog post 'almost scaled dot product attention' to the graphcore blog
#64
DouglasOrr
closed
2 months ago
0
Custom loss unit scaling
#63
norikazu99
closed
3 months ago
6
[Question] What is 'mult' parameter?
#62
mmorinag127
opened
3 months ago
4
Fix unit_scaling setup.py install (#60)
#61
DouglasOrr
closed
3 months ago
0
ModuleNotFoundError: No module named 'unit_scaling.core'
#60
alxndrTL
closed
3 months ago
2
Add a package version and unit_scaling.__version__ (0.1)
#59
DouglasOrr
closed
3 months ago
0
Updates to support u-muP, as the new default behaviour
#58
DouglasOrr
closed
3 months ago
1
Fix recursion in torch_nn_modules_to_user_modules()
#57
DouglasOrr
closed
3 months ago
0
Parametrize the number of random bits used in stochastic rounding
#56
awf
closed
4 months ago
0
Update license and numpy
#55
thecharlieblake
closed
4 months ago
0
Update to PyTorch 2.2 addendum: Fixing a doc typo and logic mismatch
#54
awf
closed
6 months ago
0
Proposal to refactor a state-carrying closure to a class
#53
awf
closed
6 months ago
0
Update to PyTorch 2.2
#52
awf
closed
6 months ago
0
Revert "Changes grad_w default scaling to be 0.75"
#51
thecharlieblake
closed
6 months ago
1
Fix torch version at 2.1 for now
#50
thecharlieblake
closed
6 months ago
0
Add dependencies and dataset download to almost_scaled_dot_product_attention blog
#49
DouglasOrr
closed
1 year ago
0
[almost-scaled blog] Introduce terms {d_seq, t}
#48
DouglasOrr
closed
1 year ago
0
PyTorch 2.1 fixes
#47
thecharlieblake
closed
1 year ago
0
Add blog draft for 'almost scaled dot product attention'
#46
DouglasOrr
closed
1 year ago
2
[Feature Request] Any plan to support fp8 type at latest torch version?
#45
MoFHeka
opened
1 year ago
5
Update limitations and add link to notebook
#44
thecharlieblake
closed
1 year ago
0
Fix torch.nn functional issue in docs
#43
thecharlieblake
closed
1 year ago
0
Tidy docs for beta release
#42
thecharlieblake
closed
1 year ago
0
Tidy plots
#41
thecharlieblake
closed
1 year ago
0
Add compile transform
#40
thecharlieblake
closed
1 year ago
0
Fix setup dependencies
#39
thecharlieblake
closed
1 year ago
0
Fix pea requirement
#38
thecharlieblake
closed
1 year ago
0
Add separate tau scaling
#37
thecharlieblake
closed
1 year ago
0
Update scaled_dot_product_attention scaling factor from our derivation.
#36
DouglasOrr
closed
1 year ago
0
Fix a torch FX error on torch-2.0.1 '__annotations__ must be set to a dict object'
#35
DouglasOrr
closed
1 year ago
0
Prepare for beta release
#34
thecharlieblake
closed
1 year ago
0
Support sdist builds
#33
DouglasOrr
closed
1 year ago
0
Visualisation tool
#32
thecharlieblake
closed
1 year ago
0
Changes grad_w default scaling to be 0.75
#31
thecharlieblake
closed
1 year ago
0
Fix setuptools packages
#30
thecharlieblake
closed
1 year ago
0
WIP, DO NOT MERGE, demo IPU training script
#29
DouglasOrr
closed
1 year ago
0
Quantisation for IPU
#28
DouglasOrr
closed
1 year ago
0
Add unit_scale transform
#27
thecharlieblake
closed
1 year ago
0
Change constraints to strings
#26
thecharlieblake
closed
1 year ago
0
Format simulation (ready for review)
#25
thecharlieblake
closed
1 year ago
0