graphcore-research unit-scaling issues

graphcore-research / unit-scaling

A library for unit scaling in PyTorch

https://graphcore-research.github.io/unit-scaling/

Apache License 2.0

102 stars, 7 forks

issues


fix sgd bug and add demo notebook

#74 thecharlieblake closed 6 days ago (0 comments)
Adam and SGD update learning rate scaling.

#73 norpadon opened 2 weeks ago (10 comments)
Remove the constraint argument in the container modules uu.{MLP, MHSA}

#72 DouglasOrr closed 1 month ago (0 comments)
Import Conv1d into top-level unit_scaling (fixes #69)

#71 DouglasOrr closed 1 month ago (0 comments)
`MLP` scaling w.r.t. `expansion_factor`

#70 EIFY opened 1 month ago (1 comment)
`Conv1d` isn't made visible

#69 EIFY closed 1 month ago (1 comment)
[In Progress] Sharing Unit-Mamba Implementation

#68 norikazu99 opened 2 months ago (1 comment)
Conv1d

#67 thecharlieblake closed 2 months ago (0 comments)
Sharing Unit-Conv1d Implementation

#66 norikazu99 closed 2 months ago (3 comments)
Add initial how_to_scale_op notebook

#65 DouglasOrr closed 2 months ago (0 comments)
Migrate blog post 'almost scaled dot product attention' to the graphcore blog

#64 DouglasOrr closed 2 months ago (0 comments)
Custom loss unit scaling

#63 norikazu99 closed 3 months ago (6 comments)
[Question] What is 'mult' parameter?

#62 mmorinag127 opened 3 months ago (4 comments)
Fix unit_scaling setup.py install (#60)

#61 DouglasOrr closed 3 months ago (0 comments)
ModuleNotFoundError: No module named 'unit_scaling.core'

#60 alxndrTL closed 3 months ago (2 comments)
Add a package version and unit_scaling.__version__ (0.1)

#59 DouglasOrr closed 3 months ago (0 comments)
Updates to support u-muP, as the new default behaviour

#58 DouglasOrr closed 3 months ago (1 comment)
Fix recursion in torch_nn_modules_to_user_modules()

#57 DouglasOrr closed 3 months ago (0 comments)
Parametrize the number of random bits used in stochastic rounding

#56 awf closed 4 months ago (0 comments)
Update license and numpy

#55 thecharlieblake closed 4 months ago (0 comments)
Update to PyTorch 2.2 addendum: Fixing a doc typo and logic mismatch

#54 awf closed 6 months ago (0 comments)
Proposal to refactor a state-carrying closure to a class

#53 awf closed 6 months ago (0 comments)
Update to PyTorch 2.2

#52 awf closed 6 months ago (0 comments)
Revert "Changes grad_w default scaling to be 0.75"

#51 thecharlieblake closed 6 months ago (1 comment)
Fix torch version at 2.1 for now

#50 thecharlieblake closed 6 months ago (0 comments)
Add dependencies and dataset download to almost_scaled_dot_product_attention blog

#49 DouglasOrr closed 1 year ago (0 comments)
[almost-scaled blog] Introduce terms {d_seq, t}

#48 DouglasOrr closed 1 year ago (0 comments)
PyTorch 2.1 fixes

#47 thecharlieblake closed 1 year ago (0 comments)
Add blog draft for 'almost scaled dot product attention'

#46 DouglasOrr closed 1 year ago (2 comments)
[Feature Request] Any plan to support fp8 type at latest torch version?

#45 MoFHeka opened 1 year ago (5 comments)
Update limitations and add link to notebook

#44 thecharlieblake closed 1 year ago (0 comments)
Fix torch.nn functional issue in docs

#43 thecharlieblake closed 1 year ago (0 comments)
Tidy docs for beta release

#42 thecharlieblake closed 1 year ago (0 comments)
Tidy plots

#41 thecharlieblake closed 1 year ago (0 comments)
Add compile transform

#40 thecharlieblake closed 1 year ago (0 comments)
Fix setup dependencies

#39 thecharlieblake closed 1 year ago (0 comments)
Fix pea requirement

#38 thecharlieblake closed 1 year ago (0 comments)
Add separate tau scaling

#37 thecharlieblake closed 1 year ago (0 comments)
Update scaled_dot_product_attention scaling factor from our derivation.

#36 DouglasOrr closed 1 year ago (0 comments)
Fix a torch FX error on torch-2.0.1 '__annotations__ must be set to a dict object'

#35 DouglasOrr closed 1 year ago (0 comments)
Prepare for beta release

#34 thecharlieblake closed 1 year ago (0 comments)
Support sdist builds

#33 DouglasOrr closed 1 year ago (0 comments)
Visualisation tool

#32 thecharlieblake closed 1 year ago (0 comments)
Changes grad_w default scaling to be 0.75

#31 thecharlieblake closed 1 year ago (0 comments)
Fix setuptools packages

#30 thecharlieblake closed 1 year ago (0 comments)
WIP, DO NOT MERGE, demo IPU training script

#29 DouglasOrr closed 1 year ago (0 comments)
Quantisation for IPU

#28 DouglasOrr closed 1 year ago (0 comments)
Add unit_scale transform

#27 thecharlieblake closed 1 year ago (0 comments)
Change constraints to strings

#26 thecharlieblake closed 1 year ago (0 comments)
Format simulation (ready for review)

#25 thecharlieblake closed 1 year ago (0 comments)