pytorch-labs/float8_experimental
This repository contains the experimental PyTorch native float8 training UX
BSD 3-Clause "New" or "Revised" License
212 stars · 20 forks
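Many of the issues and PRs below track the evolution of this training UX: unifying dynamic and delayed scaling in `Float8Linear`, FSDP2/DTensor integration, and `torch.compile` compatibility. For orientation, here is a minimal training-loop sketch assuming the module-swap helpers from the repo's `float8_linear_utils` (`swap_linear_with_float8_linear`, `sync_float8_amax_and_scale_history`); exact module paths and signatures shifted across the PRs listed here, so treat this as illustrative rather than canonical.

```python
import torch
import torch.nn as nn

# NOTE: assumed imports, based on the float8_experimental helpers at the time;
# module paths and signatures changed across the PRs tracked below.
from float8_experimental.float8_linear import Float8Linear
from float8_experimental.float8_linear_utils import (
    swap_linear_with_float8_linear,
    sync_float8_amax_and_scale_history,
)

# Toy model; feature and batch dims are multiples of 16 to satisfy the
# scaled-mm kernel constraint (see issues #279 and #264 below).
m = nn.Sequential(nn.Linear(1024, 4096), nn.Linear(4096, 1024)).cuda()

# Swap every nn.Linear for a Float8Linear.
swap_linear_with_float8_linear(m, Float8Linear)

opt = torch.optim.SGD(m.parameters(), lr=1e-3)
x = torch.randn(16, 1024, device="cuda")

for _ in range(3):
    opt.zero_grad()
    m(x).sum().backward()
    # Needed for delayed scaling only; dynamic scaling (made the default in
    # PR #300) keeps no cross-iteration amax history to synchronize.
    sync_float8_amax_and_scale_history(m)
    opt.step()
```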
Issues
#303 · Add utility for filtering out skipped tests in large cross-product groups · drisspg · opened 3 months ago · 0 comments
#302 · Add sanity checks to dtensor tests · drisspg · closed 3 months ago · 1 comment
#301 · Thread the scaling type argument throughout fp8 · drisspg · opened 3 months ago · 0 comments
#300 · [9/x]: make dynamic scaling default in Float8Linear · vkuzo · closed 3 months ago · 2 comments
#299 · [8/x] make single linear profiling script work with Float8 scaling type · vkuzo · closed 3 months ago · 2 comments
#298 · [7/x] make profiling script support Float8Linear dynamic scaling · vkuzo · closed 3 months ago · 2 comments
#297 · [6/x] switch inference tests to use Float8Linear · vkuzo · closed 3 months ago · 2 comments
#296 · [5/x] make FSDP2 with float8 all-gather work for Float8Linear · vkuzo · closed 3 months ago · 2 comments
#295 · Adds a test comparing the output of torch.compile and export · drisspg · opened 3 months ago · 0 comments
#294 · [4/x] add tests for DTensor TP/SP + Float8Linear · vkuzo · closed 3 months ago · 2 comments
#293 · [3/x]: simplify FSDP1 test and add coverage for dynamic scaling · vkuzo · closed 3 months ago · 2 comments
#292 · Float8Tensor.to_original_precision() returns wrong dtype · ani300 · closed 2 months ago · 1 comment
#291 · [2/x]: fix numerics integration test and test delayed vs dynamic · vkuzo · closed 3 months ago · 2 comments
#290 · [1/x]: Make Float8Linear support dynamic scaling · vkuzo · closed 3 months ago · 2 comments
#289 · make testing better on amd · drisspg · opened 3 months ago · 0 comments
#287 · Add a Float8LinearInference module to support static, dynamic, and weight-only quant · drisspg · closed 3 months ago · 5 comments
#286 · [ROCm] Unskip passing torch.compile test · alugorey · closed 3 months ago · 2 comments
#285 · Add more compile compatibility for Float8Tensor ops · ani300 · closed 3 months ago · 10 comments
#284 · Updates with new scaled-mm api · drisspg · closed 3 months ago · 4 comments
#283 · Adding Float8 Linear variants supporting inference-only with lower overhead · cyang49 · closed 2 months ago · 2 comments
#282 · add norm_ffn_norm to profile script · vkuzo · closed 3 months ago · 2 comments
#281 · QOL improvements to benchmarks/profile_linear_float8.py · vkuzo · closed 3 months ago · 2 comments
#280 · Docs should say what's the smallest model users will see a benefit for · msaroufim · closed 2 months ago · 2 comments
#279 · Expected trailing dimension of mat1 to be divisible by 16 but got mat1 shape · msaroufim · closed 2 months ago · 4 comments
#278 · QOL improvements to linear benchmarking script · vkuzo · closed 3 months ago · 4 comments
#277 · delayed scaling: stop syncing weight amax values across ranks · vkuzo · closed 3 months ago · 2 comments
#276 · delayed scaling: delete Float8LinearMixin · vkuzo · closed 3 months ago · 2 comments
#275 · add PrepareFloat8ModuleInput for sequence parallel · wanchaol · closed 3 months ago · 3 comments
#274 · [QST] Dynamic Scaling · jeromeku · closed 2 months ago · 3 comments
#273 · QOL improvements to linear benchmarking script · vkuzo · closed 3 months ago · 1 comment
#272 · delayed scaling: stop syncing weight amax values across ranks · vkuzo · closed 3 months ago · 1 comment
#271 · delayed scaling: delete Float8LinearMixin · vkuzo · closed 3 months ago · 3 comments
#270 · [not for land] enumerate breakages with module hooks + compile · vkuzo · opened 4 months ago · 1 comment
#269 · [not for land] testing ghstack 2 · vkuzo · closed 4 months ago · 0 comments
#268 · [not for land] testing ghstack · vkuzo · closed 4 months ago · 0 comments
#267 · delayed scaling safety logic currently doesn't work with activation checkpointing · vkuzo · closed 2 months ago · 1 comment
#266 · [FSDP2] precompute scale after optimizer.step for dynamic scaling · weifengpy · closed 2 months ago · 7 comments
#265 · [FSDP2] set vocab_size=32 to avoid must be divisible by 16 error · weifengpy · closed 4 months ago · 2 comments
#264 · [FSDP2] set `vocab_size=32` to avoid `must be divisible by 16` error · weifengpy · closed 4 months ago · 4 comments
#263 · enable float types in pytorch for non-compute comms · drisspg · closed 4 months ago · 3 comments
#262 · add wait_tensor() after all_gather in float8 to fix mem leak · bdhirsh · closed 4 months ago · 2 comments
#261 · add wait_tensor() after all_gather in float8 to fix mem leak · bdhirsh · closed 4 months ago · 0 comments
#260 · [not for land] standalone repro of memory leak on float8 + compile + … · vkuzo · opened 4 months ago · 0 comments
#259 · memory alignment issue in torch.compile mode · czmrand · closed 2 months ago · 1 comment
#258 · [wip] make all 3 gemms in float8 linear configurable · vkuzo · closed 2 months ago · 1 comment
#257 · Float8Linear does not support autocast · yitzhaklevi · closed 2 months ago · 2 comments
#256 · Add Dtensor compile test · drisspg · closed 5 months ago · 2 comments
#255 · make the backward of differentiable float8 casts pass gradient as is · vkuzo · closed 5 months ago · 3 comments
#254 · Better default for DelayedScalingRecipe.history_len · vkuzo · closed 2 months ago · 0 comments
#253 · Enable restricted split + cat in order to enable SP · drisspg · closed 5 months ago · 5 comments