pytorch-labs / float8_experimental
This repository contains the experimental PyTorch native float8 training UX.
License: BSD 3-Clause "New" or "Revised" License · 189 stars · 18 forks
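For context on the issue titles below: the training UX these PRs iterate on is module swapping, where `torch.nn.Linear` instances are replaced with float8 variants. A minimal sketch, assuming the `swap_linear_with_float8_linear` helper documented in the repository README; its exact import path and signature have shifted across the PRs tracked here:

```python
import torch
import torch.nn as nn

# Assumed API from the repository README; paths and signatures have
# changed across versions (e.g. Float8DynamicLinear was deleted in #304).
from float8_experimental.float8_linear import Float8Linear
from float8_experimental.float8_linear_utils import swap_linear_with_float8_linear

# Toy model; float8 matmuls need CUDA hardware with float8 support (e.g. H100).
m = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64)).cuda()

# Replace every nn.Linear with Float8Linear so forward/backward matmuls
# run in float8, with dynamic scaling as the default after #300.
swap_linear_with_float8_linear(m, Float8Linear)

# Training then proceeds as usual, optionally under torch.compile.
y = m(torch.randn(16, 64, device="cuda"))
y.sum().backward()
```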
Issues
| # | Title | Author | Status | Comments |
|---|---|---|---|---|
| #308 | fix nits from deletion of Float8DynamicLinear | vkuzo | opened 2 hours ago | 0 |
| #307 | unify linear test cases | vkuzo | opened 7 hours ago | 0 |
| #306 | static scaling support for training | vkuzo | closed 2 hours ago | 1 |
| #305 | Add rowwise scaling to Float8Inference module | drisspg | opened 5 days ago | 0 |
| #304 | delete Float8DynamicLinear | vkuzo | closed 3 days ago | 2 |
| #303 | Add utility for filtering out skipped tests in large cross-product groups | drisspg | opened 5 days ago | 0 |
| #302 | Add sanity checks to dtensor tests | drisspg | closed 5 days ago | 1 |
| #301 | Thread the scaling type argument throughout fp8 | drisspg | opened 5 days ago | 0 |
| #300 | [9/x]: make dynamic scaling default in Float8Linear | vkuzo | closed 5 days ago | 2 |
| #299 | [8/x] make single linear profiling script work with Float8 scaling type | vkuzo | closed 5 days ago | 2 |
| #298 | [7/x] make profiling script support Float8Linear dynamic scaling | vkuzo | closed 5 days ago | 2 |
| #297 | [6/x] switch inference tests to use Float8Linear | vkuzo | closed 5 days ago | 2 |
| #296 | [5/x] make FSDP2 with float8 all-gather work for Float8Linear | vkuzo | closed 5 days ago | 2 |
| #295 | Adds a test comparing the output of torch.compile and export | drisspg | opened 1 week ago | 0 |
| #294 | [4/x] add tests for DTensor TP/SP + Float8Linear | vkuzo | closed 5 days ago | 2 |
| #293 | [3/x]: simplify FSDP1 test and add coverage for dynamic scaling | vkuzo | closed 5 days ago | 2 |
| #292 | Float8Tensor.to_original_precision() returns wrong dtype | ani300 | opened 1 week ago | 0 |
| #291 | [2/x]: fix numerics integration test and test delayed vs dynamic | vkuzo | closed 5 days ago | 2 |
| #290 | [1/x]: Make Float8Linear support dynamic scaling | vkuzo | closed 5 days ago | 2 |
| #289 | make testing better on amd | drisspg | opened 2 weeks ago | 0 |
| #287 | Add a Float8LinearInference module to support static, dynamic, and wo quant | drisspg | closed 1 week ago | 5 |
| #286 | [ROCm] Unskip passing torch.compile test | alugorey | closed 2 weeks ago | 2 |
| #285 | Add more compile compatibility for Float8Tensor ops | ani300 | closed 1 week ago | 10 |
| #284 | Updates with new scaled-mm api | drisspg | closed 2 weeks ago | 4 |
| #283 | Adding Float8 Linear variants supporting inference-only with lower overhead | cyang49 | opened 3 weeks ago | 1 |
| #282 | add norm_ffn_norm to profile script | vkuzo | closed 1 week ago | 2 |
| #281 | QOL improvements to benchmarks/profile_linear_float8.py | vkuzo | closed 1 week ago | 2 |
| #280 | Docs should say what's the smallest model users will see a benefit for | msaroufim | opened 3 weeks ago | 1 |
| #279 | Expected trailing dimension of mat1 to be divisible by 16 but got mat1 shape | msaroufim | opened 3 weeks ago | 3 |
| #278 | QOL improvements to linear benchmarking script | vkuzo | closed 3 weeks ago | 4 |
| #277 | delayed scaling: stop syncing weight amax values across ranks | vkuzo | closed 3 weeks ago | 2 |
| #276 | delayed scaling: delete Float8LinearMixin | vkuzo | closed 3 weeks ago | 2 |
| #275 | add PrepareFloat8ModuleInput for sequence parallel | wanchaol | closed 3 weeks ago | 3 |
| #274 | [QST] Dynamic Scaling | jeromeku | opened 1 month ago | 2 |
| #273 | QOL improvements to linear benchmarking script | vkuzo | closed 3 weeks ago | 1 |
| #272 | delayed scaling: stop syncing weight amax values across ranks | vkuzo | closed 3 weeks ago | 1 |
| #271 | delayed scaling: delete Float8LinearMixin | vkuzo | closed 3 weeks ago | 3 |
| #270 | [not for land] enumerate breakages with module hooks + compile | vkuzo | opened 1 month ago | 1 |
| #269 | [not for land] testing ghstack 2 | vkuzo | closed 1 month ago | 0 |
| #268 | [not for land] testing ghstack | vkuzo | closed 1 month ago | 0 |
| #267 | delayed scaling safety logic currently doesn't work with activation checkpointing | vkuzo | opened 1 month ago | 0 |
| #266 | [FSDP2] pre-compute amax after optimizer.step for dynamic scaling | weifengpy | opened 1 month ago | 1 |
| #265 | [FSDP2] set vocab_size=32 to avoid must be divisible by 16 error | weifengpy | closed 1 month ago | 2 |
| #264 | [FSDP2] set `vocab_size=32` to avoid `must be divisible by 16` error | weifengpy | closed 1 month ago | 4 |
| #263 | enable float types in pytorch for non compute comms | drisspg | closed 1 month ago | 3 |
| #262 | add wait_tensor() after all_gather in float8 to fix mem leak | bdhirsh | closed 1 month ago | 2 |
| #261 | add wait_tensor() after all_gather in float8 to fix mem leak | bdhirsh | closed 1 month ago | 0 |
| #260 | [not for land] standalone repro of memory leak on float8 + compile + … | vkuzo | opened 1 month ago | 0 |
| #259 | memory alignment issue in torch.compile mode | czmrand | opened 1 month ago | 0 |
| #258 | [wip] make all 3 gemms in float8 linear configurable | vkuzo | opened 1 month ago | 0 |
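A recurring failure mode in the list above (#279, #265, #264) is the `must be divisible by 16` error: the float8 matmul kernel behind `torch._scaled_mm` rejects operands whose trailing dimension is not a multiple of 16. A minimal sketch of the padding workaround, using a hypothetical `pad_to_multiple_of_16` helper that is not part of this repository:

```python
import torch
import torch.nn.functional as F

def pad_to_multiple_of_16(x: torch.Tensor) -> torch.Tensor:
    """Zero-pad the trailing dimension up to the next multiple of 16.

    The float8 matmul path rejects inputs whose trailing dimension is
    not divisible by 16, the error reported in #279. Padding is one
    workaround; hypothetical helper, not part of this repository.
    """
    pad = (-x.shape[-1]) % 16  # 0 if already aligned
    return F.pad(x, (0, pad)) if pad else x

x = torch.randn(4, 42)                  # 42 is not divisible by 16
print(pad_to_multiple_of_16(x).shape)   # torch.Size([4, 48])
```

Issues #265 and #264 take the other route: picking sizes that already satisfy the constraint (`vocab_size=32`).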