pytorch-labs/float8_experimental
This repository contains the experimental PyTorch native float8 training UX
BSD 3-Clause "New" or "Revised" License
212 stars · 20 forks
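Many of the issues and PRs below track the evolution of this training UX: unifying dynamic and delayed scaling in `Float8Linear`, FSDP2/DTensor integration, and `torch.compile` compatibility. For orientation, here is a minimal training-loop sketch assuming the module-swap helpers from the repo's `float8_linear_utils` (`swap_linear_with_float8_linear`, `sync_float8_amax_and_scale_history`); exact module paths and signatures shifted across the PRs listed here, so treat this as illustrative rather than canonical.

```python
import torch
import torch.nn as nn

# NOTE: assumed imports, based on the float8_experimental helpers at the time;
# module paths and signatures changed across the PRs tracked below.
from float8_experimental.float8_linear import Float8Linear
from float8_experimental.float8_linear_utils import (
    swap_linear_with_float8_linear,
    sync_float8_amax_and_scale_history,
)

# Toy model; feature and batch dims are multiples of 16 to satisfy the
# scaled-mm kernel constraint (see issues #279 and #264 below).
m = nn.Sequential(nn.Linear(1024, 4096), nn.Linear(4096, 1024)).cuda()

# Swap every nn.Linear for a Float8Linear.
swap_linear_with_float8_linear(m, Float8Linear)

opt = torch.optim.SGD(m.parameters(), lr=1e-3)
x = torch.randn(16, 1024, device="cuda")

for _ in range(3):
    opt.zero_grad()
    m(x).sum().backward()
    # Needed for delayed scaling only; dynamic scaling (made the default in
    # PR #300) keeps no cross-iteration amax history to synchronize.
    sync_float8_amax_and_scale_history(m)
    opt.step()
```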
Issues
#303 · Add utility for filtering out skipped tests in large cross-product groups · drisspg · opened 3 months ago · 0 comments
#302 · Add sanity checks to dtensor tests · drisspg · closed 3 months ago · 1 comment
#301 · Thread the scaling type argument throughout fp8 · drisspg · opened 3 months ago · 0 comments
#300 · [9/x]: make dynamic scaling default in Float8Linear · vkuzo · closed 3 months ago · 2 comments
#299 · [8/x] make single linear profiling script work with Float8 scaling type · vkuzo · closed 3 months ago · 2 comments
#298 · [7/x] make profiling script support Float8Linear dynamic scaling · vkuzo · closed 3 months ago · 2 comments
#297 · [6/x] switch inference tests to use Float8Linear · vkuzo · closed 3 months ago · 2 comments
#296 · [5/x] make FSDP2 with float8 all-gather work for Float8Linear · vkuzo · closed 3 months ago · 2 comments
#295 · Adds a test comparing the output of torch.compile and export · drisspg · opened 3 months ago · 0 comments
#294 · [4/x] add tests for DTensor TP/SP + Float8Linear · vkuzo · closed 3 months ago · 2 comments
#293 · [3/x]: simplify FSDP1 test and add coverage for dynamic scaling · vkuzo · closed 3 months ago · 2 comments
#292 · Float8Tensor.to_original_precision() returns wrong dtype · ani300 · closed 2 months ago · 1 comment
#291 · [2/x]: fix numerics integration test and test delayed vs dynamic · vkuzo · closed 3 months ago · 2 comments
#290 · [1/x]: Make Float8Linear support dynamic scaling · vkuzo · closed 3 months ago · 2 comments
#289 · make testing better on amd · drisspg · opened 3 months ago · 0 comments
#287 · Add a Float8LinearInference module to support static, dynamic, and weight-only quant · drisspg · closed 3 months ago · 5 comments
#286 · [ROCm] Unskip passing torch.compile test · alugorey · closed 3 months ago · 2 comments
#285 · Add more compile compatibility for Float8Tensor ops · ani300 · closed 3 months ago · 10 comments
#284 · Updates with new scaled-mm api · drisspg · closed 3 months ago · 4 comments
#283 · Adding Float8 Linear variants supporting inference-only with lower overhead · cyang49 · closed 2 months ago · 2 comments
#282 · add norm_ffn_norm to profile script · vkuzo · closed 3 months ago · 2 comments
#281 · QOL improvements to benchmarks/profile_linear_float8.py · vkuzo · closed 3 months ago · 2 comments
#280 · Docs should say what's the smallest model users will see a benefit for · msaroufim · closed 2 months ago · 2 comments
#279 · Expected trailing dimension of mat1 to be divisible by 16 but got mat1 shape · msaroufim · closed 2 months ago · 4 comments
#278 · QOL improvements to linear benchmarking script · vkuzo · closed 3 months ago · 4 comments
#277 · delayed scaling: stop syncing weight amax values across ranks · vkuzo · closed 3 months ago · 2 comments
#276 · delayed scaling: delete Float8LinearMixin · vkuzo · closed 3 months ago · 2 comments
#275 · add PrepareFloat8ModuleInput for sequence parallel · wanchaol · closed 3 months ago · 3 comments
#274 · [QST] Dynamic Scaling · jeromeku · closed 2 months ago · 3 comments
#273 · QOL improvements to linear benchmarking script · vkuzo · closed 3 months ago · 1 comment
#272 · delayed scaling: stop syncing weight amax values across ranks · vkuzo · closed 3 months ago · 1 comment
#271 · delayed scaling: delete Float8LinearMixin · vkuzo · closed 3 months ago · 3 comments
#270 · [not for land] enumerate breakages with module hooks + compile · vkuzo · opened 4 months ago · 1 comment
#269 · [not for land] testing ghstack 2 · vkuzo · closed 4 months ago · 0 comments
#268 · [not for land] testing ghstack · vkuzo · closed 4 months ago · 0 comments
#267 · delayed scaling safety logic currently doesn't work with activation checkpointing · vkuzo · closed 2 months ago · 1 comment
#266 · [FSDP2] precompute scale after optimizer.step for dynamic scaling · weifengpy · closed 2 months ago · 7 comments
#265 · [FSDP2] set vocab_size=32 to avoid must be divisible by 16 error · weifengpy · closed 4 months ago · 2 comments
#264 · [FSDP2] set `vocab_size=32` to avoid `must be divisible by 16` error · weifengpy · closed 4 months ago · 4 comments
#263 · enable float types in pytorch for non-compute comms · drisspg · closed 4 months ago · 3 comments
#262 · add wait_tensor() after all_gather in float8 to fix mem leak · bdhirsh · closed 4 months ago · 2 comments
#261 · add wait_tensor() after all_gather in float8 to fix mem leak · bdhirsh · closed 4 months ago · 0 comments
#260 · [not for land] standalone repro of memory leak on float8 + compile + … · vkuzo · opened 4 months ago · 0 comments
#259 · memory alignment issue in torch.compile mode · czmrand · closed 2 months ago · 1 comment
#258 · [wip] make all 3 gemms in float8 linear configurable · vkuzo · closed 2 months ago · 1 comment
#257 · Float8Linear does not support autocast · yitzhaklevi · closed 2 months ago · 2 comments
#256 · Add Dtensor compile test · drisspg · closed 5 months ago · 2 comments
#255 · make the backward of differentiable float8 casts pass gradient as is · vkuzo · closed 5 months ago · 3 comments
#254 · Better default for DelayedScalingRecipe.history_len · vkuzo · closed 2 months ago · 0 comments
#253 · Enable restricted split + cat in order to enable SP · drisspg · closed 5 months ago · 5 comments