Open icyxp opened 3 months ago
I used this project to convert the model to FP8: https://github.com/neuralmagic/AutoFP8
They are not the same type of problem. Mine is an FP8 loading problem, and his is a Marlin problem. @drbh
The config has "activation_scheme": "dynamic", and the tensor dtypes are:
model.layers.0.mlp.down_proj.weight < F8_E4M3
model.layers.0.mlp.down_proj.weight_scale < F32
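For anyone who wants to check the same thing on their own checkpoint, here is a minimal sketch of how the tensor dtypes can be listed. It assumes the converted weights are stored as a safetensors file and reads only the JSON header (an 8-byte little-endian length followed by the header itself), so no weights are loaded. The function name `read_safetensors_dtypes` and the path in the usage comment are hypothetical, not part of AutoFP8 or TGI.

```python
import json
import struct


def read_safetensors_dtypes(path):
    """Return {tensor_name: dtype} read from a safetensors file header."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian u64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # The header is a JSON object mapping tensor names to their metadata.
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {
        name: info["dtype"]
        for name, info in header.items()
        if name != "__metadata__"
    }


# Usage (hypothetical path):
# for name, dtype in read_safetensors_dtypes("model.safetensors").items():
#     print(name, "<", dtype)
```

With a checkpoint like the one above, the weight entries would be expected to show up as F8_E4M3 and the `weight_scale` entries as F32.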
System Info
Information
Tasks
Reproduction
none
Expected behavior
none