Closed james77777778 closed 1 week ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 69.24%. Comparing base (f12a205) to head (0c0a008).
Related to #19671
This PR introduces `training=False` behavior for float8-trained `Dense` and `EinsumDense` layers.

We could eliminate `amax_history` and preprocess the weights to bypass the transpose op in the compiled graph, but this would make the layers unrecoverable for further training. (We want to be able to continue training in fitting.) Perhaps the post-processing could be considered as a future plan.
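
For readers unfamiliar with the trade-off, here is a rough NumPy sketch of the idea, not the code in this PR. It assumes a delayed-scaling scheme in which the scale is derived from a rolling `amax_history`. `ToyFloat8State`, `compute_scale`, and `fake_quantize` are hypothetical names, and the quantization only simulates range clipping, not real float8 rounding. With `training=False` the stored scale is reused and the history is left untouched, so training can resume later.

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite value representable in float8 E4M3


def compute_scale(amax_history, fp8_max=E4M3_MAX):
    """Derive a scale from the largest recent absolute value (assumed scheme)."""
    amax = float(np.max(amax_history))
    return amax / fp8_max if amax > 0.0 else 1.0


def fake_quantize(x, scale, fp8_max=E4M3_MAX):
    """Simulate only the range clipping of float8; no mantissa rounding."""
    return np.clip(x / scale, -fp8_max, fp8_max) * scale


class ToyFloat8State:
    """Hypothetical stand-in for the per-tensor float8 state of a layer."""

    def __init__(self, history_length=4):
        self.amax_history = np.zeros(history_length, dtype=np.float32)
        self.scale = 1.0

    def quantize(self, x, training):
        # Both modes quantize with the scale computed from earlier steps.
        out = fake_quantize(x, self.scale)
        if training:
            # Training: record the current amax and refresh the scale
            # that the next step will use.
            self.amax_history = np.roll(self.amax_history, 1)
            self.amax_history[0] = np.max(np.abs(x))
            self.scale = compute_scale(self.amax_history)
        # Inference leaves amax_history and scale unchanged, so the state
        # stays usable if training is resumed.
        return out


state = ToyFloat8State()
x = np.array([1.0, -896.0], dtype=np.float32)
state.quantize(x, training=True)          # updates amax_history and scale
y = state.quantize(x, training=False)     # reuses the stored scale only
```

Dropping `amax_history` after training would simplify the inference path in this sketch as well, but the state needed to keep updating the scale would be gone, which is the recoverability concern described above.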