
Options for FP32 precision #7

Open zhuhaozhe opened 5 months ago

zhuhaozhe commented 5 months ago

Flag Names

Option 1: set_float32_precision + a level from (highest, high, medium).

Option 2: allow_ + data_type, e.g., allow_tf32; this style is easier to extend to newer dtypes. (Both styles echo existing PyTorch knobs; see the reference snippet below.)
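
For comparison, PyTorch today already exposes both naming styles for matmul, which may help judge how each reads in user code. This is existing API shown only as a point of reference, not the proposal itself:

```python
import torch

# Existing tri-level knob, in the spirit of Option 1's naming:
# one call taking "highest" | "high" | "medium".
torch.set_float32_matmul_precision("high")

# Existing per-dtype boolean switches, in the spirit of Option 2's naming:
# a new reduced-precision dtype would simply add another allow_* flag.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
```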

How/where to place the flags

Option 1: a single flag, set_fp32_precision, under torch.* that covers all operators (conv, matmul, lstm) and all backends (CUDA, CUDNN, MKLDNN); backend-specific flags are not allowed.

Option 2: a layered structure. Each flag has an Optional type (a bool, or an enum value from highest/high/medium), so that whenever a level is not explicitly set, it falls back to its parent's value (sketched after Option 3 below).

Option 3: three flags, set_fp32_conv/rnn/matmul_precision, under torch.* that apply to all backends. Backends may additionally expose backend-specific flags, and setting a backend-specific flag also changes the corresponding backend-agnostic flag (sketched below).
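
A minimal sketch of Option 2's fallback behavior. The node names (fp32, cuda, cuda.matmul) and the default level are assumptions for illustration, not real torch API:

```python
from typing import Optional

class PrecisionNode:
    """Each node holds an Optional precision; None means 'inherit from parent'."""

    def __init__(self, name: str, parent: Optional["PrecisionNode"] = None):
        self.name = name
        self.parent = parent
        self.precision: Optional[str] = None  # "highest" | "high" | "medium" or None

    def effective_precision(self) -> str:
        if self.precision is not None:
            return self.precision          # explicitly set on this node
        if self.parent is not None:
            return self.parent.effective_precision()  # fall back to parent
        return "highest"                   # assumed global default

# torch-wide -> per-backend -> per-op, all hypothetical names
root = PrecisionNode("fp32")
cuda = PrecisionNode("cuda", parent=root)
cuda_matmul = PrecisionNode("cuda.matmul", parent=cuda)

root.precision = "high"
print(cuda_matmul.effective_precision())   # "high"   (inherited from root)
cuda_matmul.precision = "medium"
print(cuda_matmul.effective_precision())   # "medium" (own setting wins)
```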
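
And a minimal sketch of the coupling described in Option 3, again with hypothetical names and a plain dict standing in for real torch state:

```python
# Generic per-op knobs plus one backend-specific knob (hypothetical names).
_fp32_precision = {
    "matmul": "highest",
    "conv": "highest",
    "rnn": "highest",
    "cuda.matmul": "highest",
}

def set_fp32_matmul_precision(level: str) -> None:
    # Backend-agnostic knob: applies to every backend.
    _fp32_precision["matmul"] = level

def set_cuda_fp32_matmul_precision(level: str) -> None:
    # Backend-specific knob: per Option 3, it also updates the
    # backend-agnostic flag, keeping the two in sync.
    _fp32_precision["cuda.matmul"] = level
    _fp32_precision["matmul"] = level

set_cuda_fp32_matmul_precision("medium")
print(_fp32_precision["matmul"])   # "medium" -- the generic flag changed too
```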