Closed: jlxue closed this issue 2 years ago
Use CUDAExecutionProvider by default when optimizing FP16 models; this avoids unintentionally casting all fp16 inputs/weights to fp32.
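A minimal sketch of the intended usage: create an ONNX Runtime session for an FP16 model with CUDAExecutionProvider listed first, so fp16 tensors can run natively on GPU instead of being converted to fp32. The model path `"model_fp16.onnx"` and the helper name `load_fp16_session` are placeholders for illustration, not from this PR; this assumes `onnxruntime-gpu` is installed.

```python
# Provider order matters: CUDA first keeps fp16 inference on GPU,
# with CPU as a fallback if CUDA is unavailable at runtime.
FP16_PROVIDERS = ["CUDAExecutionProvider", "CPUExecutionProvider"]

def load_fp16_session(model_path: str):
    """Create an inference session that prefers CUDA for an FP16 model."""
    # Deferred import so this sketch can be read/loaded without onnxruntime-gpu present.
    import onnxruntime as ort
    return ort.InferenceSession(model_path, providers=FP16_PROVIDERS)

# Usage (placeholder path):
# session = load_fp16_session("model_fp16.onnx")
```

Keeping CUDAExecutionProvider as the default for FP16 optimization means the optimizer does not need to insert fp16-to-fp32 Cast nodes for CPU compatibility.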