Closed: echarlaix closed this pull request 3 months ago
Fix inference for batched inputs for fp32 models coming from `min_dtype = torch.finfo(torch.float16).min`
Fixes incorrect inference for batched inputs with fp32 models, caused by the attention mask fill value being hardcoded as `min_dtype = torch.finfo(torch.float16).min` regardless of the model's dtype.
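For context, here is a minimal sketch of the kind of mask construction involved. It assumes the fix is to derive the fill value from the model's own dtype rather than hardcoding float16; `make_causal_mask` and its shape conventions are illustrative, not the PR's actual code.

```python
import torch

def make_causal_mask(seq_len: int, dtype: torch.dtype) -> torch.Tensor:
    """Illustrative 4D additive causal mask (not the PR's actual code).

    Masked positions are filled with the most negative value representable
    in the *model's* dtype, so the fill behaves as "minus infinity" and
    stays consistent with any downstream comparisons against it.
    """
    min_dtype = torch.finfo(dtype).min  # fix: derive from the model dtype
    # Buggy variant this PR addresses (hardcoded half-precision minimum):
    # min_dtype = torch.finfo(torch.float16).min
    mask = torch.full((seq_len, seq_len), min_dtype, dtype=dtype)
    mask = torch.triu(mask, diagonal=1)  # zero on and below the diagonal
    return mask[None, None, :, :]        # shape (1, 1, seq_len, seq_len)

# For an fp32 model, float16's minimum (-65504.0) is far from fp32's
# minimum (~-3.4e38), so mask entries filled with the float16 value no
# longer match torch.finfo(dtype).min, and code that relies on that
# equality, or on the fill acting as "-inf", can misbehave on padded
# batched inputs.
print(make_causal_mask(4, torch.float32))
```

The design point, as the title suggests, is simply that the fill value and the mask tensor should agree on dtype: a value chosen for float16 is not a safe sentinel inside a float32 mask.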