Closed Phaired closed 2 months ago
```
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
```
This error means that one of the tensors is on the CPU and the other is on the GPU. In this case, the dummy model input used during export is on the CPU, but the model weights are on the GPU. When I've exported models in the past, I always did so on a CPU-only device, so this is a bug in the training script: it doesn't account for the export running on a system with a GPU (!)
In `torch.onnx.export` calls, both the model and the dummy input need to be on the same device. The model is on the GPU here, so moving the dummy input to the GPU would involve a change in train_detection.py from:

```python
test_batch = next(iter(val_dataloader))
test_image = test_batch["image"][0:1]
torch.onnx.export(
```

to:

```python
test_batch = next(iter(val_dataloader))
test_image = test_batch["image"][0:1].to(device)
torch.onnx.export(
```
The other option would be to find the line that initializes `device` and make it use the CPU instead.
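The fix above can be sketched in isolation (a minimal sketch assuming PyTorch is installed; the `Linear` model and tensor shapes are hypothetical stand-ins, not the real detection model):

```python
import torch

# Hypothetical stand-in for the detection model; the real one lives in
# train_detection.py and may sit on the GPU after training.
model = torch.nn.Linear(4, 2)

# The dummy input starts on the CPU, like a batch pulled from val_dataloader.
dummy = torch.randn(1, 4)

# Derive the device from the model's own parameters so the dummy input
# lands wherever the weights are, whether that is a CPU-only machine or a GPU.
device = next(model.parameters()).device
dummy = dummy.to(device)

# With both on the same device, torch.onnx.export(model, dummy, "model.onnx")
# no longer raises the input/weight type mismatch.
```

Deriving `device` from the model's parameters avoids hard-coding either CPU or GPU, so the same script works in both environments.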
OK, that fixed the issue, but when trying to use the custom detection model with the CLI (for testing purposes), I got an error:
```
$ ocrs --detect-model text-detection.rten imga.png
Error: Failed to load text detection model from text-detection.rten

Caused by:
    parse error: Type `i32` at position 1313166418 is unaligned.
```
Can you upload the ONNX model somewhere? Also, can you confirm which version of `ocrs` you have installed (`ocrs --version`) and which version of `rten-convert` you used (`pip show rten-convert`)?
Here is the ONNX model: text-detection.zip.
```
$ ocrs --version
ocrs 0.8.0
```
As for `rten-convert`, I can't provide the exact version because I installed it recently on the pod used for training (which is now deleted), so it's likely the latest version.
Ah, I can see the problem. The `.rten` file format was changed recently to support larger models, but the published version of `ocrs` uses an older version of the rten library that does not recognize the latest format. The workaround is to pass the `--v1` flag when running `rten-convert` to force use of the older file format:

```
rten-convert --v1 text-detection.onnx
```
When I publish the next release of `ocrs`, it will support the latest (V2) `.rten` model format.
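Putting the workaround together, the end-to-end flow would look something like this (a sketch: the image filename is a placeholder, and the output filename assumes `rten-convert` defaults to the input's stem with a `.rten` extension):

```shell
# Convert the exported ONNX model using the V1 format so the published
# ocrs (0.8.0 here) can read it.
rten-convert --v1 text-detection.onnx

# Then point the CLI at the converted model (image name is a placeholder).
ocrs --detect-model text-detection.rten some-image.png
```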
Hey, I generated a synthetic dataset for my needs and trained the models. I wanted to export them to ONNX and then to RTEN, but it seems like I'm having trouble converting to ONNX. Am I missing something?

Running on an A100, CUDA 12.2, driver 535.154.05.