lufficc / SSD

High quality, fast, modular reference implementation of SSD in PyTorch

Converting a weight file to a ScriptModule #87

Open cnfjsss opened 5 years ago

cnfjsss commented 5 years ago

Hi lufficc, I trained with your SSD model on Windows 7 and got a weight file (.pth). Now I want to run inference in C++, but I hit a problem during conversion. My conversion code is as follows:


import torch
import torchvision
import torchvision.models as models
import argparse
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel
from ssd.modeling.detector import build_detection_model
from ssd.config import cfg
from ssd.utils.checkpoint import CheckPointer

def main():
    parser = argparse.ArgumentParser(description="SSD weights file converter.")
    parser.add_argument(
        "--config-file",
        default="",
        metavar="FILE",
        help="path to config file",
        type=str,
    )
    parser.add_argument("--ckpt", type=str, default=None, help="Trained weights.")

    args = parser.parse_args()
    cfg.merge_from_file(args.config_file)
    cfg.freeze()

    # device = torch.device("cuda")
    device = torch.device("cpu")
    model = build_detection_model(cfg)
    model = model.to(device)
    print("###########finish building model...")

    state = torch.load(args.ckpt, map_location=torch.device("cpu"))
    if isinstance(model, DistributedDataParallel):
        model = model.module

    model.load_state_dict(state['model'], strict=True)
    model.eval()

    example = torch.rand(1, 3, 300, 300)
    traced_script_module = torch.jit.trace(model, example, optimize=False, check_trace=False)

    output = traced_script_module(example)

    traced_script_module.save('./outputs/vgg_ssd300_battery/vgg_ssd300_model.pt')


if __name__ == '__main__':
    main()

===============================================================
The error occurs in ssd\modeling\box_head\inference.py. The relevant info is below; I don't know how to fix it:

C:\workspace\deeplearning\SSD_battery>python modelConvert.py --config-file configs/vgg_ssd300_battery.yaml --ckpt outputs/vgg_ssd300_battery/model_011100.pth

C:\workspace\deeplearning\SSD_battery\ssd\modeling\detector\ssd_detector.py
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\box_head.py
C:\workspace\deeplearning\SSD_battery\ssd\modeling\anchors\prior_box.py:51: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  priors = torch.tensor(priors)
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py:19: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  for batch_id in range(batch_size):
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py:25: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  for class_id in range(1, per_img_scores.size(1)):  # skip background
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py:29: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if scores.size(0) == 0:
Traceback (most recent call last):
  File "modelConvert.py", line 98, in <module>
    main()
  File "modelConvert.py", line 66, in main
    traced_script_module = torch.jit.trace(model, example, optimize=False, check_trace=False)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\jit\__init__.py", line 688, in trace
    var_lookup_fn, _force_outplace)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 481, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "C:\workspace\deeplearning\SSD_battery\ssd\modeling\detector\ssd_detector.py", line 17, in forward
    detections, detector_losses = self.box_head(features, targets)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 481, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\box_head.py", line 28, in forward
    return self._forward_test(cls_logits, bbox_pred)
  File "C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\box_head.py", line 50, in _forward_test
    detections = self.post_processor(detections)
  File "C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py", line 38, in __call__
    nmsed_labels = torch.tensor([class_id] * keep.size(0), device=device)
TypeError: mul(): argument 'other' (position 1) must be Tensor, not list

==========================================================
The offending line of code seems to be:

nmsed_labels = torch.tensor([class_id] * keep.size(0), device=device)
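For context: under torch.jit.trace, keep.size(0) appears to be recorded as a tensor rather than a plain Python int, so multiplying a Python list by it triggers exactly this mul() error. A minimal sketch (not from the repo) of a trace-friendlier version of that line, assuming keep is the 1-D index tensor kept after NMS:

# Sketch only: build the labels with a tensor op so the length comes from
# keep itself instead of repeating a Python list.
nmsed_labels = torch.full_like(keep, class_id)

torch.full_like reuses keep's dtype and device, so no explicit device argument is needed; adjust the dtype if your keep indices are not int64.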

And I tried to modify it to:

nmsed_labels = torch.tensor(torch.tensor([class_id]) * keep.size(0), device=device)

I retried and got another error, as follows:

C:\workspace\deeplearning\SSD_battery>python modelConvert.py --config-file configs/vgg_ssd300_battery.yaml --ckpt outputs/vgg_ssd300_battery/model_011100.pth

C:\workspace\deeplearning\SSD_battery\ssd\modeling\detector\ssd_detector.py
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\box_head.py
C:\workspace\deeplearning\SSD_battery\ssd\modeling\anchors\prior_box.py:51: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  priors = torch.tensor(priors)
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py:19: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  for batch_id in range(batch_size):
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py:25: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  for class_id in range(1, per_img_scores.size(1)):  # skip background
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py:29: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if scores.size(0) == 0:
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py:38: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  nmsed_labels = torch.tensor(torch.tensor([class_id]) * keep.size(0), device=device)
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py:38: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  nmsed_labels = torch.tensor(torch.tensor([class_id]) * keep.size(0), device=device)
C:\workspace\deeplearning\SSD_battery\ssd\modeling\box_head\inference.py:54: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if processed_boxes.size(0) > self.cfg.TEST.MAX_PER_IMAGE > 0:
C:\ProgramData\Anaconda3\lib\site-packages\torch\tensor.py:435: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  'incorrect results).', category=RuntimeWarning)
Traceback (most recent call last):
  File "modelConvert.py", line 98, in <module>
    main()
  File "modelConvert.py", line 66, in main
    traced_script_module = torch.jit.trace(model, example, optimize=False, check_trace=False)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\jit\__init__.py", line 688, in trace
    var_lookup_fn, _force_outplace)
RuntimeError: Only tensors and (possibly nested) tuples of tensors or dicts are supported as inputs or outputs of traced functions, but instead got value of type Container.
Value: {'boxes': tensor([[-31.3635, -12.4522,  51.6803,  29.8587],
        [273.8119, -27.5646, 319.5384,  44.0863],
        [252.8701, -11.5197, 345.3777,  24.9226],
        [255.8968,  -0.6861, 348.1840,  46.8716],
        [255.9846,  46.9156, 348.1428,  94.8391],
        [255.3452, 238.8537, 348.7160, 286.5271],
        [-37.2208, 272.6428,  46.1568, 318.6552],
        [255.0182, 268.9600, 350.1513, 320.3033],
        [-23.5073, -50.7064,  62.0234,  93.5553],
        [-72.4709, -13.5491, 103.8439,  44.2117],
        [260.7636, -50.6445, 362.5735,  77.7416],
        [209.0299, -14.7150, 373.6892,  51.2166],
        [-31.0414, 219.7984,  50.1903, 372.1953],
        [-75.9403, 265.7273, 120.8142, 335.7289],
        [229.1378, 259.6524, 388.5985, 345.4294],
        [258.6255, 242.1371, 346.8571, 365.5750],
        [  6.6523,  76.8511, 284.6081, 247.5137],
        [-72.4709, -13.5491, 103.8439,  44.2117],
        [213.7746, -16.9424, 383.8231,  54.1286],
        [-13.8108, 218.5596,  51.9411, 411.9127],
        [ -5.0890, -14.4968,   6.0226,   0.6173],
        [264.6134, -62.1498, 338.0179, 110.8623],
        [274.1096, 251.7924, 334.7047, 406.7539]], grad_fn=<...>),
 'labels': tensor([17,  9, 12]),
 'scores': tensor([0.0188, 0.0104, 0.0221, 0.0103, 0.0101, 0.0106, 0.0117, 0.0148, 0.0121,
        0.0155, 0.0164, 0.0280, 0.0154, 0.0130, 0.0196, 0.0213, 0.1591, 0.0104,
        0.0104, 0.0107, 0.0213, 0.0102, 0.0118], grad_fn=<...>)}

I don't know how to resolve this; would you please give an example? Thanks a lot.
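On the second error: torch.jit.trace can only return tensors (or tuples/dicts of tensors), while the box head's post-processor returns the repo's Container object, which the tracer rejects. One common workaround is to trace a thin wrapper that unpacks the detections into a plain tuple of tensors. The sketch below is not from the repo: the wrapper name TraceableSSD is made up, it reuses model and example from the conversion script above, and it assumes the output exposes 'boxes', 'scores' and 'labels' tensors (as the dump above suggests). It only fixes the output type; the TracerWarnings about Python loops in inference.py still mean the trace may be specialized to this particular input.

import torch
import torch.nn as nn

class TraceableSSD(nn.Module):  # hypothetical wrapper, not part of the repo
    def __init__(self, detector):
        super().__init__()
        self.detector = detector

    def forward(self, images):
        detections = self.detector(images)
        # Guess at the return shape: a list of Containers (one per image) or a
        # single Container for batch size 1; adjust to your build.
        det = detections[0] if isinstance(detections, (list, tuple)) else detections
        return det['boxes'], det['scores'], det['labels']

wrapper = TraceableSSD(model).eval()
traced_script_module = torch.jit.trace(wrapper, example, check_trace=False)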

cnfjsss commented 5 years ago

By the way, it works fine when I use your demo.py to run inference.

shaoeric commented 4 years ago

> By the way, it works fine when I use your demo.py to run inference.

I am having trouble with demo.py: the default ckpt vgg16_reducedfc.pth from the Internet is missing the "model" key. Could you please tell me how to fix it, or share a correct .pth file?
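As far as I can tell, vgg16_reducedfc.pth is only the ImageNet-pretrained VGG backbone used to initialize training, not a full detector checkpoint, so it is a plain state dict without a 'model' entry; demo.py expects a trained checkpoint such as the model_xxxxxx.pth files written to the outputs directory (e.g. outputs/vgg_ssd300_battery/model_011100.pth earlier in this thread). A small defensive-loading sketch that accepts either format, assuming model was built with build_detection_model(cfg) as in the conversion script above and ckpt_path is a placeholder:

import torch

# Sketch: load either a trainer checkpoint (weights wrapped under 'model')
# or a plain state dict such as the backbone file.
ckpt_path = 'outputs/vgg_ssd300_battery/model_011100.pth'  # placeholder path
state = torch.load(ckpt_path, map_location='cpu')
state_dict = state['model'] if isinstance(state, dict) and 'model' in state else state
model.load_state_dict(state_dict, strict=False)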

szupzp commented 4 years ago

@cnfjsss Hi, I want to export the ScriptModule too, but I get an error when I trace the model:

torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Graphs differed across invocations!

I also tried exporting with demo.py, but it doesn't work. Can you show me how to get the traced module?

szupzp commented 4 years ago


My PyTorch version is 1.4.
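The "Graphs differed across invocations" failure most likely comes from the data-dependent Python control flow in the box head's post-processing (the loops and if statements flagged by the TracerWarnings earlier in this thread): the trace check re-runs the model and the recorded graphs do not match. A sketch of the call used in the conversion script above, which skips that check (the resulting trace may still be specialized to the example input):

# check_trace=False skips the sanity re-run; the trace remains input-specific.
traced_script_module = torch.jit.trace(model, example, check_trace=False)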