microsoft / MMdnn

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch, ONNX and CoreML.
MIT License

Build torch network in docker container #922

Open egiacomazzi opened 3 years ago

egiacomazzi commented 3 years ago

Running in docker container

For the PyTorch to IR conversion, the whole model (weights + structure) is needed, as stated here. For that I used the following code:

import torch
from network import return_net

# returns network structure with input [112,112]
net = return_net([112, 112])

# path to the saved weights of the model
path_model = "weights.pth"

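# Load the previously saved weights into the network.
# Note: calling torch.save(net.state_dict(), path_model) at this point would
# overwrite the trained weights with the freshly initialised ones.
net.load_state_dict(torch.load(path_model))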

# Save whole model
# Specify a path
PATH = "entire_model.pth"
# Save
torch.save(net, PATH)

Here is a statement from the PyTorch documentation:

This save/load process uses the most intuitive syntax and involves the least amount of code. Saving a model in this way will save the entire module using Python’s pickle module. The disadvantage of this approach is that the serialized data is bound to the specific classes and the exact directory structure used when the model is saved. The reason for this is because pickle does not save the model class itself. Rather, it saves a path to the file containing the class, which is used during load time. Because of this, your code can break in various ways when used in other projects or after refactors.
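The behaviour described in that quote can be reproduced with plain `pickle`, without PyTorch. The following is a minimal sketch: `network_demo` and `TinyNet` are hypothetical stand-ins for the real `network` module and its model class, created in memory only for illustration. It shows that the pickled payload refers to the class by module and class name, and that loading fails once that module can no longer be imported.

```python
import pickle
import sys
import types

MODULE_NAME = "network_demo"  # hypothetical stand-in for the real "network" module


def install_demo_module():
    """Create an in-memory module that defines a small placeholder class."""
    module = types.ModuleType(MODULE_NAME)
    source = (
        "class TinyNet:\n"
        "    def __init__(self, input_size):\n"
        "        self.input_size = input_size\n"
    )
    exec(source, module.__dict__)
    sys.modules[MODULE_NAME] = module
    return module


def load_without_module(payload):
    """Try to unpickle after the defining module has become unavailable."""
    sys.modules.pop(MODULE_NAME, None)
    try:
        pickle.loads(payload)
    except ImportError as exc:
        return type(exc).__name__
    return None


module = install_demo_module()
payload = pickle.dumps(module.TinyNet([112, 112]))

# The payload names the module and the class; it does not contain the class code.
references_class_by_name = MODULE_NAME.encode() in payload and b"TinyNet" in payload

# Loading works while the defining module is importable...
restored = pickle.loads(payload)

# ...and fails with an import error once it is not.
error_name = load_without_module(payload)
```

This is why a file written with `torch.save(net, PATH)` can only be loaded in an environment where the module that defines the network class is importable under the same name.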

Given that statement, I understand that the entire network has to be built inside the Docker container (my code works outside of it). But when I try to execute it inside the container, I get this error:

Traceback (most recent call last):
  File "load_whole_ANN.py", line 1, in <module>
    import torch
ImportError: No module named torch

When I try to install torch in the container via pip install torch, the container responds:

Requirement already satisfied: torch in /usr/local/lib/python3.5/dist-packages (0.4.0)

How is it possible that torch cannot be found inside the container when MMdnn works with torch? Am I missing anything here? I would appreciate any help!
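One thing that may be worth checking is whether the interpreter that runs the script is the same one that pip installed torch for. The error wording `No module named torch`, without quotes around the module name, is the form Python 2 prints, while pip reports a Python 3.5 path, so the two commands may belong to different interpreters. This is an assumption and not something confirmed in this thread. The sketch below uses a hypothetical helper, `interpreter_report`, written to run on both Python 2 and Python 3, to print which interpreter is active and whether it can import a given module.

```python
import sys


def interpreter_report(module_name="torch"):
    """Describe the running interpreter and whether it can import module_name."""
    try:
        module = __import__(module_name)
        found = True
        location = getattr(module, "__file__", None)
    except ImportError:
        found = False
        location = None
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        "module_found": found,
        "module_location": location,
    }


if __name__ == "__main__":
    report = interpreter_report("torch")
    for key in sorted(report):
        print("%s = %s" % (key, report[key]))
```

Running this once with `python` and once with `python3` inside the container would show whether the two commands resolve to different interpreters. `python -m pip --version` likewise prints which Python installation a given pip belongs to.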