CSAILVision / places365

The Places365-CNNs for Scene Classification
http://places2.csail.mit.edu/
MIT License

AlexNet works, but other models don't? #76

Open tlupinski1 opened 4 years ago

tlupinski1 commented 4 years ago

I'm currently using a modified Docker script to run video frame images through the AlexNet model. Everything works fine, and I get the output I want.

However, when I try to use a different model (ResNet or VGG16), the Docker image runs and instantiates the network, but then just stops. This is the last thing it outputs (with minloglevel = 0):

`I0303 17:38:36.301259 1 net.cpp:219] relu1_2 does not need backward computation.
I0303 17:38:36.301348 1 net.cpp:219] conv1_2 does not need backward computation.
I0303 17:38:36.301467 1 net.cpp:219] relu1_1 does not need backward computation.
I0303 17:38:36.301512 1 net.cpp:219] conv1_1 does not need backward computation.
I0303 17:38:36.301573 1 net.cpp:219] input does not need backward computation.
I0303 17:38:36.301602 1 net.cpp:261] This network produces output prob
I0303 17:38:36.301764 1 net.cpp:274] Network initialization done.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 543027219
I0303 17:38:37.421813 1 net.cpp:752] Ignoring source layer data
I0303 17:38:37.537808 1 net.cpp:752] Ignoring source layer loss`

And then it exits, with no further logging output or error messages. I'm quite confused as to how to continue.

I did get an error message beforehand, saying it couldn't broadcast an image of shape (3, 227, 277) to (3, 224, 224), so I changed the resize dimensions. This only happened with models other than AlexNet.
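The broadcast error suggests the preprocessing was sized for one model's input while the other models expect 224x224. As a minimal sketch, assuming a 227x227 source image, the crop region can be derived from the target shape instead of being hard-coded. The helper name `center_crop_box` is hypothetical. In pycaffe the target shape would typically be read from something like `net.blobs['data'].data.shape`, assuming the input blob is named `data`. Resizing instead of cropping would work equally well.

```python
def center_crop_box(src_hw, dst_hw):
    """Return (top, left, bottom, right) for a centered crop of size dst_hw."""
    src_h, src_w = src_hw
    dst_h, dst_w = dst_hw
    if dst_h > src_h or dst_w > src_w:
        raise ValueError(f"cannot crop {src_hw} to the larger size {dst_hw}")
    top = (src_h - dst_h) // 2
    left = (src_w - dst_w) // 2
    return top, left, top + dst_h, left + dst_w


# Shapes are (channels, height, width), as in the error message.
image_shape = (3, 227, 227)  # assumed size produced by the existing pipeline
input_shape = (3, 224, 224)  # size the network's input blob expects

top, left, bottom, right = center_crop_box(image_shape[1:], input_shape[1:])
print(top, left, bottom, right)

# With a NumPy array `img` of shape image_shape, the crop would be:
#     img[:, top:bottom, left:right]
```

Reading the expected shape from the loaded network, rather than editing a constant per model, keeps the same script usable when switching between models.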

Any help would be greatly appreciated! Thanks.

EDIT: Through some logging and debugging, I now know that the application stops when I run the net.forward() command, after instantiating the network, preprocessing the images, and loading them into the data layer.
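A process that dies inside net.forward() without any Python traceback may have been killed from outside, for example by a container memory limit, although nothing in this thread confirms that. As a rough diagnostic sketch, the classification script could be launched through a small wrapper that reports how the child process ended. The names `describe_exit` and `run_and_report` are hypothetical, and the child command below is a stand-in for the real script.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_report(argv):
    """Run a command, wait for it to finish, and describe how it ended."""
    result = subprocess.run(argv)
    return describe_exit(result.returncode)


# Stand-in child process; replace with the real command, for example
# [sys.executable, "classify.py", "frame.jpg"] (hypothetical script name).
print(run_and_report([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

A result of "killed by SIGKILL" would point toward an external kill such as an out-of-memory condition. Seen from outside the container, the same situation usually shows up as exit code 137 (128 + 9).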

HuaizhengZhang commented 4 years ago

https://github.com/HuaizhengZhang/scene-recognition-pytorch1.x-tf2.x

I have upgraded the models to PyTorch 1.4. You may want to try them.