facebookresearch / multipathnet

A Torch implementation of the object detection network from "A MultiPath Network for Object Detection" (https://arxiv.org/abs/1604.02135)

Saving model crashes when training multipathnet #21

Open samson-wang opened 7 years ago

samson-wang commented 7 years ago

When execution reaches the following code at the end of a training epoch for the multipathnet model, the process crashes:

   print("Saving model to "..model_path)
   torch.save(model_path, utils.checkpoint(model))

The stack trace:

/home/samson/torch/install/bin/luajit: ./modules/ModelParallelTable.lua:357: ModelParallelTable only supports CudaTensor, not torch.FloatTensor
stack traceback:
    [C]: in function 'error'
    ./modules/ModelParallelTable.lua:357: in function 'type'
    /home/samson/torch/install/share/lua/5.1/nn/utils.lua:45: in function 'recursiveType'
    /home/samson/torch/install/share/lua/5.1/nn/utils.lua:41: in function 'recursiveType'
    /home/samson/torch/install/share/lua/5.1/nn/Module.lua:126: in function 'float'
    /data/home/samson/Repo/multipathnet/utils.lua:487: in function 'checkpoint'
    train.lua:196: in function 'save'
    train.lua:340: in function 'hooks'
    ./engines/fboptimengine.lua:79: in function 'train'
    train.lua:364: in main chunk
    [C]: in function 'dofile'
    ...mson/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00405d50

FYI, saving the model directly, without going through utils.checkpoint, works fine:

torch.save(model_path, model)