rioyokotalab / caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework.
https://caffe2.ai

AttributeError: Method ImageInput is not a registered operator #1

Closed: Hiroki11x closed this issue 7 years ago

Hiroki11x commented 7 years ago
$ python resnet50_trainer.py --train_data /path-to/ilsvrc12_train_lmdb
Ignoring @/caffe2/caffe2/contrib/nccl:nccl_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops_gpu as it is not a valid file.
Ignoring @/caffe2/caffe2/distributed:file_store_handler_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/distributed:redis_store_handler_ops as it is not a valid file.
INFO:resnet50_trainer:Running on GPUs: [0]
INFO:resnet50_trainer:Using epoch size: 1500000
INFO:data_parallel_model:Parallelizing model for devices: [0]
INFO:data_parallel_model:Create input and model training operators
INFO:data_parallel_model:Model for GPU : 0
Traceback (most recent call last):
  File "resnet50_trainer.py", line 490, in <module>
    main()
  File "resnet50_trainer.py", line 486, in main
    Train(args)
  File "resnet50_trainer.py", line 339, in Train
    optimize_gradient_memory=True,
  File "/home/hiroki11/caffe2/build/caffe2/python/data_parallel_model.py", line 24, in Parallelize_GPU
    Parallelize(*args, **kwargs)
  File "/home/hiroki11/caffe2/build/caffe2/python/data_parallel_model.py", line 142, in Parallelize
    input_builder_fun(model_helper_obj)
  File "resnet50_trainer.py", line 328, in add_image_input
    img_size=args.image_size,
  File "resnet50_trainer.py", line 61, in AddImageInput
    mirror=1
  File "/home/hiroki11/caffe2/build/caffe2/python/brew.py", line 104, in scope_wrapper
    return func(*args, **new_kwargs)
  File "/home/hiroki11/caffe2/build/caffe2/python/helpers/tools.py", line 21, in image_input
    data, label = model.net.ImageInput(
  File "/home/hiroki11/caffe2/build/caffe2/python/core.py", line 1840, in __getattr__
    ",".join(workspace.C.nearby_opnames(op_type)) + ']'
AttributeError: Method ImageInput is not a registered operator. Did you mean: []
sekiya-a commented 7 years ago
$ PYTHONPATH=/usr/local python caffe2/python/examples/resnet50_trainer.py --train_data /path/to/ilsvrc12_train_lmdb
Ignoring @/caffe2/caffe2/contrib/nccl:nccl_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops_gpu as it is not a valid file.
Ignoring @/caffe2/caffe2/distributed:file_store_handler_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/distributed:redis_store_handler_ops as it is not a valid file.
INFO:resnet50_trainer:Running on GPUs: [0]
INFO:resnet50_trainer:Using epoch size: 1500000
Traceback (most recent call last):
  File "caffe2/python/examples/resnet50_trainer.py", line 462, in <module>
    main()
  File "caffe2/python/examples/resnet50_trainer.py", line 458, in main
    Train(args)
  File "caffe2/python/examples/resnet50_trainer.py", line 301, in Train
    data_parallel_model.Parallelize(
AttributeError: 'module' object has no attribute 'Parallelize'
$ # Built it myself (it was looking for cudnn5 at runtime, so I ran apt-get autoremove libcudnn5)
$ PYTHONPATH=. python caffe2/python/examples/resnet50_trainer.py --train_data /path/to/ilsvrc12_train_lmdb
Ignoring @/caffe2/caffe2/contrib/nccl:nccl_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops_gpu as it is not a valid file.
Ignoring @/caffe2/caffe2/distributed:file_store_handler_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/distributed:redis_store_handler_ops as it is not a valid file.
INFO:resnet50_trainer:Running on GPUs: [0]
INFO:resnet50_trainer:Using epoch size: 1500000
INFO:data_parallel_model:Parallelizing model for devices: [0]
INFO:data_parallel_model:Create input and model training operators
INFO:data_parallel_model:Model for GPU : 0
INFO:data_parallel_model:Adding gradient operators
INFO:data_parallel_model:Add gradient all-reduces for SyncSGD
INFO:data_parallel_model:Post-iteration operators for updating params
INFO:data_parallel_model:Calling optimizer builder function
INFO:data_parallel_model:Add initial parameter sync
WARNING:data_parallel_model:------- DEPRECATED API, please use data_parallel_model.OptimizeGradientMemory() -----
WARNING:memonger:NOTE: Executing memonger to optimize gradient memory
INFO:memonger:Remapping 111 blobs, using 14 shared
INFO:memonger:Memonger memory optimization took 0.0161368846893 secs
INFO:resnet50_trainer:Starting epoch 0/1000
INFO:resnet50_trainer:Finished iteration 1/46875 of epoch 0 (25.48 images/sec)
INFO:resnet50_trainer:Training loss: 7.22864294052, accuracy: 0.0
INFO:resnet50_trainer:Finished iteration 2/46875 of epoch 0 (107.79 images/sec)
INFO:resnet50_trainer:Training loss: 21.5477619171, accuracy: 0.0
INFO:resnet50_trainer:Finished iteration 3/46875 of epoch 0 (112.01 images/sec)
INFO:resnet50_trainer:Training loss: 17.5498409271, accuracy: 0.0
INFO:resnet50_trainer:Finished iteration 4/46875 of epoch 0 (112.40 images/sec)
INFO:resnet50_trainer:Training loss: 25.3153591156, accuracy: 0.0
(rest of output omitted)
$ nvidia-docker run -it --rm -v /path/to/ilsvrc12_train_lmdb:/data caffe2ai/caffe2 python caffe2/python/examples/resnet50_trainer.py --train_data /data
Ignoring @/caffe2/caffe2/contrib/nccl:nccl_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops_gpu as it is not a valid file.
Ignoring @/caffe2/caffe2/distributed:file_store_handler_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/distributed:redis_store_handler_ops as it is not a valid file.
INFO:resnet50_trainer:Running on GPUs: [0]
INFO:resnet50_trainer:Using epoch size: 1500000
INFO:data_parallel_model:Parallelizing model for devices: [0]
INFO:data_parallel_model:Create input and model training operators
INFO:data_parallel_model:Model for GPU: 0
INFO:data_parallel_model:Adding gradient operators
INFO:data_parallel_model:Add gradient all-reduces for SyncSGD
INFO:data_parallel_model:Post-iteration operators for updating params
INFO:data_parallel_model:Add initial parameter sync
WARNING:memonger:NOTE: Executing memonger to optimize gradient memory
INFO:memonger:Remapping 128 blobs, using 19 shared
INFO:memonger:Memonger optimization took 0.00878810882568 secs
INFO:resnet50_trainer:Starting epoch 0/1000
INFO:resnet50_trainer:Finished iteration 1/46875 of epoch 0 (7.29 images/sec)
INFO:resnet50_trainer:Finished iteration 2/46875 of epoch 0 (119.08 images/sec)
INFO:resnet50_trainer:Finished iteration 3/46875 of epoch 0 (120.80 images/sec)
INFO:resnet50_trainer:Finished iteration 4/46875 of epoch 0 (121.48 images/sec)
(rest of output omitted)

The Docker image is v0.6.0, so it is a bit old.

sekiya-a commented 7 years ago

At first I thought the protobuf version was to blame, but that turned out not to be the case, so I'm not sure what the cause is.

Hiroki11x commented 7 years ago

@sekiya-a You had already replied; sorry I didn't notice. After updating to the latest version, the build succeeded.
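
For anyone who hits the same error: the runs above behaved differently depending on which Caffe2 build Python picked up (`PYTHONPATH=/usr/local`, `PYTHONPATH=.`, or the Docker image), so a quick check of where `caffe2` resolves from and whether `ImageInput` is registered may help narrow it down. The following is only a minimal sketch, not something taken from this thread. The helper names `locate_package` and `operator_registered` are hypothetical, and it assumes `caffe2.python.core.IsOperator` is available in the build being tested.

```python
import importlib
import importlib.util


def locate_package(name):
    """Return the path a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages expose their directory via submodule_search_locations.
    if spec.submodule_search_locations:
        return list(spec.submodule_search_locations)[0]
    return spec.origin


def operator_registered(op_type):
    """Return True/False if caffe2 imports, or None if it cannot be imported.

    Assumption: the installed build provides caffe2.python.core.IsOperator.
    """
    try:
        core = importlib.import_module("caffe2.python.core")
    except Exception:
        return None
    return bool(core.IsOperator(op_type))


if __name__ == "__main__":
    print("caffe2 resolves to:", locate_package("caffe2"))
    print("ImageInput registered:", operator_registered("ImageInput"))
```

Running this with the same `PYTHONPATH` as the failing command should show whether the build you expect is the one actually being imported.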