facebookarchive / caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework.
https://caffe2.ai
Apache License 2.0

Segmentation Fault while running caffe2 example from docs #1364

Open edgarkhanzadian opened 6 years ago

edgarkhanzadian commented 6 years ago

I'm trying to run the example code from https://caffe2.ai/docs/tutorial-loading-pre-trained-models.html, but it fails with Segmentation fault: 11 when I call p.run([img]).

# where you installed caffe2. Probably '~/caffe2' or '~/src/caffe2'.
CAFFE2_ROOT = "/usr/local/caffe2"
# assumes being a subdirectory of caffe2
CAFFE_MODELS = "/usr/local/caffe2/python/models"
# if you have a mean file, place it in the same dir as the model

from caffe2.proto import caffe2_pb2
import numpy as np
import skimage.io
import skimage.transform
from matplotlib import pyplot
import os
from caffe2.python import core, workspace
import urllib2
print("Required modules imported.")

IMAGE_LOCATION =  "https://cdn.pixabay.com/photo/2015/02/10/21/28/flower-631765_1280.jpg"

# What model are we using? You should have already converted or downloaded one.
# format below is the model's:
# folder, INIT_NET, predict_net, mean, input image size
# you can switch the comments on MODEL to try out different model conversions
MODEL = 'squeezenet', 'init_net.pb', 'predict_net.pb', 'ilsvrc_2012_mean.npy', 227

# codes - these help decipher the output: they map AlexNet's object codes to a readable result like "tabby cat" or "lemon", depending on what's in the picture you submit to the neural network.
# The list of output codes for the AlexNet models (also squeezenet)
codes =  "https://gist.githubusercontent.com/aaronmarkham/cd3a6b6ac071eca6f7b4a6e40e6038aa/raw/9edb4038a37da6b5a44c3b5bc52e448ff09bfe5b/alexnet_codes"
print "Config set!"

def crop_center(img,cropx,cropy):
    y,x,c = img.shape
    startx = x//2-(cropx//2)
    starty = y//2-(cropy//2)    
    return img[starty:starty+cropy,startx:startx+cropx]

def rescale(img, input_height, input_width):
    print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!")
    print("Model's input shape is %dx%d") % (input_height, input_width)
    aspect = img.shape[1]/float(img.shape[0])
    print("Orginal aspect ratio: " + str(aspect))
    if(aspect>1):
        # landscape orientation - wide image
        res = int(aspect * input_height)
        imgScaled = skimage.transform.resize(img, (input_width, res))
    if(aspect<1):
        # portrait orientation - tall image
        res = int(input_width/aspect)
        imgScaled = skimage.transform.resize(img, (res, input_height))
    if(aspect == 1):
        imgScaled = skimage.transform.resize(img, (input_width, input_height))
    pyplot.figure()
    pyplot.imshow(imgScaled)
    pyplot.axis('on')
    pyplot.title('Rescaled image')
    print("New image shape:" + str(imgScaled.shape) + " in HWC")
    return imgScaled
print "Functions set."

# set paths and variables from model choice and prep image
CAFFE2_ROOT = os.path.expanduser(CAFFE2_ROOT)
CAFFE_MODELS = os.path.expanduser(CAFFE_MODELS)

# mean can be 128 or custom based on the model
# gives better results to remove the colors found in all of the training images
MEAN_FILE = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[3])
if not os.path.exists(MEAN_FILE):
    mean = 128
else:
    mean = np.load(MEAN_FILE).mean(1).mean(1)
    mean = mean[:, np.newaxis, np.newaxis]
print "mean was set to: ", mean

# some models were trained with different image sizes, this helps you calibrate your image
INPUT_IMAGE_SIZE = MODEL[4]

# make sure all of the files are around...
if not os.path.exists(CAFFE2_ROOT):
    print("Houston, you may have a problem.")
INIT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[1])
print 'INIT_NET = ', INIT_NET
PREDICT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[2])
print 'PREDICT_NET = ', PREDICT_NET
if not os.path.exists(INIT_NET):
    print(INIT_NET + " not found!")
else:
    print "Found ", INIT_NET, "...Now looking for", PREDICT_NET
    if not os.path.exists(PREDICT_NET):
        print "Caffe model file, " + PREDICT_NET + " was not found!"
    else:
        print "All needed files found! Loading the model in the next block."

# load and transform image
img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32)
img = rescale(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
img = crop_center(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
print "After crop: " , img.shape
pyplot.figure()
pyplot.imshow(img)
pyplot.axis('on')
pyplot.title('Cropped')

# switch to CHW
img = img.swapaxes(1, 2).swapaxes(0, 1)
pyplot.figure()
for i in range(3):
    # For some reason, pyplot subplot follows Matlab's indexing
    # convention (starting with 1). Well, we'll just follow it...
    pyplot.subplot(1, 3, i+1)
    pyplot.imshow(img[i])
    pyplot.axis('off')
    pyplot.title('RGB channel %d' % (i+1))

# switch to BGR
img = img[(2, 1, 0), :, :]

# remove mean for better results
img = img * 255 - mean

# add batch size
img = img[np.newaxis, :, :, :].astype(np.float32)
print "NCHW: ", img.shape

# .pb files are binary protobufs, so open them in binary mode
with open(INIT_NET, 'rb') as f:
    init_net = f.read()
with open(PREDICT_NET, 'rb') as f:
    predict_net = f.read()

p = workspace.Predictor(init_net, predict_net)

# run the net and return prediction
results = p.run([img])

# # turn it into something we can play with and examine which is in a multi-dimensional array
# results = np.asarray(results)
# print "results shape: ", results.shape

This is the log:

WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
WARNING:root:Debug message: No module named caffe2_pybind11_state_gpu
Required modules imported.
Config set!
Functions set.
mean was set to:  128
INIT_NET =  /usr/local/caffe2/python/models/squeezenet/init_net.pb
PREDICT_NET =  /usr/local/caffe2/python/models/squeezenet/predict_net.pb
Found  /usr/local/caffe2/python/models/squeezenet/init_net.pb ...Now looking for /usr/local/caffe2/python/models/squeezenet/predict_net.pb
All needed files found! Loading the model in the next block.
Original image shape:(751, 1280, 3) and remember it should be in H, W, C!
Model's input shape is 227x227
Orginal aspect ratio: 1.70439414115
/usr/local/anaconda3/lib/python2.7/site-packages/skimage/transform/_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
  warn("The default mode, 'constant', will be changed to 'reflect' in "
New image shape:(227, 386, 3) in HWC
After crop:  (227, 227, 3)
NCHW:  (1, 3, 227, 227)
Segmentation fault: 11

What I've done before:

python -m caffe2.python.models.download -i squeezenet

My computer configs:

OS: macOS High Sierra 10.13
which python: /usr/local/anaconda3/bin/python
python --version: Python 2.7.14 :: Anaconda custom (x86_64)

Thanks in advance!

Maratyszcza commented 6 years ago

Could you run the Python script under lldb and get a stack trace (thread backtrace) after the segfault?

edgarkhanzadian commented 6 years ago

@Maratyszcza thanks for your response! Sorry, but I've never used lldb with Python before. Can you show me how to do that, please? I'd really appreciate it!

Maratyszcza commented 6 years ago

  1. On the command line, run lldb python to start Python under the lldb debugger.
  2. After the lldb prompt shows up (in a few seconds), type run your-python-script.py <arguments for python script>.
  3. When Python (with Caffe2 inside) hits a segmentation fault, it will not exit but will drop into the lldb command line instead. There, type thread backtrace. lldb will print details about where the crash happened. Post them here.
  4. Type quit to close the lldb session.

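
For reference, the interactive steps above can also be scripted. The following is a minimal sketch, not something posted in this thread: it assumes lldb is on the PATH and uses lldb's batch options (--batch, -o to run a command after loading, -k to run a command only if the process crashes). The helper names build_lldb_command and run_under_lldb are hypothetical.

```python
import subprocess


def build_lldb_command(script, script_args=()):
    """Return an argv list that runs `python script` under lldb in batch mode.

    Hypothetical helper for illustration; the script name and its arguments
    are whatever you would normally pass to python.
    """
    return [
        "lldb", "--batch",
        "-o", "run",               # start the process once lldb has loaded it
        "-k", "thread backtrace",  # executed only if the process crashes
        "--", "python", script,
    ] + list(script_args)


def run_under_lldb(script, script_args=()):
    """Launch lldb with the command above and return its exit code."""
    return subprocess.call(build_lldb_command(script, script_args))


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try anywhere.
    print(" ".join(build_lldb_command("your-python-script.py")))
```

Calling run_under_lldb("your-python-script.py") would then print the backtrace without an interactive session, provided the crash reproduces under the debugger.
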
edgarkhanzadian commented 6 years ago

@Maratyszcza here is the output at the point where the segfault occurs:

Process 27782 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x000000010b59017e libcaffe2.dylib`caffe2::Tensor<caffe2::CPUContext>::dim32(int) const + 30
libcaffe2.dylib`caffe2::Tensor<caffe2::CPUContext>::dim32:
->  0x10b59017e <+30>: cmpq   $0x7fffffff, (%rcx,%rax,8) ; imm = 0x7FFFFFFF 
    0x10b590186 <+38>: jge    0x10b59019a               ; <+58>
    0x10b590188 <+40>: movq   0x8(%rbx), %rcx
    0x10b59018c <+44>: movl   (%rcx,%rax,8), %eax
Target 0: (python) stopped.

And after thread backtrace:

(lldb) thread backtrace
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x000000010b59017e libcaffe2.dylib`caffe2::Tensor<caffe2::CPUContext>::dim32(int) const + 30
    frame #1: 0x000000010b5e0da1 libcaffe2.dylib`caffe2::ConvOp<float, caffe2::CPUContext>::RunOnDeviceWithOrderNCHW() + 113
    frame #2: 0x000000010b5c26a6 libcaffe2.dylib`caffe2::ConvPoolOpBase<caffe2::CPUContext>::RunOnDevice() + 134
    frame #3: 0x000000010b57dd01 libcaffe2.dylib`caffe2::Operator<caffe2::CPUContext>::Run(int) + 129
    frame #4: 0x000000010b54882a libcaffe2.dylib`caffe2::SimpleNet::RunAsync() + 650
    frame #5: 0x000000010b57920f libcaffe2.dylib`caffe2::Workspace::RunNet(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 79
    frame #6: 0x000000010b5667aa libcaffe2.dylib`caffe2::Predictor::run(std::__1::vector<caffe2::Tensor<caffe2::CPUContext>*, std::__1::allocator<caffe2::Tensor<caffe2::CPUContext>*> > const&, std::__1::vector<caffe2::Tensor<caffe2::CPUContext>*, std::__1::allocator<caffe2::Tensor<caffe2::CPUContext>*> >*) + 138
    frame #7: 0x000000010b47fb34 caffe2_pybind11_state.so`void pybind11::cpp_function::initialize<caffe2::python::addObjectMethods(pybind11::module&)::$_31, std::__1::vector<pybind11::object, std::__1::allocator<pybind11::object> >, caffe2::Predictor&, std::__1::vector<pybind11::object, std::__1::allocator<pybind11::object> >, pybind11::name, pybind11::sibling, pybind11::is_method>(caffe2::python::addObjectMethods(pybind11::module&)::$_31&&, std::__1::vector<pybind11::object, std::__1::allocator<pybind11::object> > (*)(caffe2::Predictor&, std::__1::vector<pybind11::object, std::__1::allocator<pybind11::object> >), pybind11::name const&, pybind11::sibling const&, pybind11::is_method const&)::'lambda'(pybind11::detail::function_record*, pybind11::handle, pybind11::handle, pybind11::handle)::__invoke(pybind11::detail::function_record*, pybind11::handle, pybind11::handle, pybind11::handle) + 340
    frame #8: 0x000000010b468065 caffe2_pybind11_state.so`pybind11::cpp_function::dispatcher(_object*, _object*, _object*) + 837
    frame #9: 0x000000010015d6a7 libpython2.7.dylib`PyEval_EvalFrameEx + 22391
    frame #10: 0x0000000100157cd4 libpython2.7.dylib`PyEval_EvalCodeEx + 2164
    frame #11: 0x0000000100157452 libpython2.7.dylib`PyEval_EvalCode + 34
    frame #12: 0x0000000100184e3d libpython2.7.dylib`PyRun_FileExFlags + 157
    frame #13: 0x0000000100184980 libpython2.7.dylib`PyRun_SimpleFileExFlags + 816
    frame #14: 0x000000010019b572 libpython2.7.dylib`Py_Main + 3506
    frame #15: 0x00007fff7f14b145 libdyld.dylib`start + 1
    frame #16: 0x00007fff7f14b145 libdyld.dylib`start + 1
(lldb) 

Thanks in advance!

apatsekin commented 6 years ago

Got the same error while trying to reproduce this example: https://caffe2.ai/docs/tutorial-loading-pre-trained-models.html

Thread 1 "python" received signal SIGSEGV, Segmentation fault.

0x00007fffc113c32b in caffe2::Tensor<caffe2::CPUContext>::dim32(int) const () from /usr/local/lib/libcaffe2.so

(gdb) where

#0  0x00007fffc113c32b in caffe2::Tensor<caffe2::CPUContext>::dim32(int) const () from /usr/local/lib/libcaffe2.so
#1  0x00007fffc12788db in caffe2::ConvOp<float, caffe2::CPUContext>::RunOnDeviceWithOrderNCHW() () from /usr/local/lib/libcaffe2.so
#2  0x00007fffc113c1f2 in caffe2::ConvPoolOpBase<caffe2::CPUContext>::RunOnDevice() () from /usr/local/lib/libcaffe2.so
#3  0x00007fffc111ba9a in caffe2::Operator<caffe2::CPUContext>::Run(int) () from /usr/local/lib/libcaffe2.so
#4  0x00007fffc11e0fe9 in caffe2::SimpleNet::DoRunAsync() () from /usr/local/lib/libcaffe2.so
#5  0x00007fffc11a7c2b in caffe2::NetBase::RunAsync() () from /usr/local/lib/libcaffe2.so
#6  0x00007fffc1166fa2 in caffe2::Workspace::RunNet(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/local/lib/libcaffe2.so
#7  0x00007fffc1183170 in caffe2::Predictor::run(std::vector<caffe2::Tensor<caffe2::CPUContext>*, std::allocator<caffe2::Tensor<caffe2::CPUContext>*> > const&, std::vector<caffe2::Tensor<caffe2::CPUContext>*, std::allocator<caffe2::Tensor<caffe2::CPUContext>*> >*) () from /usr/local/lib/libcaffe2.so
#8  0x00007fffc1ad6b02 in void pybind11::cpp_function::initialize<caffe2::python::addObjectMethods(pybind11::module&)::{lambda(caffe2::Predictor&, std::vector<pybind11::object, std::allocator<pybind11::object> >)#31}, std::vector<pybind11::object, std::allocator<pybind11::object> >, caffe2::Predictor&, std::vector<pybind11::object, std::allocator<pybind11::object> >, pybind11::name, pybind11::is_method, pybind11::sibling>(caffe2::python::addObjectMethods(pybind11::module&)::{lambda(caffe2::Predictor&, std::vector<pybind11::object, std::allocator<pybind11::object> >)#31}&&, std::vector<pybind11::object, std::allocator<pybind11::object> > (*)(caffe2::Predictor&, std::vector<pybind11::object, std::allocator<pybind11::object> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) ()
   from /opt/caffe2/build/caffe2/python/caffe2_pybind11_state.so
#9  0x00007fffc1b01c83 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) () from /opt/caffe2/build/caffe2/python/caffe2_pybind11_state.so
#10 0x00000000004cada2 in PyEval_EvalFrameEx ()
#11 0x00000000004c2765 in PyEval_EvalCodeEx ()
#12 0x00000000004c2509 in PyEval_EvalCode ()
#13 0x00000000004f1def in ?? ()
#14 0x00000000004ec652 in PyRun_FileExFlags ()
#15 0x00000000004eae31 in PyRun_SimpleFileExFlags ()
#16 0x000000000049e14a in Py_Main ()
#17 0x00007ffff7810830 in __libc_start_main (main=0x49dab0 <main>, argc=2, argv=0x7fffffffe5c8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fffffffe5b8) at ../csu/libc-start.c:291
#18 0x000000000049d9d9 in _start ()

Environment: Ubuntu 16.04.3 LTS, Python 2.7.12, Caffe2 built according to the instructions on the official website.

keithmgould commented 6 years ago

Same issue:

keith@machine:~/Documents/learn_machine_learning/caffe2/experiments$ python squeezer.py
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
WARNING:root:Debug message: No module named caffe2_pybind11_state_gpu
Required modules imported.
Config set!
Functions set.
mean was set to:  128
INIT_NET =  /Users/keith/Documents/learn_machine_learning/caffe2/caffe2/build/caffe2/python/models/squeezenet/init_net.pb
PREDICT_NET =  /Users/keith/Documents/learn_machine_learning/caffe2/caffe2/build/caffe2/python/models/squeezenet/predict_net.pb
Found  /Users/keith/Documents/learn_machine_learning/caffe2/caffe2/build/caffe2/python/models/squeezenet/init_net.pb ...Now looking for /Users/keith/Documents/learn_machine_learning/caffe2/caffe2/build/caffe2/python/models/squeezenet/predict_net.pb
All needed files found! Loading the model in the next block.
Original image shape:(751, 1280, 3) and remember it should be in H, W, C!
Model's input shape is 227x227
Orginal aspect ratio: 1.70439414115
/usr/local/lib/python2.7/site-packages/skimage/transform/_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
  warn("The default mode, 'constant', will be changed to 'reflect' in "
2017-11-30 15:36:17.607 Python[82945:8404154] ApplePersistenceIgnoreState: Existing state will not be touched. New state will be written to /var/folders/4c/06q896nj59n90klmblw2bmc00000gn/T/org.python.python.savedState
New image shape:(227, 386, 3) in HWC
After crop:  (227, 227, 3)
NCHW:  (1, 3, 227, 227)
Segmentation fault: 11

Environment: macOS 10.12.5, Python 2.7.9, Caffe2 built from source.

keithmgould commented 6 years ago

Same issue on Raspbian as well.

goldsborough commented 6 years ago

A workaround is to use one of the other available pre-trained models. For example, bvlc_alexnet or bvlc_googlenet work. Just download them and change the folder name in the MODEL tuple.
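
As a rough sketch of that change, assuming the same directory layout as in the original post: only the first field of the MODEL tuple (the folder name) is swapped. The other fields, including the input size of 227, are carried over unchanged from the squeezenet tuple and are assumptions here; check what the chosen model actually expects. The model itself would be fetched the same way as before, with the name swapped in: python -m caffe2.python.models.download -i bvlc_alexnet

```python
import os

# Path taken from the original post; adjust to your own install.
CAFFE_MODELS = "/usr/local/caffe2/python/models"

# folder, INIT_NET, predict_net, mean, input image size
# Only the folder name differs from the squeezenet tuple; the remaining
# fields are assumed to be the same for this model.
MODEL = 'bvlc_alexnet', 'init_net.pb', 'predict_net.pb', 'ilsvrc_2012_mean.npy', 227

INIT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[1])
PREDICT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[2])

print("INIT_NET = " + INIT_NET)
print("PREDICT_NET = " + PREDICT_NET)
```

The rest of the script stays as it is, since it derives every path from MODEL.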

pietern commented 6 years ago

@bddppq Could this be related to the system/brew/conda Python runtime mismatch that we saw with ONNX a few weeks ago?

bddppq commented 6 years ago

@pietern No, the stack trace shows that the segfault happens in our C++ code (more specifically ConvOp::RunOnDeviceWithOrderNCHW), not at the Python or pybind11 level.

ANSHUMAN87 commented 6 years ago

I also got this error and resolved it. The issue is actually with the model files.

Download the models from the link below, and then it will work fine: https://github.com/caffe2/models/tree/master/squeezenet

Hope it will solve your problem. :)
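
The thread does not say what exactly was wrong with the original model files, so the following is only a heuristic sketch for checking a download before passing it to workspace.Predictor. It assumes that a bad download would show up as a missing file, an empty file, or an HTML page saved under a .pb name; the function name describe_model_file is hypothetical, and a file that passes can still be an incompatible model.

```python
import os


def describe_model_file(path):
    """Return a short diagnosis string for a downloaded .pb file.

    Heuristic only: it catches missing, empty, and HTML-looking files,
    and says nothing about whether the protobuf contents are valid.
    """
    if not os.path.exists(path):
        return "missing"
    size = os.path.getsize(path)
    if size == 0:
        return "empty"
    with open(path, "rb") as f:
        head = f.read(64).lstrip().lower()
    if head.startswith(b"<!doctype") or head.startswith(b"<html"):
        return "looks like HTML, not a protobuf"
    return "ok (%d bytes)" % size


if __name__ == "__main__":
    # Placeholder paths, as in the snippets elsewhere in this thread.
    for name in ("/path/to/downloaded/exec_net.pb",
                 "/path/to/downloaded/predict_net.pb"):
        print(name + ": " + describe_model_file(name))
```
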

CriCL commented 6 years ago

Hi @ANSHUMAN87, can you share the code that solved the issue?

Regards

flanaras commented 6 years ago

@CriCL I can also confirm that it works that way. Here is my source code, as you requested. At the top of the gist you can see my setup and how to update the model. https://gist.github.com/flanaras/97f344e7fa640dcb68607cabebe7e525

Hope this helps.

ANSHUMAN87 commented 6 years ago

Hi @CriCL, sorry for the late reply. You can follow the tutorial at https://github.com/caffe2/caffe2/blob/master/caffe2/python/tutorials/Loading_Pretrained_Models.ipynb

Below is a snapshot of my code:

from caffe2.python import workspace
import helpers  # helpers.py from the caffe2 tutorials directory

# initialize the neural net
INIT_NET = '/path/to/downloaded/exec_net.pb'
PREDICT_NET = '/path/to/downloaded/predict_net.pb'
IMAGE_LOCATION = '/path/to/Pretzel.jpg'
IMAGE_SIZE = 227
MEAN = 128

IMAGE_INPUT = helpers.loadToNCHW(IMAGE_LOCATION, MEAN, IMAGE_SIZE)
with open(INIT_NET, 'rb') as f:
    init_net = f.read()
with open(PREDICT_NET, 'rb') as f:
    predict_net = f.read()

p = workspace.Predictor(init_net, predict_net)

# run the net and return prediction
results = p.run([IMAGE_INPUT])

NOTE: Download the pretrained models from the link I shared in my previous post, and adjust the global variables to match the directories on your system.

You can find the implementation of helpers.loadToNCHW() at https://github.com/caffe2/caffe2/blob/master/caffe2/python/tutorials/helpers.py

Hope it helps! If you run into any other issue, please post it and I will try to help.

CriCL commented 6 years ago

Thanks @ANSHUMAN87 @flanaras .

It worked pretty well.

Do you guys have any tutorial to implement AlexNet with Python + Caffe2?

Thanks for your help, Regards