philkr / lpo

Implementation of the CVPR 2015 paper: Learning to propose objects

Trying to figure out the output... #18

Closed ghost closed 8 years ago

ghost commented 9 years ago

Hey,

I've been trying to get a small bounding-box example working, based on your propose_hf5.py code.

Here is my code:

import lpo
import matplotlib.patches
import matplotlib.pyplot as plt

imgs = [ lpo.imgproc.imread('cat.jpg') ]

prop = lpo.proposals.LPO()
prop.load( 'dats/lpo_VOC_0.02.dat' )

detector = lpo.contour.MultiScaleStructuredForest()
detector.load( 'dats/sf.dat' )

over_segs = lpo.segmentation.generateGeodesicKMeans( detector, imgs, 1000 )

props = prop.propose( over_segs, 0.01, True )

props = props[0][0]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.imshow(imgs[0])
for bb in props.toBoxes():
    ax.add_patch(matplotlib.patches.Rectangle((bb[0], bb[1]), bb[2], bb[3], color='red', fill=False))
plt.show()

I end up with: [screenshot of the resulting boxes drawn over the cat image]

If I play around with some of the parameters, I end up getting an enormous amount of proposal boxes.

I was hoping somebody could provide some advice to help me get this working.

Mona77 commented 9 years ago

Hi,

I have been able to use the above code to display the bounding boxes. Could you please share the details of how to display the segments on the image (similar to Figure 3 in your paper)?

Thanks in advance!

philkr commented 9 years ago

prop.propose returns a vector of Proposals objects, one computed by each of the different models. Each Proposals object contains a superpixel segmentation Proposals.s (a 2d image of integers) and binary masks describing the proposals Proposals.p (at the superpixel level). To get pixelwise proposal masks, simply evaluate Proposals.p[Proposals.s] from Python. Note, however, that the resulting array might be quite large...
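For example, a quick way to see what each model produced (a sketch, reusing props from the snippet above):

for k, p in enumerate(props[0]):
    # p.s: (H, W) superpixel ids; p.p: (num_proposals, num_superpixels) booleans
    print('model %d: s %s, p %s' % (k, p.s.shape, p.p.shape))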

ghost commented 9 years ago

@philkr thanks for the reply! Just looking to better understand something, using the same image and code as above...

props = prop.propose( over_segs, 0.01, True )

props is a lpo.proposals.VecVecProposals object. I'll assume the first index corresponds to the images passed into the lpo.segmentation.generateGeodesicKMeans function. So for each image, as you say, we have a vector of Proposals objects. Looking at each of these objects:

for i in props[0]:
    print i.s.shape, i.p.shape

... output:
(360, 480) (30, 1020)
(360, 480) (60, 260)
(360, 480) (22, 106)
(360, 480) (3, 45)
(360, 480) (1, 25)
(360, 480) (0, 8)
(360, 480) (20, 181)
(360, 480) (2, 45)
(360, 480) (0, 21)

These shapes do not match, so trying to do a Proposals.p[Proposals.s] operation is giving me an index out-of-bounds error. If it's worth anything, the original image is 360 x 480.

philkr commented 9 years ago

That is my bad. It should be Proposals.p[:,Proposals.s].
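For example, reusing props from the snippets above (pixel_masks is just my name for the result):

for prop in props[0]:
    # prop.s: (H, W) superpixel ids; prop.p: (num_proposals, num_superpixels) booleans
    pixel_masks = prop.p[:, prop.s]   # shape (num_proposals, H, W)
    print(pixel_masks.shape)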

migimimi commented 9 years ago

Hello, sorry to disturb the conversation, but I have a question. I tried to run propose_hf5.py or @paulinder's code on Ubuntu, but it failed with this error:

AssertionError: Assertion "n == PROP_MAGIC" failed in /root/lpo/lib/proposals/lpo.cpp:61

if you all have any suggestion, any help would be appreciated. Thanks in advance.

ghost commented 9 years ago

@philkr Apologies for the non-response. Thanks for the clarification - it was helpful!

@migimimi I did not run into that issue. Upon looking at the code, I'm not certain what PROP_MAGIC represents:

static const int PROP_MAGIC = 0x960902;

void LPO::load( std::istream & is ) {
    int n = 0;
    // Read the leading 4-byte magic number and check it
    is.read( (char*)&n, sizeof(n) );
    eassert( n == PROP_MAGIC );
    // Read the number of models, then load each one
    n = 0;
    is.read( (char*)&n, sizeof(n) );
    models_.clear();
    for( int i=0; i<n; i++ )
        models_.push_back( loadLPOModel(is) );
}

My guess is that PROP_MAGIC is just a magic number used as a sanity check on the model file: load() reads the first sizeof(int) bytes and asserts they equal 0x960902. If that assertion fails, the .dat file you are loading is probably corrupt, truncated, or not an LPO model file at all. I would re-download the model files and double-check the path you are passing to prop.load().
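If you want to check what is actually in the file header, a quick sketch (assuming a little-endian machine and the same model path as above):

import struct

with open('dats/lpo_VOC_0.02.dat', 'rb') as f:
    (magic,) = struct.unpack('<i', f.read(4))
print(hex(magic))  # should print 0x960902 if the header is intact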

pradeepj247 commented 8 years ago

@philkr @paulinder

I tried the simplified code snippet provided in this thread by @paulinder, and I was able to get a list of region proposals when I do this:

for i in props[0]:
    print i.s.shape, i.p.shape

I actually have 2 questions:

  1. It took some time to load the trained model, then it ran the k-means step and produced the output; the whole thing took about 5 secs. Is this normal? Will it speed up (per image) if I process more images?
  2. In an earlier post in this thread, @paulinder mentions that the shapes do not match with this code snippet. How do I interpret the box proposals I have got? What exactly do I get with Proposals.p[:,Proposals.s]?

ghost commented 8 years ago

Just answering this from memory....

To answer the first question: that sounds fairly normal. Try timing just the per-image proposal step, without the model loading... I think it should take 1-2 seconds for a single image.
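For example, to time just the proposal step (a rough sketch, reusing the names from my earlier snippet):

import time

t0 = time.time()
over_segs = lpo.segmentation.generateGeodesicKMeans( detector, imgs, 1000 )
props = prop.propose( over_segs, 0.01, True )
print('proposal time: %.2fs' % (time.time() - t0))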

To answer the second question:

# Iterate over the Proposals objects.
# Index [0] is used here because, in my example, only one image was processed.
for prop in props[0]:
    # prop is a Proposals object.
    # segmentation_masks stacks one boolean "mask" per proposal; each mask should
    #     be the same size as the input image, I think. It contains bool values for
    #     each pixel, indicating whether that pixel is part of the object proposal
    #     or not. Use a single mask to select one object proposal,
    #     e.g. segmented_object = image[segmentation_masks[0]]
    segmentation_masks = prop.p[:, prop.s]
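To make that concrete, something like this should display a single proposal (a rough sketch; I'm assuming np.array() gives a standard HxWx3 array for an lpo image):

import numpy as np
import matplotlib.pyplot as plt

prop = props[0][0]                  # first Proposals object of the first image
masks = prop.p[:, prop.s]           # (num_proposals, H, W) boolean masks
overlay = np.array(imgs[0]).copy()  # assumption: lpo images convert to numpy arrays
overlay[~masks[0]] = 0              # black out everything outside proposal 0
plt.imshow(overlay)
plt.show()
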
philkr commented 8 years ago

A small addition about the timing issues: make sure you build the project in release mode ( CMAKE_BUILD_TYPE=Release ) to use all compiler optimizations.


pradeepj247 commented 8 years ago

Thanks @paulinder and @philkr :+1:

The overall objective is to generate a trainval.mat-type file that can be used in fast-rCNN.

Is there any standard example or function in this package to generate that?