microsoft / CNTK-FastRCNNDetector

A python implementation for a CNTK Fast-RCNN evaluation client
MIT License

Problem with other model than grocery #1

Open DHOFM opened 7 years ago

DHOFM commented 7 years ago

Hello, thanks for the great code, but I have a problem with it when testing models other than the grocery model. It is the same problem with the similar code published here: https://github.com/Microsoft/CNTK/blob/master/Examples/Image/Detection/FastRCNN/CNTK_FastRCNN_Eval.ipynb

If I load the Pascal model from the original tutorial (pkranen) and evaluate a Pascal image, or use a self-trained model from the Pascal dataset, your code does not detect anything. In the detector no rois_labels_predictions are set; I guess that this is the problem. The pretrained model I tested against is: https://www.cntk.ai/Models/FRCN_Pascal/Fast-RCNN.model Do you have any suggestions on this?

Kind regards,

Dirk

DHOFM commented 7 years ago

Some additional information: debugging into applyNonMaximaSuppression(nmsThreshold, labels, scores, coords), the count of non-zero labels, np.count_nonzero(labels), is 54 in the grocery case and 0 in the Pascal case.
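The check described above can be sketched in isolation (a minimal illustration; the label arrays below are made-up stand-ins for the detector's rois_labels_predictions, where label 0 means background):

```python
import numpy as np

def count_detections(labels):
    """Count ROIs whose predicted label is not background (label 0)."""
    return int(np.count_nonzero(np.asarray(labels)))

# Grocery-like case: several ROIs predicted as foreground classes.
assert count_detections([0, 3, 0, 5, 1]) == 3
# Pascal failure case: every ROI predicted as background, so nothing is drawn.
assert count_detections([0, 0, 0, 0]) == 0
```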

nadavbar commented 7 years ago

Hi @DHOFM, can you please provide an example image that I can run the test on in order to reproduce the issue? I want to understand whether the problem is in the code or in the model. The issue might be with the thresholds, and we may need to find a way to re-adjust them.

Thanks, Nadav

DHOFM commented 7 years ago

@nadavbar Thanks - I tested it with the Pascal images from pkranen's tutorial (the VOCdevkit/VOC2007 set), for example this one:

000008

DHOFM commented 7 years ago

What I have also seen while debugging: after output = self.__model.eval(arguments), the maximum of every ROI's score vector is at index 0, for example:

[ 15.3869276 5.46492958 -5.02364111 3.75026011 1.86460972 -2.08186817 -4.47750282 0.64720851 -5.42951918 3.03713846 -2.08754921 -0.16455238 -6.09574175 -5.07219219 -5.40605736 -5.4328661 -5.57936049 -1.28829587 -0.47886321 0.0 ...
[ 15.57128525 4.56511927 -4.08860397 1.41662586 1.61265123 0.95950609 -3.28608513 -0.03665407 -6.44404221 1.94534063 -3.41620255 1.13755178 -5.05313683 -3.06347942 -4.65259886 -4.36147833 -4.06089687 -1.76999986 -1.57146466 -2.2

and so on...
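The symptom above can be reproduced in a minimal sketch: if index 0 (conventionally the background class) holds the maximum score in every ROI's row, every ROI is classified as background and no detections survive. The score rows below are shortened, made-up versions of the debug output:

```python
import numpy as np

# Hypothetical per-ROI class scores, shortened from the debug dump above;
# column 0 (background) dominates every row.
scores = np.array([
    [15.39, 5.46, -5.02, 3.75],
    [15.57, 4.57, -4.09, 1.42],
])

# The predicted class per ROI is the argmax over the score vector.
predicted = scores.argmax(axis=1)

# Every ROI lands on class 0 (background), so nothing is detected.
assert (predicted == 0).all()
```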

DHOFM commented 7 years ago

@nadavbar Good news :) I think the problem is solved now, by taking the right parameters for the Pascal model. Sometimes it is impossible to see the forest for the trees :)

So here are the old and the new parameters; I got some results and will do some testing now. Constants used for ROI generation, old:

ROI generation

roi_minDimRel = 0.04
roi_maxDimRel = 0.4
roi_minNrPixelsRel = 2 * roi_minDimRel * roi_minDimRel
roi_maxNrPixelsRel = 0.33 * roi_maxDimRel * roi_maxDimRel
roi_maxAspectRatio = 4.0 # maximum aspect ratio of a ROI, vertically and horizontally
roi_maxImgDim = 200 # image size used for ROI generation
ss_scale = 100 # selective search ROIs: parameter controlling cluster size for segmentation
ss_sigma = 1.2 # selective search ROIs: width of gaussian kernel for segmentation
ss_minSize = 20 # selective search ROIs: minimum component size for segmentation
grid_nrScales = 7 # uniform grid ROIs: number of iterations from largest possible ROI to smaller ROIs
grid_aspectRatios = [1.0, 2.0, 0.5] # uniform grid ROIs: aspect ratios of ROIs

new

ROI generation

roi_minDimRel = 0.01 # minimum relative width/height of a ROI
roi_maxDimRel = 1.0 # maximum relative width/height of a ROI
roi_minNrPixelsRel = 0 # minimum relative area covered by ROI
roi_maxNrPixelsRel = 1.0 # maximum relative area covered by ROI
roi_maxAspectRatio = 4.0 # maximum aspect ratio of a ROI, vertically and horizontally
roi_maxImgDim = 200 # image size used for ROI generation
ss_scale = 100 # selective search ROIs: parameter controlling cluster size for segmentation
ss_sigma = 1.2 # selective search ROIs: width of gaussian kernel for segmentation
ss_minSize = 20 # selective search ROIs: minimum component size for segmentation
grid_nrScales = 7 # uniform grid ROIs: number of iterations from largest possible ROI to smaller ROIs
grid_aspectRatios = [1.0, 2.0, 0.5] # uniform grid ROIs: aspect ratios of ROIs

new

roi_minDim = roi_minDimRel * roi_maxImgDim
roi_maxDim = roi_maxDimRel * roi_maxImgDim
roi_minNrPixels = roi_minNrPixelsRel * roi_maxImgDim * roi_maxImgDim
roi_maxNrPixels = roi_maxNrPixelsRel * roi_maxImgDim * roi_maxImgDim

nms_threshold = 0.1 <== old
nms_threshold = 0.3 <== new
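For reference, the role the NMS threshold plays can be seen in a minimal greedy non-maxima suppression sketch (an illustration of the general technique, not the repo's actual applyNonMaximaSuppression implementation): a lower threshold suppresses more of the overlapping boxes.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def nms(boxes, scores, threshold):
    """Greedy NMS: keep the highest-scoring box, drop others whose IoU with it
    exceeds the threshold, repeat on the remainder."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        order = rest[[iou(boxes[i], boxes[j]) <= threshold for j in rest]]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]])
scores = np.array([0.9, 0.8, 0.7])

# Boxes 0 and 1 overlap with IoU ~0.68; box 2 is disjoint.
assert nms(boxes, scores, 0.3) == [0, 2]   # strict: overlapping box dropped
assert nms(boxes, scores, 0.7) == [0, 1, 2]  # loose: all boxes survive
```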

Kind regards,

Dirk

DHOFM commented 7 years ago

So now I have a new problem: evaluating the same image in pkranen's tutorial and in your code, your code finds fewer objects (see example image). I debugged into it and found that the outputs of the same model differ:

Your code: max = 47.8608, min = -23.5214
Original: max = 37.6939, min = -13.6827
Shape and size are the same.
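The comparison described above can be sketched as follows (summarize_diff is a hypothetical helper, not part of either codebase; the arrays stand in for the two eval outputs):

```python
import numpy as np

def summarize_diff(a, b):
    """Compare two model outputs of identical shape: report the max of each
    and whether they are numerically close."""
    a, b = np.asarray(a), np.asarray(b)
    assert a.shape == b.shape, "a shape mismatch would point to a decoding bug"
    return {
        "max_a": float(a.max()),
        "max_b": float(b.max()),
        "close": bool(np.allclose(a, b)),
    }

# Same shape but different values: the interesting case from this thread.
s = summarize_diff([1.0, 2.0], [1.0, 2.5])
assert s["close"] is False and s["max_b"] == 2.5
```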

I ran the tutorial on the DSVM NC6 and your eval on the local machine with a newer CNTK, but I don't think this should be the problem (the model is the same).

Any suggestions? Thanks

003006

Kind regards

Dirk

nadavbar commented 7 years ago

Hi @DHOFM, thanks for finding the first issue. Was it a specific parameter that made the difference? (I saw that you changed the NMS threshold.) If it was, I can add it as a parameter to the API so it will be easier to adjust.

When you say pkranen's code, do you mean the evaluation notebook that you linked in the first post? In that case, this is also my code :)

The code is pretty much the same in both cases, so my two suspicions are:

  1. A difference in the environment, perhaps? (Are you running the non-notebook code on a GPU? It shouldn't matter, but if the environments differ we probably need to understand why.)
  2. Different parameters / region creation. Are you sure that selective search runs for the python non-notebook code?
DHOFM commented 7 years ago

@nadavbar Hi and thanks for the fast response:

The changes from me are described as "new"; the grocery parameters are described as "old" (above). The pkranen code is this: https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Detection/FastRCNN - or at least that is what I thought was from him...

I trained the model on an Azure NC6 with a Tesla K80; the evaluation that differs is from that machine, debugging the scripts A1..., A3..., and so on. The original scripts evaluate the db. Your code is tested on my local machine with an OEM NVIDIA GPU (slower), but GPU support is on and works (tested with nvidia-smi while running your code).

The CNTK on the local machine is newer than the Azure instance's install, but that should not matter; the model is from the Azure machine and loads fine. The local machine runs Windows 7 (the company will migrate to Windows 10 later this year), so not all CNTK features will run, but for evaluation it works. I can test your code on the DSVM as well. Selective search runs on both machines/scripts.

The tested image is the one with the bus and cars; it is number 3006.jpg in the Pascal VOC dataset.

Kind regards,

Dirk.

nadavbar commented 7 years ago

Yes - I only meant that the notebook (https://github.com/Microsoft/CNTK/blob/master/Examples/Image/Detection/FastRCNN/CNTK_FastRCNN_Eval.ipynb) is from me :) How are you evaluating the image in the first case (which is not the code in this repo)? Are you using it as part of the test set and then viewing the output with the python scripts in the example dir?

DHOFM commented 7 years ago

@nadavbar Hi, yes - the original scripts are run, and then I compared the stored eval from the corresponding file that is loaded in nnPredict (cntk_helpers). It differs from the output evaluated in your frcnnDetector: output = self.__model.eval(arguments).

Let me attach 3 files. First, your result (I did not plot the background labels):

newout

Now the original eval (from B3_Visual...): 1495003006

And last, the zipped numpy array that is loaded in cntk_helpers as I described:

1495.dat.zip

NOTE: I use my trained model from the NC6 machine, not the pretrained one you can download. If you need the model, please let me know and I will store it somewhere or send it by mail.

Kind regards,

Dirk

DHOFM commented 7 years ago

@nadavbar Hi again, it seems that I have now found the reason for the problem. I always thought all ROIs are created in the same way, only varying the parameters, but for the Pascal VOC dataset Microsoft has its own class in CNTK called pascal_voc. In this class, pre-computed ROIs are loaded from ..\Datasets\Pascal\selective_search_data and then recalculated for use in the CNN. You also find these Matlab files in general discussions about R-CNNs. I wish I had seen this before :) Anyway, for evaluation in your code we need the proper parameters for ROI generation. Using the Matlab files, the original code gets a lot more regions.
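The "recalculated for use in the CNN" step above typically means converting the box convention: Matlab stores 1-based [y1, x1, y2, x2] rows, while the Python side wants 0-based [x1, y1, x2, y2]. A minimal sketch of that conversion (matlab_boxes_to_xyxy is a hypothetical helper; the exact convention the pascal_voc reader uses should be checked against its source):

```python
import numpy as np

def matlab_boxes_to_xyxy(boxes):
    """Convert precomputed selective-search boxes from the Matlab convention
    (1-based [y1, x1, y2, x2]) to 0-based [x1, y1, x2, y2]:
    swap the coordinate pairs, then shift to 0-based indexing."""
    boxes = np.asarray(boxes, dtype=np.float64)
    return boxes[:, [1, 0, 3, 2]] - 1

# A 1-based Matlab box spanning rows 1..100 and columns 1..200
# becomes a 0-based box from (0, 0) to (199, 99).
converted = matlab_boxes_to_xyxy([[1, 1, 100, 200]])
assert converted.tolist() == [[0.0, 0.0, 199.0, 99.0]]
```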

Kind regards,

Dirk

DHOFM commented 7 years ago

Some results from changing the parameters:

  1. original: maxImgDim = 200, ss_scale = 100
  2. maxImgDim = 400, ss_scale = 100
  3. maxImgDim = 400, ss_scale = 10

I only plotted the regions after calling selective search, not the added grid ROIs (use_grid_rois):

results
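The effect of maxImgDim comes from the resize applied before selective search runs: the longer image side is scaled down to maxImgDim, so doubling it lets the segmentation see finer structure and emit more regions. A minimal sketch (roi_scale_factor is a hypothetical helper illustrating the roi_maxImgDim parameter, not code from the repo):

```python
def roi_scale_factor(img_width, img_height, max_img_dim):
    """Scale factor so that the longer image side equals max_img_dim
    before region generation runs."""
    return float(max_img_dim) / max(img_width, img_height)

# A 500x375 Pascal VOC image: maxImgDim = 400 keeps twice the resolution
# that maxImgDim = 200 does, so selective search finds more regions.
assert roi_scale_factor(500, 375, 200) == 0.4
assert roi_scale_factor(500, 375, 400) == 0.8
```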

Kind regards,

Dirk

nadavbar commented 7 years ago

Thanks for finding the root cause! I was leaning towards the ROIs as the root source of the difference between the results.

If I understand correctly, the current limitation is that you need to change the parameters for region creation in the code itself, right? I'll add to the backlog the need to expose those in a more API-friendly way.

DHOFM commented 7 years ago

Hi @nadavbar

I was leaning towards the ROIs as the root source of difference between the results

Yes - that gave me the input for debugging into that, thanks :)

If I understand the current limitation is that you need to change the parameters for region creation in the code itself right? I'll add to the backlog the need to add those in a more API-friendly way.

No - my problem is that I did not recognize that in the case of the Pascal VOC data, the regions are loaded from the Matlab files. I am working on a car-finding CNN for a showcase. The Pascal VOC data is labeled; if I created car data on my own, I would need to label the training data myself, so I would like to use Pascal VOC for training and evaluating. My problem: because the regions are read from the Matlab files, I do not know how the regions should be calculated when I evaluate a new picture to get the best results. In other words, documentation of the parameter settings for the region detection that was used for the Pascal VOC data would be fine.

Kind regards,

Dirk

pkranen commented 7 years ago

Hi. As you mentioned, the Pascal ROIs are precomputed; they are not created using our code. They originate from the original paper's repo (now at https://dl.dropboxusercontent.com/s/orrt7o6bp6ae0tc/selective_search_data.tgz?dl=0, CHECKSUM 7078c1db87a7851b31966b96774cd9b9), and we do not know the parameters that were used to create them.