TobyPDE / FRRN

Full Resolution Residual Networks for Semantic Image Segmentation
MIT License

Running predict without Cityscapes #22

Open Tetsujinfr opened 7 years ago

Tetsujinfr commented 7 years ago

First, thanks for the new update. I do not have access to Cityscapes. How can I use the pretrained models on other images? I am not sure how to transform the pictures or how to feed them to the net properly.

thanks

Tets

TobyPDE commented 7 years ago

Hey,

You can feed any image into the network as follows:

BBarbosa commented 7 years ago

Hi @TobyPDE! I tried to follow the steps you mentioned here, but the predictions only show 2 classes (drawn in light pink and black). (attached image: segmentation)

I also printed the output prediction matrix and, from what I understand, the segmented image should be painted with more colors than that, since the matrix contains more than two class indices.

[[[2 2 2 ..., 3 2 3]
  [2 2 2 ..., 3 2 3]
  [2 2 2 ..., 3 3 3]
  ...,
  [9 9 9 ..., 0 0 0]
  [8 9 9 ..., 0 0 0]
  [0 0 9 ..., 0 0 0]]]

Edit

I made some changes to the create_color_label_image function in dltools/utility.py, and that solved my problem.
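For anyone hitting the same symptom, here is a minimal sketch of how a matrix of class indices like the one above can be turned into a color image. It is not the repository's create_color_label_image; the function name colorize_labels and the PALETTE values are hypothetical and chosen only so that every class gets a distinct color.

```python
import numpy as np

NUM_CLASSES = 19  # assumption: number of classes the model predicts

# Hypothetical palette: one distinct RGB color per class index.
# These are NOT the colors used by the FRRN repository.
PALETTE = np.array(
    [[(37 * i) % 256, (91 * i) % 256, (173 * i) % 256] for i in range(NUM_CLASSES)],
    dtype=np.uint8,
)


def colorize_labels(labels, palette=PALETTE, void_color=(255, 255, 255)):
    """Map an integer label map to an (H, W, 3) uint8 RGB image."""
    labels = np.asarray(labels).astype(np.int64)
    # Accept a (1, H, W) prediction by dropping the leading axis.
    if labels.ndim == 3 and labels.shape[0] == 1:
        labels = labels[0]
    if labels.ndim != 2:
        raise ValueError("expected a label map of shape (H, W) or (1, H, W)")

    # Start with the void color everywhere, then paint the valid classes.
    image = np.empty(labels.shape + (3,), dtype=np.uint8)
    image[...] = void_color
    valid = (labels >= 0) & (labels < len(palette))
    image[valid] = palette[labels[valid]]
    return image


# Small example using class indices that appear in the matrix above.
example = np.array([[[2, 3], [9, 0]]])
colored = colorize_labels(example)
print(colored.shape)
```

If only two colors show up in the output, it is worth checking that the palette has an entry for every class index present in the prediction.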

With the best regards

daicoolb commented 6 years ago

@BBarbosa Hi, can you show me your code here?

daicoolb commented 6 years ago

@TobyPDE I have tried it, but it does not seem to work.

BBarbosa commented 6 years ago

Hi @daicoolb! These are the files I have changed:

I'm using OpenCV 3.3 because I had problems loading a video file with OpenCV 2.4. You can download the files from here: frrn.zip

hans41 commented 6 years ago

@BBarbosa Thanks for your changes! I have used your new scripts to run predictions on my own data. "predictions[0]" now has 3 dimensions instead of 2, for example (1, 800, 1280) vs. (800, 1280). Why is that? Also, with the same input images, "mypredict.py" gives me different results from "predict.py". It seems the "image" you feed to "pred_fn(image)" is not the same as the "batch[0]" the author feeds to "val_fn(batch[0], batch[1])".

BBarbosa commented 6 years ago

@hans41 You're welcome. I hope it was useful. Technically, (1, 800, 1280) and (800, 1280) are equivalent: they hold the same values, and the extra leading axis has length 1. As for the input images, I had problems feeding images to the network with the author's provider implementation, so I adapted my own following what @TobyPDE described here. The outputs of mypredict.py may differ from those of predict.py because some pre-processing operations may have been skipped.
This is an example of what I got. It works really well, even from a different angle than the one it was trained on. (attached image: ped)
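To illustrate the shape question above, here is a small sketch showing that a (1, 800, 1280) array and its (800, 1280) counterpart carry the same data. The helper name drop_batch_axis is hypothetical, and reading the leading axis as a batch axis of size 1 is an assumption, not something confirmed by the scripts.

```python
import numpy as np


def drop_batch_axis(prediction):
    """Return an (H, W) view of a (1, H, W) prediction; pass other shapes through."""
    prediction = np.asarray(prediction)
    # Assumption: a leading axis of length 1 is a batch axis holding one image.
    if prediction.ndim == 3 and prediction.shape[0] == 1:
        return prediction[0]
    return prediction


batched = np.zeros((1, 800, 1280), dtype=np.int32)
single = drop_batch_axis(batched)

print(batched.shape)  # (1, 800, 1280)
print(single.shape)   # (800, 1280)
print(np.array_equal(batched[0], single))  # True
```

Code that expects a 2-D label map, such as an image writer, can call a helper like this first so both shapes are handled the same way.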