DavidGillsjo / SRW-Net

Semantic Room Wireframe Detection from a single perspective image
MIT License

Prediction quality #4

Closed ariel415el closed 2 years ago

ariel415el commented 2 years ago

Hi, I tried this repo on some random room/corridor images from the web, but I'm not sure whether what I'm getting is the expected result, as the outputs are inferior to the example in the README. Can you share your opinion on the prediction quality?

[Attached images: output, output (1), output (2)]

DavidGillsjo commented 2 years ago

Hi, the image in the README is an annotated image (as stated in the caption) used to train the network. Please see the paper for results.

As with all neural networks, the method will perform worse when faced with data outside the training distribution. The network is trained on synthetic data, and I expect the performance to vary on real-world images.

To tune the result you may adjust the PIXEL_MEAN and PIXEL_STD in the config file layout-SRW-S3D.yaml to better match the image pixel value distribution of your data. You may also decrease the SCORE_THRESHOLD to visualize more predictions in the images. All lines are returned in the json file together with their score.
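As a rough illustration of the PIXEL_MEAN / PIXEL_STD suggestion, here is a minimal Python sketch that computes per-channel statistics over a set of images. It makes several assumptions that are not confirmed in this thread: that the config expects values on the 0-255 scale, that the channel order of your pixels matches what the repo expects, and that each image is available as an iterable of three-channel pixel tuples. The function name `channel_stats` is made up for this example; check the repo's config and data loader before copying the numbers over.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Pixel = Sequence[float]  # one pixel as (c0, c1, c2), assumed to be in the 0-255 range


def channel_stats(images: Iterable[Iterable[Pixel]]) -> Tuple[List[float], List[float]]:
    """Return per-channel (mean, std) accumulated over all pixels of all images."""
    count = 0
    sums = [0.0, 0.0, 0.0]
    sq_sums = [0.0, 0.0, 0.0]
    for image in images:
        for pixel in image:
            count += 1
            for c in range(3):
                sums[c] += pixel[c]
                sq_sums[c] += pixel[c] * pixel[c]
    if count == 0:
        raise ValueError("no pixels given")
    means = [s / count for s in sums]
    # Population variance E[x^2] - E[x]^2, clamped at 0 against rounding error.
    stds = [math.sqrt(max(sq / count - m * m, 0.0)) for sq, m in zip(sq_sums, means)]
    return means, stds


# Tiny synthetic example: two "images" of two pixels each (not real data).
images = [
    [(0, 10, 20), (2, 10, 40)],
    [(4, 10, 60), (6, 10, 80)],
]
mean, std = channel_stats(images)
print("PIXEL_MEAN:", [round(m, 3) for m in mean])
print("PIXEL_STD:", [round(s, 3) for s in std])
```

With real data you would feed in the pixels of your own images, loaded with whatever image library you prefer, and paste the resulting values into the config.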

If you want to go further you may look into papers on Domain Transfer or try to find a real world dataset with annotations you may use to fine-tune the performance using Transfer Learning.

To comment on the performance in these particular images, I would say that:

  1. Doesn't look that good; maybe the lamp is confusing the network.
  2. Looks good to me; you may lower the SCORE_THRESHOLD to visualize the ceiling lines as well.
  3. OK. The door lines seem to generate too many false positives; you can probably decrease SCORE_THRESHOLD to see the missing ceiling and floor lines.
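Since all lines are written to the json file together with their score, the effect of different thresholds can also be explored offline without re-running the network. The sketch below shows the idea in Python. The JSON layout used here (a `lines` list with `endpoints` and `score` keys) is hypothetical and only serves the illustration; the actual file produced by the repo may be structured differently, so adapt the key names accordingly.

```python
import json

# Hypothetical JSON layout; the real output file's keys may differ.
raw = """
{"lines": [
  {"endpoints": [[10, 20], [200, 20]], "score": 0.95},
  {"endpoints": [[10, 20], [10, 300]], "score": 0.62},
  {"endpoints": [[50, 60], [80, 90]], "score": 0.31}
]}
"""


def lines_above(prediction: dict, threshold: float) -> list:
    """Keep lines whose score is at least `threshold`, highest score first."""
    kept = [ln for ln in prediction["lines"] if ln["score"] >= threshold]
    return sorted(kept, key=lambda ln: ln["score"], reverse=True)


prediction = json.loads(raw)
for threshold in (0.9, 0.5, 0.3):
    kept = lines_above(prediction, threshold)
    print(f"threshold={threshold}: {len(kept)} line(s) kept")
```

Lowering the threshold only ever adds lines, so sweeping it this way is a quick check of whether the missing ceiling and floor lines were predicted at all.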

The intended use of the method was to generate detections that could be used in a multi-view setup, so while each image does not give a full model, we hope to use detections from multiple images to generate a complete room model.

I hope that answers your question.

ariel415el commented 2 years ago

Hi @DavidGillsjo,

Thanks for the detailed response.

The reason I'm asking is that I'm building a Replicate (https://replicate.com/) demo for this repo where people can run predictions online with your model, and I want to make sure that the results they get are as intended.

You can view the in-progress demo at https://replicate.com/davidgillsjo/srw-net and, once I finish, I will open a pull request here with the short demo script that runs online.

As you may see, the demo allows controlling the confidence threshold, but for the samples I attached above it didn't help much.

Do you think subtracting the ImageNet mean and std instead of the input statistics would be better here?

DavidGillsjo commented 2 years ago

Wow, that looks neat! Regarding the mean and std I'm not sure ImageNet would be representative either. I think an indoor dataset like ScanNet or Matterport would be better.
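For reference, the normalization under discussion is the usual per-channel `(pixel - mean) / std`. The Python sketch below applies it with the widely used ImageNet statistics, rescaled from the [0, 1] range to 0-255. The RGB channel order and the 0-255 scale are assumptions here, not something confirmed for this repo, and `normalize_pixel` is a name made up for the example. Statistics computed from an indoor dataset would be swapped in the same way.

```python
from typing import List, Sequence

# Widely used ImageNet statistics in RGB order, rescaled from [0, 1] to [0, 255].
IMAGENET_MEAN = [0.485 * 255, 0.456 * 255, 0.406 * 255]
IMAGENET_STD = [0.229 * 255, 0.224 * 255, 0.225 * 255]


def normalize_pixel(pixel: Sequence[float],
                    mean: Sequence[float],
                    std: Sequence[float]) -> List[float]:
    """Apply per-channel (value - mean) / std to a single three-channel pixel."""
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]


# A mid-gray pixel, normalized with the ImageNet statistics.
mid_gray = (128, 128, 128)
print(normalize_pixel(mid_gray, IMAGENET_MEAN, IMAGENET_STD))
```

If the statistics do not match the data the network sees, the normalized inputs are shifted and scaled away from what was seen during training, which is why the choice of dataset matters.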

I'll close this and answer the PR.