ywyue / RoomFormer

[CVPR 2023] RoomFormer: Two-level Queries for Single-stage Floorplan Reconstruction
https://ywyue.github.io/RoomFormer/
MIT License
160 stars 22 forks

About the performance on outdoor scene. #22

Closed HuQ1an closed 2 weeks ago

HuQ1an commented 1 month ago

Nice work! Thank you for releasing your code. Have you tried RoomFormer on an outdoor dataset, such as the cities_dataset used in HEAT? If so, how does it perform? I am currently trying to use RoomFormer as a baseline for the outdoor reconstruction task, and my labels are also in MS COCO format. Apart from the necessary changes (adapting the model to three-channel input and changing the image size), I didn't tweak any other parameters of the model. However, RoomFormer performs much worse on these satellite images.

I think a possible reason is that the number of corners per polygon and the number of polygons per image vary. But I tried increasing both and didn't get the desired results. Could you please give me some hints about the poor performance on the outdoor dataset? Thank you very much!

ywyue commented 1 month ago

Hi, thanks for your interest in our work!

Yes, I gave RoomFormer a quick try on the CrowdAI Mapping Challenge dataset (a long time ago) and it did give reasonable results. Outdoor building reconstruction has its own challenges, e.g., satellite images are not perfectly orthographic, and buildings are often occluded by trees or shadows. In theory, the model can still learn to reconstruct the building boundary from the global context of the image.

Please make sure that --num_polys is at least the maximum number of buildings in an image, and that --num_queries is at least --num_polys × the maximum number of corners in a building. Outdoor datasets may contain curved walls, which are annotated with many corners. I would suggest simplifying those shapes (i.e. reducing the number of corners in curved walls) to avoid setting a very large --num_queries.
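
The two lower bounds above can be estimated directly from a COCO-style annotation file. What follows is a minimal sketch and not code from the RoomFormer repository: the helper names (`simplify_ring`, `query_budget`) and the `tol` parameter are hypothetical, and Douglas-Peucker is just one possible choice for reducing the corners of curved walls. It assumes each annotation is one building and each `segmentation` ring is a flat `[x1, y1, x2, y2, ...]` list.

```python
import math
from collections import defaultdict


def _dist_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # degenerate segment, e.g. a closed ring where a == b
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_polyline(points, tol):
    """Douglas-Peucker: keep a point only if it deviates more than tol."""
    if len(points) < 3:
        return list(points)
    dists = [_dist_to_segment(p, points[0], points[-1]) for p in points[1:-1]]
    k = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[k - 1] <= tol:
        return [points[0], points[-1]]
    left = simplify_polyline(points[:k + 1], tol)
    right = simplify_polyline(points[k:], tol)
    return left[:-1] + right


def simplify_ring(flat, tol):
    """Simplify a flat COCO ring and return its unique corners as (x, y)."""
    pts = list(zip(flat[0::2], flat[1::2]))
    if len(pts) > 1 and pts[0] == pts[-1]:
        pts = pts[:-1]  # drop a repeated closing point if present
    closed = pts + pts[:1]
    return simplify_polyline(closed, tol)[:-1]


def query_budget(coco, tol=0.0):
    """Return lower bounds (num_polys, num_queries) for a COCO-style dict."""
    polys_per_image = defaultdict(int)
    max_corners = 0
    for ann in coco["annotations"]:
        polys_per_image[ann["image_id"]] += 1  # assumption: one building per annotation
        for ring in ann["segmentation"]:
            max_corners = max(max_corners, len(simplify_ring(ring, tol)))
    num_polys = max(polys_per_image.values(), default=0)
    return num_polys, num_polys * max_corners
```

Calling `query_budget(coco, tol=1.0)` on the loaded training annotations gives values that --num_polys and --num_queries should not fall below. The tolerance is in pixels, and a suitable value depends on the image resolution.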

Also, as a sanity check, please first overfit the model on a single image. It should give an (almost) perfect prediction if there are no other issues.

HuQ1an commented 1 month ago

Thanks for your reply! I have set --num_polys to cover almost the maximum number of buildings per image in the dataset, and set --num_queries likewise. The room_prec and corner_prec metrics are quite poor, about 10.0. From the visualizations, the model seems able to locate the buildings but cannot predict the corners precisely. Do you have any idea what causes these results?

[images: output, gt_labels]

BTW, what do you mean by 'sanity check'? I have tried training the model on one image and adjusting the learning rate, but the model doesn't seem to converge properly. Thanks again for your kind help.

ywyue commented 1 month ago

Hi @HuQ1an, here are some results which I obtained in a quick experiment on the CrowdAI Mapping Challenge dataset a long time ago.

I didn't calculate the room_prec and corner_prec metrics, but instead AP, AR, and IoU / C-IoU. I'm sharing the results here in case they are helpful to you. I didn't fine-tune the model parameters, and the model was not trained to convergence. More parameter tuning and longer training are expected to improve the results further.

Some predictions at epoch 12: [image]

Metrics at epoch 32: [image]

By 'sanity check', I mean making sure there are no issues in the code, e.g. in your dataloader or evaluation. If everything is fine, the model should be able to overfit a single image and give an almost perfect prediction. After that, it makes sense to scale the training up to the whole training set. Feel free to let me know if you have further questions.
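
One way to set up such a single-image run is to cut the annotation file down to one image and point both training and evaluation at it. This is a minimal sketch under the assumption that the annotations are a standard COCO-style dict with `images` and `annotations` lists. The helper name `single_image_subset` is hypothetical and not part of the RoomFormer code.

```python
def single_image_subset(coco, image_id):
    """Return a copy of a COCO-style dict restricted to one image."""
    # Keep everything else (e.g. "categories", "info") untouched.
    subset = {k: v for k, v in coco.items() if k not in ("images", "annotations")}
    subset["images"] = [im for im in coco["images"] if im["id"] == image_id]
    subset["annotations"] = [
        ann for ann in coco["annotations"] if ann["image_id"] == image_id
    ]
    if not subset["images"]:
        raise ValueError(f"image_id {image_id} not found")
    return subset
```

If the model cannot reproduce the ground truth of that one image after enough iterations, the problem is more likely in the data pipeline or evaluation than in the hyperparameters.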

HuQ1an commented 1 month ago

Hi @ywyue, thanks for your reply and patient help! Your predictions look quite reasonable compared to mine. Were the predictions you showed at epoch 12 from a model trained on the entire CrowdAI training dataset, or just on the official small training dataset? Compared to the model used for floorplan reconstruction, did you make any modifications to the code that are specific to outdoor scenes? And if possible, would you mind sharing the code you used for the CrowdAI dataset? It would be a great help to me!

HuQ1an commented 1 month ago

Hi @ywyue, I have one more question. RoomFormer predicts a confidence score for each corner but not for each building instance, so how do you calculate the AP, AP50, ... metrics? Thanks for your help!

ywyue commented 1 month ago

Hi @HuQ1an, sorry for the late reply! A busy week.

I think the predictions I showed at epoch 12 came from a model trained on the entire CrowdAI training dataset. I wasn't aware that they also have an "official small training dataset".

I didn't change much in the code. Because the CrowdAI dataset also provides COCO-format annotations, it fits naturally into the current dataloader (maybe only small changes are required). I can find the code for the CrowdAI dataset, but it is not clean. If you want, just leave your email and I will share it with you.

I followed the evaluation code from PolyWorld to calculate AP, AR, and IoU / C-IoU. I set all the scores to 1, as PolyWorld does; you can also check their prediction format here.
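
As an illustration of setting every score to 1, here is a minimal sketch that turns predicted polygons into entries in the COCO results format (`image_id`, `category_id`, `segmentation`, `bbox`, `score`). The helper name `to_coco_results` and the input layout are assumptions made for this example, and whether the output matches PolyWorld's prediction files field for field should be checked against their repository.

```python
def to_coco_results(predictions, category_id):
    """Convert {image_id: [[(x, y), ...], ...]} into a COCO results list.

    Every instance gets score 1.0, since no per-building confidence exists.
    """
    results = []
    for image_id, polygons in predictions.items():
        for poly in polygons:
            if len(poly) < 3:  # skip degenerate predictions
                continue
            xs = [float(x) for x, _ in poly]
            ys = [float(y) for _, y in poly]
            results.append({
                "image_id": image_id,
                "category_id": category_id,
                # COCO polygons are flat lists: [x1, y1, x2, y2, ...]
                "segmentation": [[c for xy in zip(xs, ys) for c in xy]],
                # COCO boxes are [x_min, y_min, width, height]
                "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
                "score": 1.0,
            })
    return results
```

The `category_id` must match the building category in the ground-truth annotation file. With a constant score, the ranking among predictions carries no information, so the resulting AP mainly reflects which polygons match the ground truth at each IoU threshold.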

HuQ1an commented 1 month ago

@ywyue Hi! My email is huq1an@whu.edu.cn

Thanks for your help!

ywyue commented 2 weeks ago

@HuQ1an Shared via email. Feel free to reach out if you have more questions. Closed for now.