astra-vision / MonoScene

[CVPR 2022] "MonoScene: Monocular 3D Semantic Scene Completion": 3D Semantic Occupancy Prediction from a single image
https://astra-vision.github.io/MonoScene/
Apache License 2.0

Inference on other datasets #58

Closed hanzh0816 closed 1 year ago

hanzh0816 commented 1 year ago

Hello authors, first of all, thank you for your great work and for open-sourcing it. I want to apply your work to the CVUSA dataset, but when I tested it with the demo you put on Hugging Face, the results were not good. Can you please help me with this issue? Could you give me some guidance on how to improve the model's performance?

anhquancao commented 1 year ago

Hi @Alex-Hanzh, you need to retrain the method on your dataset. The model in the Hugging Face demo is trained on the SemanticKITTI dataset, so it is overfitted to the camera parameters and the scenery of SemanticKITTI.
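To illustrate why camera parameters matter, here is a minimal sketch of pinhole projection. It is not taken from the MonoScene code base. The function `project_point` and both sets of intrinsics (`cam_a`, `cam_b`) are hypothetical and made up for illustration; they are not the real SemanticKITTI or CVUSA calibrations. The sketch shows that the same 3D location lands on different pixels under different intrinsics, so a model trained with one camera implicitly learns a 2D-3D mapping that may not hold for another.

```python
def project_point(point_xyz, fx, fy, cx, cy):
    """Project a 3D point in the camera frame (x right, y down, z forward)
    to pixel coordinates using a pinhole model. Hypothetical helper."""
    x, y, z = point_xyz
    if z <= 0:
        return None  # point is behind the camera, so it has no projection
    u = fx * x / z + cx
    v = fy * y / z + cy
    return (u, v)


# Illustrative intrinsics only (assumed values, not real calibrations).
cam_a = dict(fx=700.0, fy=700.0, cx=600.0, cy=180.0)
cam_b = dict(fx=350.0, fy=350.0, cx=320.0, cy=240.0)

# One 3D location, e.g. the center of a voxel, 10 m in front of the camera.
voxel_center = (2.0, 1.0, 10.0)

pixel_a = project_point(voxel_center, **cam_a)
pixel_b = project_point(voxel_center, **cam_b)

print(pixel_a)  # (740.0, 250.0)
print(pixel_b)  # (390.0, 275.0)
```

Correcting the intrinsics at inference time only addresses this geometric mismatch; it would not, on its own, address the difference in scenery between datasets.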

hanzh0816 commented 1 year ago

@anhquancao Thanks for your reply! Because the CVUSA dataset has no 3D semantic annotations, I am not sure how to train the model on it. Can I improve the model's performance on this dataset by modifying the camera parameters? If so, could you please give me some guidance on which parts to modify?

anhquancao commented 1 year ago

Hi @Alex-Hanzh, the current method requires 3D annotations. I don't think it can work without retraining.

hanzh0816 commented 1 year ago

Got it! Thanks for your help