Open · abhigoku10 opened this issue 1 year ago
Hi @abhigoku10, we appreciate your interest in our work.
We have never tried similar datasets before. As stated in our paper, Sem2NeRF is mainly designed to "take as input only one single-view semantic mask of a specific category". The main reason is that Sem2NeRF relies on a pre-trained NeRF-based generative model to decode the latent code, so the current version is limited to data that can be modelled by the corresponding generative model, i.e., pi-GAN.
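For intuition, here is a minimal, self-contained PyTorch sketch of that pipeline: an encoder maps a semantic mask to a latent code, and a frozen pre-trained generator decodes it. All the names here (`MaskEncoder`, `FrozenGenerator`, the 19-class mask, the 256-d latent) are illustrative stand-ins, not the actual Sem2NeRF or pi-GAN API:

```python
import torch
import torch.nn as nn

class MaskEncoder(nn.Module):
    """Hypothetical encoder: one-hot semantic mask -> latent code."""
    def __init__(self, num_classes=19, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, mask):
        return self.net(mask)

class FrozenGenerator(nn.Module):
    """Toy stand-in for the pre-trained NeRF-based generator (pi-GAN in the
    paper): decodes a latent code into a 2D rendering of the 3D content."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.decode = nn.Sequential(
            nn.Linear(latent_dim, 3 * 64 * 64),
            nn.Unflatten(1, (3, 64, 64)),
            nn.Sigmoid(),
        )

    def forward(self, z):
        return self.decode(z)

encoder = MaskEncoder()
generator = FrozenGenerator()
generator.requires_grad_(False)  # the decoder stays fixed; only the encoder is trained

mask = torch.randn(1, 19, 128, 128)  # dummy one-hot-style semantic mask
image = generator(encoder(mask))     # (1, 3, 64, 64) rendering
```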
Our suggestion is that you could look for a more powerful decoder that generates 3D-aware content similar to Cityscapes scenes, and then re-train our encoder to map the semantic mask to that generated 3D content, roughly as sketched below. Hope this helps.
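Continuing the toy stubs above, here is a hedged sketch of what that encoder re-training might look like, assuming a frozen generator already capable of Cityscapes-like scenes and paired (mask, photo) data; the batch shapes and the plain L1 loss are placeholders, not the losses used in the paper:

```python
import torch
import torch.nn as nn

# Re-uses MaskEncoder / FrozenGenerator from the sketch above; imagine the
# generator here is one pre-trained on Cityscapes-like 3D-aware scenes.
encoder = MaskEncoder()
generator = FrozenGenerator()
generator.requires_grad_(False)  # keep the decoder frozen

optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)
recon_loss = nn.L1Loss()

for step in range(100):
    # Dummy batch; in practice you would load paired (semantic mask, photo)
    # samples from Cityscapes / BDD100K.
    mask = torch.randn(4, 19, 128, 128)
    target = torch.rand(4, 3, 64, 64)

    pred = generator(encoder(mask))  # frozen decoder, trainable encoder
    loss = recon_loss(pred, target)  # a real setup would likely add perceptual losses

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```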
@donydchen @QianyiWu thanks for sharing this wonderful work. I just wanted to know whether we can test it on automotive images such as the Cityscapes or BDD100K datasets, or whether you have tested that in your work? If not, should we re-train the model on the new dataset?
Thanks in advance