chenguolin / InstructScene

[ICLR 2024 spotlight] Official implementation of "InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior".
https://chenguolin.github.io/projects/InstructScene

Can I input my own instruction to synthesize a 3D scene? #2

Closed hmmdxzz closed 6 months ago

hmmdxzz commented 6 months ago

Can instructions only be generated from the datasets? Can I generate and visualize 3D scenes by inputting my own instruction? In the code, instructions are generated by analyzing the furniture in a layout and its spatial relationships; how can I input my own spatial-relationship description to generate a scene?

chenguolin commented 6 months ago

We create instructions using templates based on the dataset information for training and quantitative evaluation: https://github.com/chenguolin/InstructScene/blob/d6950e929de77e26e07acbe5909269bc8252f827/src/train_sg.py#L230
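For reference, here is a minimal sketch (not the repository's actual code) of how such template-based instructions can be assembled from two object classes and a spatial relation; the template strings and the function name are illustrative only:

```python
import random

# Hypothetical relation templates; the actual templates live in src/train_sg.py.
TEMPLATES = [
    "Put a {obj1} to the {rel} side of a {obj2}",
    "Position a {obj1} to the {rel} side of a {obj2}",
]

def make_instruction(obj1: str, obj2: str, rel: str) -> str:
    """Fill a randomly chosen template with two object classes and a relation."""
    return random.choice(TEMPLATES).format(obj1=obj1, obj2=obj2, rel=rel)

print(make_instruction("nightstand", "double bed", "left"))
# e.g. "Put a nightstand to the left side of a double bed"
```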

However, you can provide your own instructions at inference time by replacing texts, which is a list of instruction strings: https://github.com/chenguolin/InstructScene/blob/d6950e929de77e26e07acbe5909269bc8252f827/src/generate_sg.py#L333
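As a minimal sketch, assuming texts in src/generate_sg.py is a plain Python list of instruction strings consumed by the sampling code (the sample_scenes call below is hypothetical):

```python
# Custom instructions to condition the scene generation on.
texts = [
    "Put a wardrobe to the right side of a double bed",
    "Position a nightstand to the left side of a double bed",
]

# Hypothetical call site: pass the custom list in place of the
# dataset-derived instructions used by default in generate_sg.py.
# scenes = sample_scenes(model, texts=texts)
```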

For example, in the stylization task, we construct different forms of instructions: https://github.com/chenguolin/InstructScene/blob/d6950e929de77e26e07acbe5909269bc8252f827/src/stylize_sg.py#L296

This works because we extract features from text instructions with the CLIP text encoder, which was pretrained on a large-scale image-text dataset and generalizes to some extent beyond the training instructions.
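Here is a minimal sketch of that feature extraction with a frozen CLIP text encoder, using the Hugging Face transformers checkpoint openai/clip-vit-base-patch32 as an assumed stand-in for the encoder used in the paper:

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer

# Assumed checkpoint; the repository may load a different CLIP variant.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32").eval()

texts = ["Put a nightstand to the left side of a double bed"]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = text_encoder(**inputs)

# Per-token features (useful for cross-attention conditioning) and a pooled
# sentence embedding are both available from the encoder output.
token_features = outputs.last_hidden_state  # [batch, seq_len, hidden_dim]
sentence_feature = outputs.pooler_output    # [batch, hidden_dim]
```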

If your instructions differ significantly from those used during training (e.g., "Put/Position a xxx to the left/right side of a yyy"), it would be more effective to retrain the model on your own text-scene dataset.