SceneGenAgent is an LLM-based agent for generating industrial scenes through C# code, aimed at meeting the demand for precise measurements and positioning in industrial scene generation. SceneGenAgent ensures precise layout planning through a structured and calculable format, layout verification, and iterative refinement to meet the quantitative requirements of industrial scenarios. Experimental results demonstrate that LLMs powered by SceneGenAgent exceed their original performance, reaching up to an 81.0% success rate on real-world industrial scene generation tasks and effectively meeting most scene generation requirements. To further enhance accessibility, we construct SceneInstruct, a dataset designed for fine-tuning open-source LLMs to integrate into SceneGenAgent. Experiments show that fine-tuning open-source LLMs on SceneInstruct yields significant performance improvements, with Llama3.1-70B approaching the capabilities of GPT-4o.
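The plan-verify-refine loop described above can be sketched as follows. This is an illustrative outline only, not SceneGenAgent's actual implementation: `generate_layout`, `verify_layout`, and `refine_layout` are hypothetical stand-ins for the LLM-backed steps, and the spacing check is just one example of a quantitative constraint.

```python
# Hypothetical sketch of a plan-verify-refine loop for scene layout.
# All function names and the toy layout below are illustrative, not the
# actual SceneGenAgent API.

def generate_layout(description):
    # In SceneGenAgent, an LLM produces a structured, calculable layout
    # plan; here we return a toy plan mapping objects to (x, y) positions.
    return {"conveyor": (0.0, 0.0), "robot": (1.5, 0.0)}

def verify_layout(layout, min_spacing=1.0):
    # Check a quantitative constraint: every pair of objects must be at
    # least `min_spacing` meters apart. Returns a list of violations.
    items = list(layout.items())
    errors = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            (na, (xa, ya)), (nb, (xb, yb)) = items[i], items[j]
            dist = ((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5
            if dist < min_spacing:
                errors.append(f"{na} and {nb} are only {dist:.2f} m apart")
    return errors

def refine_layout(layout, errors):
    # A real agent would feed the verification errors back to the LLM for
    # refinement; as a placeholder we spread objects out along the x-axis.
    return {name: (x + k * 0.5, y)
            for k, (name, (x, y)) in enumerate(layout.items())}

def plan_scene(description, max_rounds=3):
    # Iteratively refine the layout until verification passes.
    layout = generate_layout(description)
    for _ in range(max_rounds):
        errors = verify_layout(layout)
        if not errors:
            break
        layout = refine_layout(layout, errors)
    return layout
```

In the real system the verified layout is then rendered as C# code; the loop above only captures the control flow that enforces quantitative requirements before code generation.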
This repository contains the code for deploying SceneGenAgent as a Gradio application and for training models on SceneInstruct to be integrated into SceneGenAgent.
```bash
git clone https://github.com/THUDM/SceneGenAgent.git
cd SceneGenAgent
pip install -r requirements.txt
```
To run the SceneGenAgent demo with API-based or offline models, please refer to the instructions in inference.md.
To train models with SceneInstruct as the backbone of SceneGenAgent, please follow the instructions in train.md.
To reproduce the procedure of building the SceneInstruct dataset, please follow dataset.md.
We evaluate the performance of industrial scene generation by curating a benchmark containing scene descriptions written by engineers and manually checking the correctness of each generation. Here are our evaluation results:
To run generation on our benchmark, please follow the instructions in inference.md.
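Since each generation is judged by a manual correctness check, the reported success rate is simply the fraction of benchmark descriptions whose generated scenes pass that check. A minimal sketch, with hypothetical field names and made-up example data rather than the benchmark's actual schema:

```python
# Illustrative success-rate computation over manually checked generations.
# The records and the "correct" field are hypothetical example data.
results = [
    {"scene": "two robots beside a conveyor", "correct": True},
    {"scene": "fenced welding cell", "correct": True},
    {"scene": "overlapping work stations", "correct": False},
]

success_rate = 100 * sum(r["correct"] for r in results) / len(results)
print(f"Success rate: {success_rate:.1f}%")
```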
If you find our work useful, please cite using this BibTeX:
```
@article{xia2024scenegenagent,
  title={SceneGenAgent: Precise Industrial Scene Generation with Coding Agent},
  author={Xiao Xia and Dan Zhang and Zibo Liao and Zhenyu Hou and Tianrui Sun and Jing Li and Ling Fu and Yuxiao Dong},
  year={2024},
  eprint={2410.21909},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
}
```