microsoft / LLMR

MIT License

LLMR: Real-time Prompting of Interactive Worlds using Large Language Models

Introduction

This repo contains the code described in LLMR, implementing the Large Language Model for Mixed Reality framework.

This package serves as a prototype for "Speaking the world into existence", which allows the real-time creation of objects, tools, and scenes with visual, behavioral, and interactive elements through natural language. Our framework combines prompt-based generation with Unity to enable spontaneous user creation at run time, a core element of VR since its inception. The package comes with several demo scenes for you to experiment with: an empty playground in which you can create objects in isolation, and scenes that use Dall-E and CLIP to find 3D models that are visually and textually similar to your prompt.

Example use cases:

Installation

Setting up a Unity Project

The project has been tested on Unity versions 2021.3.25f1 and 2022.3.11f1 and has several dependencies. The largest of these is the Roslyn C# compiler. For this project we used the implementation by Trivial Interactive, which you have to purchase on the Unity Asset Store (C# Compiler). The project could be adapted to depend on the open-source implementation of C# instead, but that would require implementing the attachment of compiled code to GameObjects, which the Trivial Interactive implementation already provides. To get this project to function, you therefore need to add the compiler to it.

Once the compiler has been added, the project should be ready as is. Note that the run-time compilation of code generated by LLMs depends on preloaded assembly DLLs, which we provide in the Assemblies folder. To get these to register in the app, you need to create Assembly Reference Assets with the help of the Trivial Interactive package.

If you would like to add this functionality to an existing project, you will need to export all of the elements as a package (including the compiler) and follow these steps:

Also copy over lines 48-59

,
"scopedRegistries": [
    {
        "name": "OpenUPM",
        "url": "https://package.openupm.com",
        "scopes": [
            "com.openai",
            "com.utilities",
            "com.atteneder"
        ]
    }
]

and append them as a new field after dependencies.
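If you prefer not to edit the file by hand, the merge can be scripted. The following is a minimal sketch, not part of this repo: it assumes the target file is the Unity package manifest (typically Packages/manifest.json), and the function names add_scoped_registry and patch_manifest_file are hypothetical. It adds the OpenUPM entry shown above and avoids duplicating it if it is already present.

```python
import json
from pathlib import Path

# Registry entry copied from the snippet above.
OPENUPM_REGISTRY = {
    "name": "OpenUPM",
    "url": "https://package.openupm.com",
    "scopes": ["com.openai", "com.utilities", "com.atteneder"],
}


def add_scoped_registry(manifest: dict, registry: dict) -> dict:
    """Return a copy of `manifest` with `registry` merged into scopedRegistries."""
    merged = dict(manifest)
    registries = [dict(r) for r in manifest.get("scopedRegistries", [])]
    for existing in registries:
        if existing.get("url") == registry["url"]:
            # Same registry already present: only add the scopes that are missing.
            scopes = list(existing.get("scopes", []))
            scopes += [s for s in registry["scopes"] if s not in scopes]
            existing["scopes"] = scopes
            break
    else:
        # No entry with this URL yet: append a fresh copy.
        registries.append(dict(registry))
    merged["scopedRegistries"] = registries
    return merged


def patch_manifest_file(path: Path) -> None:
    """Rewrite the manifest at `path` in place (assumed: Packages/manifest.json)."""
    manifest = json.loads(path.read_text(encoding="utf-8"))
    patched = add_scoped_registry(manifest, OPENUPM_REGISTRY)
    path.write_text(json.dumps(patched, indent=2) + "\n", encoding="utf-8")
```

Calling patch_manifest_file(Path("Packages/manifest.json")) from the root of the existing project would apply the change. Back up the manifest first, since the file is rewritten with fresh indentation.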

Python flask app for CLIP-DallE integration

You will need Python 3.9 or higher on your system. You can install the requirements in either a Python virtual environment or a conda environment.

After you clone the repo, create a Python virtual environment inside the local folder:

cd CLIP-DallE-SketchFab-001
python -m venv .venv
source .venv/bin/activate

Then, install the list of requirements:

pip install -r requirements.txt

You can now run the flask app:

flask run 

In your terminal, you should see something like the image below. The app is now running on your local server and you can go back to Unity and enable the use of this app.

[Image: terminal output of the running flask app]
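Before switching back to Unity, you can confirm that something is actually listening on the flask port. This is a small sketch under the assumption that the app uses flask's default address, 127.0.0.1:5000; adjust the host and port if you started it differently. The helper name is_listening is hypothetical and not part of the repo.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 5000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A result of True only shows that a server accepts connections on that port. It does not verify that the server is this particular app or that its models loaded correctly.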

If you encounter any import or dependency issues, you can run python app.py to see the error messages. Install any missing dependencies as needed.
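To see which requirements are missing without starting the app, you could compare requirements.txt against the installed distributions. The sketch below is one possible approach using the standard library's importlib.metadata; the function names are hypothetical, and the parsing is deliberately simple (it handles plain "name", "name==version", and "name[extra]>=version" lines, and skips pip options and comments, but not URL-based requirements).

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract distribution names from requirements.txt-style lines."""
    names = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and pip options such as -r or -e
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_distributions(names):
    """Return the names that importlib.metadata cannot find in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

With the virtual environment activated, missing_distributions(requirement_names(open("requirements.txt"))) returns the list of names that still need to be installed. This checks only that each distribution is present, not that its version satisfies the pinned constraint.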

Demo Scenes

Empty Playground

Remarks:

Dall-E & CLIP Refinement for 3D Object Creation


Set Up for CLIP-DallE-SketchFab-001 Flask App

First, clone the repo for the flask app or unzip the folder "CLIP-DallE-SketchFab-001-main.zip".

git clone https://github.com/delamarifer/CLIP-DallE-SketchFab-001.git

This flask app integrates several AI models to create and search for 3D models based on natural language prompts. It can be used inside Unity to generate scenes or objects from text descriptions. The app provides the following functionalities:

The demo_use_app.py file demonstrates how to use all three functionalities.

Remarks

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Data Privacy Notice

Please see: Data Privacy Notice