microsoft / LLMR

MIT License

Potential for a halting problem? #7

Closed hoonsubin closed 3 months ago

hoonsubin commented 5 months ago

Hello!

The research paper "LLMR: Real-time Prompting of Interactive Worlds using Large Language Models" led me to this repo. I really enjoyed going through the project!

I have a question regarding the Builder-Inspector paradigm. The paper mentions that adding the relevant skills can reduce the back-propagation process, but is the Inspector module's correctness deterministic? I assumed that generative AI could be non-deterministic by nature (or, I like to say, less-deterministic), but if the code generated by the Builder goes through the Inspector until there is no error, isn't there a potential for a halting problem? How do you deal with this issue?

Apologies if this is a stupid question, I'm still learning all of this :)

Kappa666 commented 5 months ago

In principle you're right, but we decided to be practical: we cap the number of iterations, with a default of about 3. In practice, you just have to accept that some tasks might not be resolved and may have to be prompted differently.
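A minimal sketch of what such an iteration cap could look like. This is not the LLMR implementation: `build_with_cap`, `Result`, and the `builder`/`inspector` callables are hypothetical stand-ins for the Builder and Inspector modules, and the default of 3 simply mirrors the number mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical interfaces (assumptions, not the LLMR API):
#   builder(prompt, feedback) -> generated code
#   inspector(code) -> error message, or None if the code passes
Builder = Callable[[str, Optional[str]], str]
Inspector = Callable[[str], Optional[str]]


@dataclass
class Result:
    code: Optional[str]        # accepted code, or None if the loop gave up
    attempts: int              # number of Builder calls made
    last_error: Optional[str]  # last Inspector complaint, if any


def build_with_cap(prompt: str, builder: Builder, inspector: Inspector,
                   max_iterations: int = 3) -> Result:
    """Run the Builder/Inspector loop, but never more than max_iterations times."""
    feedback: Optional[str] = None
    for attempt in range(1, max_iterations + 1):
        code = builder(prompt, feedback)   # regenerate using the last feedback
        feedback = inspector(code)         # None means the Inspector accepted it
        if feedback is None:
            return Result(code, attempt, None)
    # Cap reached: report failure instead of looping forever.
    return Result(None, max_iterations, feedback)
```

Because the loop is a bounded `for`, it terminates regardless of what the two models return; the cost is that some prompts end with `code is None` and need to be rephrased by the user.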

Ideally though, you'd probably want something that intelligently decides that this loop has been going a while and that some other insights might be necessary.
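One crude way to approximate that idea is to stop early when the Inspector repeats a complaint it has already made, since further retries are then unlikely to help. The sketch below is only an illustration of that heuristic, not something the maintainers describe; `build_with_stall_check` and the status strings are hypothetical names, and exact string matching on feedback is an assumption that a real system would likely replace with something fuzzier.

```python
from typing import Callable, Optional, Set, Tuple


def build_with_stall_check(
    prompt: str,
    builder: Callable[[str, Optional[str]], str],   # hypothetical Builder stand-in
    inspector: Callable[[str], Optional[str]],      # hypothetical Inspector stand-in
    max_iterations: int = 3,
) -> Tuple[Optional[str], str]:
    """Return (code, status), where status is 'ok', 'stalled', or 'exhausted'."""
    seen_errors: Set[str] = set()
    feedback: Optional[str] = None
    for _ in range(max_iterations):
        code = builder(prompt, feedback)
        feedback = inspector(code)
        if feedback is None:
            return code, "ok"
        if feedback in seen_errors:
            # Same complaint as an earlier round: the loop is going in circles.
            return None, "stalled"
        seen_errors.add(feedback)
    return None, "exhausted"
```

A caller could treat `"stalled"` as the signal to bring in other insights, for example by asking the user to rephrase the prompt, rather than spending the remaining iterations.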

hoonsubin commented 5 months ago

Thanks for the response!

I see, so this is a potential corner case. I feel like adding an extra module (or modifying the Planner) that converts user prompts into a markup language or a custom DSL might make the generated output more reliable, at the cost of some operational flexibility. In my opinion, relying on free-form natural language can give unexpected results at every level, but I would love to hear your thoughts on this too!
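To make the trade-off concrete, here is a small sketch of what a constrained intermediate format could look like. Everything in it is an assumption for illustration: the JSON shape, the `action`/`shape` vocabulary, and `parse_command` are hypothetical and are not part of LLMR or its Planner.

```python
import json

# Hypothetical fixed vocabulary; a real DSL would be designed per application.
ALLOWED_ACTIONS = ("create", "move", "delete")
ALLOWED_SHAPES = ("cube", "sphere", "plane")


def parse_command(text: str) -> dict:
    """Parse one JSON command and reject anything outside the fixed vocabulary."""
    try:
        cmd = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(cmd, dict):
        raise ValueError("command must be a JSON object")
    if cmd.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {cmd.get('action')!r}")
    if cmd.get("shape") not in ALLOWED_SHAPES:
        raise ValueError(f"unknown shape: {cmd.get('shape')!r}")
    return cmd
```

The validation step is deterministic, so malformed or out-of-vocabulary requests fail before any code is generated. The flexibility cost is equally visible: anything the vocabulary does not name is rejected, even if the user's intent was reasonable.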

P.S., I was inspecting this framework to see if we can apply it to our new 2D game's custom playable map design. So I'm very curious to learn more.