blockentropy / ml-client

Machine Learning Clients for Open Source Infra
8 stars 3 forks source link

Raw message #11

Closed isamu-isozaki closed 5 months ago

isamu-isozaki commented 5 months ago

This pr introduces partial generation which allows users to generate from partially done assistant output. This is mainly helpful for constrained generation because currently, it's possible when prompting the AI with "Say Action Input: 'hi' Action Input: " for the output to not be 'hi' because of formatting for each language model where the above becomes an user prompt. To fix this issue I added a way to directly move part of the prompt after the formatting with a new parameter in request body partial_generation. Additionally, I did some minor fixes so that the global variables are not ignored and the server can completely reset. I tested locally and it worked

edk208 commented 5 months ago

oh that's an awesome feature