SynCode for fill-in-the-middle models

uiuc-focal-lab / syncode

Efficient and general syntactical decoding for Large Language Models

MIT License

198 stars, 16 forks

SynCode for fill-in-the-middle models #82

Open AzizCode92 opened 6 months ago

AzizCode92 commented 6 months ago

First of all, thank you for this great work! I would like to know whether SynCode can also work with fill-in-the-middle (FIM) models. If so, how?

shubhamugare commented 6 months ago

Hi @AzizCode92,

SynCode doesn't support FIM models at this point. Do you have a specific application in mind?

AzizCode92 commented 6 months ago

Thanks @shubhamugare. Not really. I was just wondering how SynCode would find the most probable next token when we have prefix and suffix parts along with grammar rules. Right now, testing it with a FIM model, it looks like the model follows the grammar by heart: it always starts with the first terminal I define in my grammar. IMHO, it should take the prefix into consideration and produce the next token from there while still respecting the grammar rules. It is a challenging task, I guess 😅

shubhamugare commented 6 months ago

Yeah, it is definitely not a straightforward extension of what currently exists within SynCode. In theory, one can use constraints based on the suffix (in addition to the constraints from the prefix, which SynCode currently uses) to find an appropriate stopping criterion for the FIM model.
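
To make the idea concrete, here is a minimal sketch on a toy grammar of balanced brackets. It is not SynCode code and uses no SynCode API; the names `open_stack`, `allowed_next`, and `can_stop` are hypothetical and exist only for illustration. The prefix-side check decides which tokens may come next, and the suffix-side check decides whether the model is allowed to stop.

```python
# Toy grammar: balanced strings over "()" and "[]".
# All names below are hypothetical; this is not how SynCode is implemented.
PAIRS = {"(": ")", "[": "]"}
CLOSERS = set(PAIRS.values())


def open_stack(text):
    """Return the list of unclosed openers after reading `text`,
    or None if `text` can never be extended to a balanced string."""
    stack = []
    for ch in text:
        if ch in PAIRS:
            stack.append(ch)
        elif ch in CLOSERS:
            # A closer must match the most recent unclosed opener.
            if not stack or PAIRS[stack.pop()] != ch:
                return None
        else:
            return None
    return stack


def allowed_next(prefix, middle):
    """Prefix-side constraint: characters that keep prefix + middle viable."""
    stack = open_stack(prefix + middle)
    if stack is None:
        return set()
    allowed = set(PAIRS)  # any opener is always allowed
    if stack:
        allowed.add(PAIRS[stack[-1]])  # plus the closer of the innermost opener
    return allowed


def can_stop(prefix, middle, suffix):
    """Suffix-side constraint: stopping is valid only if the full string parses."""
    return open_stack(prefix + middle + suffix) == []
```

With the prefix `"(["` and suffix `")"`, `allowed_next("([", "")` includes `"]"`, whereas `allowed_next("", "")` only offers the grammar's opening characters, which mirrors the behavior reported above when the prefix is ignored. Likewise, `can_stop("([", "", ")")` is false because the `"["` is still open, while `can_stop("([", "]", ")")` is true. A real grammar would need an incremental parser in place of the bracket stack.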