rustformers / llmcord

A Discord bot, written in Rust, that generates responses using the LLaMA language model.
GNU General Public License v3.0

Add Embed message and increase the MESSAGE_CHUNK_SIZE to 4096 #20

Closed · pabl-o-ce closed this 1 year ago

pabl-o-ce commented 1 year ago

Hi,

Love the repo and your work.

I added embed messages in a fork that I made here, to increase the MESSAGE_CHUNK_SIZE to 4096.

I've already tested it and it works amazingly :)

Let me know if you'd accept a PR, or you can just take the idea of increasing the message size. (I'm currently learning Rust, so any feedback is also very welcome.)
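
For context, here's a rough sketch of the chunking idea (not the exact code from my fork; the constant and helper name are just illustrative):

```rust
/// Discord's embed description field allows up to 4096 characters,
/// versus 2000 for a plain message.
const MESSAGE_CHUNK_SIZE: usize = 4096;

/// Split `text` into pieces of at most MESSAGE_CHUNK_SIZE characters,
/// without splitting a UTF-8 character in half.
fn chunk_message(text: &str) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut count = 0;

    for ch in text.chars() {
        if count == MESSAGE_CHUNK_SIZE {
            chunks.push(std::mem::take(&mut current));
            count = 0;
        }
        current.push(ch);
        count += 1;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```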

philpax commented 1 year ago

Hi there!

I'm surprised that works - the Discord API's message length limit is 2000 characters by default (I picked 1500 as a generous buffer). Is your bot special in some way such that the limit is longer?

pabl-o-ce commented 1 year ago

Based on the docs, embed messages can be longer, up to 6000 characters in total, but that length is distributed across the components of the embed message.

For example, I'm putting everything in the description field, which allows 4096 characters; if the response exceeds that, I use your existing function to create another embed message.

Right now I put the stone head 🗿 in the title field because I like the icon, but it could just be something that says "Answer", like in the attached screenshot.

[Screenshot: Screen Shot 2023-07-16 at 7 36 20 PM]
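
Roughly, the sending side would look something like this (just a sketch assuming serenity's 0.11-style builder closures and the chunk_message helper from the sketch above, not the exact code from my fork):

```rust
use serenity::client::Context;
use serenity::model::channel::Message;

/// Sketch only: send the generated response as one embed per chunk,
/// using the 4096-character description field of each embed.
async fn reply_with_embeds(
    ctx: &Context,
    msg: &Message,
    response: &str,
) -> serenity::Result<()> {
    for chunk in chunk_message(response) {
        msg.channel_id
            .send_message(&ctx.http, |m| {
                // Put the whole chunk in the description; the title is
                // just decoration (🗿 or "Answer").
                m.embed(|e| e.title("🗿").description(chunk))
            })
            .await?;
    }
    Ok(())
}
```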

I'm ready to send the PR if you like it :)

philpax commented 1 year ago

Ah, I see - I didn't realise that you'd changed it to embeds, which do have larger length limits. I might add embeds as an option, but I don't want to replace traditional replies entirely yet, as I'd like to support conversations as well (#11 etc).

pabl-o-ce commented 1 year ago

No problem, it was only a suggestion.

Embeds can work for conversations too. The issue I've experienced with conversations is that after roughly six questions the answers start getting bad with GGML and GPTQ models. A friend told me that fp16 models are better at handling longer conversations [but they're expensive hardware-wise].

That's why I prefer sending just one question at a time. I was enjoying this idea 💯

Thanks for your amazing work on llm.

philpax commented 1 year ago

No worries, thanks for bringing this up! I've just created #21 to formalize this as a feature request, because it seems to be working quite well for you 🙂

Yeah, the conversational models can be quite finicky. The existence of Llama-2 and the context length extension methods (SuperHOT, etc) should hopefully help with generation quality in future, though - just need support in llm 🚀