tusharhero / aitelegrambot

aitelegrambot is a telegram bot which uses Ollama as its backend.
GNU General Public License v3.0

Streaming responses time out #3

Open tusharhero opened 2 months ago

tusharhero commented 2 months ago

Streaming responses can currently be enabled using the environment variable ENABLE_STREAMING_RESPONSE.

The issue is that streaming responses eventually lead to a timeout error from Telegram.

The current measure to counter this (sending an update every 2 seconds) is inadequate.

masalyuk commented 2 months ago

Ohh. It rarely reproduces with the default MESSAGE_CHUNK_SIZE=5. If I set MESSAGE_CHUNK_SIZE to 1, the issue reproduces more often; conversely, with 20 I don't see any timeouts at all.

So I want to check whether there is a buffer overload in the socket, or whether Telegram really does take that long to respond.

tusharhero commented 2 months ago

If setting it to 20 almost eliminates the timeouts, we might make 20 the default value of MESSAGE_CHUNK_SIZE.

tusharhero commented 2 months ago

Hey @masalyuk , any progress?

masalyuk commented 2 months ago

@tusharhero Sorry for the delay. I want to check something today to find the optimal MESSAGE_CHUNK_SIZE.

masalyuk commented 2 months ago

@tusharhero Even with MESSAGE_CHUNK_SIZE set to 10, I don't see any timeouts, even when I ask it to write 50 sentences.

Moreover, increasing this parameter decreases the probability of encountering a Flood control or Bad Message error. Therefore, let's change it to 20 to ensure we don't run into these issues.

It is probably a good idea to add a warning somewhere notifying users not to use small MESSAGE_CHUNK_SIZE values.

tusharhero commented 2 months ago

Hey @masalyuk,

> Even with MESSAGE_CHUNK_SIZE set to 10, I don't see any timeouts, even when I ask it to write 50 sentences.

Interesting. I wonder how you are testing it, because when I try, it often stops in the middle of inference due to these errors.

> Moreover, increasing this parameter decreases the probability of encountering a Flood control or Bad Message error. Therefore, let's change it to 20 to ensure we don't run into these issues.

I have also tried setting the value as high as 150, and the error still persists. Maybe the current value is not being read from the configuration file. Can you investigate this?

> It is probably a good idea to add a warning somewhere notifying users not to use small MESSAGE_CHUNK_SIZE values.

That sounds like a good idea. It should be mentioned in docs/setup.md.

tusharhero commented 2 months ago

I think we may be able to take some inspiration from this code from ruecat/ollama-telegram:

        # State for the streaming loop; the snippet assumes these start empty.
        full_response = ""
        sent_message = None
        last_sent_text = None

        async for response_data in generate(payload, modelname, prompt):
            msg = response_data.get("message")
            if msg is None:
                continue
            chunk = msg.get("content", "")
            full_response += chunk
            full_response_stripped = full_response.strip()

            # avoid Bad Request: message text is empty
            if full_response_stripped == "":
                continue

            # Update Telegram only at sentence boundaries, keeping the call
            # rate well below the flood-control limits.
            if "." in chunk or "\n" in chunk or "!" in chunk or "?" in chunk:
                if sent_message:
                    # Skip the edit when nothing changed; Telegram rejects
                    # edits with identical text ("message is not modified").
                    if last_sent_text != full_response_stripped:
                        await bot.edit_message_text(chat_id=message.chat.id, message_id=sent_message.message_id,
                                                    text=full_response_stripped)
                        last_sent_text = full_response_stripped
                else:
                    # First sentence: send a new message and remember it so
                    # later chunks can edit it in place.
                    sent_message = await bot.send_message(
                        chat_id=message.chat.id,
                        text=full_response_stripped,
                        reply_to_message_id=message.message_id,
                    )
                    last_sent_text = full_response_stripped