Gryphire / wisperbot

Telegram bot for Wisper

Confirm voicenotes with user #36

Closed lydgate closed 4 months ago

lydgate commented 7 months ago

At several stages we ideally want the user to send a single voicenote.

There could be situations where they don't send one and might need to be reminded.

There could also be situations where they want to send multiple voicenotes. To prevent several voicenotes from being sent in, we will send a question back to them once the first voicenote has been received.

"Are you happy with this recording? Can you confirm this is your ?"

If they indicate no, we might need a sub-menu with the option to re-record or append. If they append, we could use ffmpeg to concatenate the recordings and end up with one?
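A rough sketch of what the ffmpeg append step could look like, using ffmpeg's concat demuxer. The function names write_concat_list and build_concat_command and all file paths are made up for illustration, and it assumes all recordings share the same codec and parameters so they can be joined with stream copy instead of re-encoding.

```python
from pathlib import Path


def write_concat_list(voicenotes, list_path):
    """Write an ffmpeg concat-demuxer list file: one "file '<path>'" line per input."""
    lines = []
    for vn in voicenotes:
        # Inside single quotes, a literal ' has to be written as '\''
        escaped = str(vn).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


def build_concat_command(list_path, output_path):
    """Build (but do not run) the ffmpeg command that joins the listed files."""
    return [
        "ffmpeg", "-y",        # overwrite the output file if it already exists
        "-f", "concat",        # use the concat demuxer
        "-safe", "0",          # accept absolute or otherwise "unsafe" paths
        "-i", str(list_path),  # the list file written above
        "-c", "copy",          # join without re-encoding
        str(output_path),
    ]
```

The command could then be run with subprocess.run(build_concat_command(list_path, output_path), check=True), provided ffmpeg is installed on the host.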

lydgate commented 5 months ago

Estimate 1 day

lydgate commented 4 months ago

Spent another hour on this. I cannot think of any way to do this without completely refactoring the bot. The issue is that we currently handle receiving a voicenote "inside" a function which corresponds to a state, for example tut_story1, which corresponds to state TUT_STORY1.

I can't think of any way to handle a loop except at the level of ConversationHandler. Each function like tut_story1 would need to enter another conversation state where it awaits voicenotes, keep returning to that state until the user is finished, and then run some other code to combine the voicenotes into a single voicenote. This would mean doubling the number of states and adding a bunch of quite redundant code to make it loop before proceeding.

Possibly it could be made slightly less onerous by centralizing the function that collects voicenotes, but then we would need a manual way for the bot to figure out what "state" it is in (currently there are more "statuses" than ConversationHandler "states," and there is no way to look up what "state" we are in at the moment). So code would need to be added to every state so that it knows what the "next" state is and can return that value instead of, say, the RECEIVING_VOICENOTES return code.
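To make the centralized idea concrete, here is a minimal sketch in which each state records its "next" state before handing over to a shared collecting state. The plain dict stands in for context.user_data, and begin_collecting, collect_voicenote, the dict keys, and the numeric state codes are all hypothetical; only RECEIVING_VOICENOTES, TUT_STORY1 and TUT_STORY2 are names from this thread.

```python
# Hypothetical state codes; the real bot defines its own ConversationHandler states.
TUT_STORY1, TUT_STORY2, RECEIVING_VOICENOTES = range(3)


def begin_collecting(user_data, next_state):
    """Called from a state function: remember where to exit to, then enter the shared state."""
    user_data["next_state"] = next_state
    user_data["voicenotes"] = []
    return RECEIVING_VOICENOTES


def collect_voicenote(user_data, file_id=None, done=False):
    """Body of the shared handler: loop until the user is done, then exit."""
    if file_id is not None:
        user_data["voicenotes"].append(file_id)
    if done:
        # Leave the loop and continue with whatever state was recorded on entry
        return user_data.pop("next_state")
    return RECEIVING_VOICENOTES
```

The brittleness shows up here too: if next_state is ever missing when the user finishes, the pop raises a KeyError, so every entry into the shared state has to go through begin_collecting.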

Either option will make the bot more brittle, because voicenotes can then only be received if the bot is in the correct "state," and the bot will need to exit that state correctly at the right time. It will also possibly mean manually indicating which functions correspond to which states (which is ugly but possible).

This is a direct consequence (one I warned about) of handling the message transfer outside of a group. It means that we're effectively re-implementing "plumbing" features of Telegram itself: they have already built a way of sending/deleting messages, and we are rebuilding it in a more complicated way.

lydgate commented 4 months ago

I guess a final hacky option would be calling "get_voicenote" with the return code value of the current state (and then inside get_voicenote it could calculate what the next state would be). This would mean rewriting every line that calls get_voicenote to handle the return code but it might be the simplest (if fairly manual) way to allow the bot to decide what to return as it loops?
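A sketch of how that calculation might work, assuming the states can be put in a fixed order. STATE_ORDER, TUT_STORY3, END and state_after_voicenote are illustrative names, not taken from the bot; END is just a stand-in for ConversationHandler.END.

```python
# Hypothetical ordering of states; the real bot defines its own state constants.
STATE_ORDER = ["TUT_STORY1", "TUT_STORY2", "TUT_STORY3"]
END = -1  # stand-in for ConversationHandler.END


def state_after_voicenote(current_state, finished):
    """Return the state to hand back to ConversationHandler after a voicenote."""
    if not finished:
        return current_state  # keep looping in the same state
    position = STATE_ORDER.index(current_state)
    if position + 1 < len(STATE_ORDER):
        return STATE_ORDER[position + 1]
    return END  # nothing left after the last state
```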

lydgate commented 4 months ago

Here's how the third option (which is complicated but maybe not as complicated as the other options) would look... We'd need to figure out how/where to track the list of voicenotes we're receiving and also how to concatenate them and track the concatenated version in the database. I'd estimate it would take a few days to do.

Current code:

async def tut_story1(update, context):
    '''The user is supposed to send a response at this point; if not, ask them to'''
    chat = await initialize_chat_handler(update, context)
    if update.message.text:
        await chat.send_msg("Please send a voicenote response 😊")
    else:
        await get_voicenote(update, context, TUT_STORY2)
        chat.status = f'tut_story{chat.week}responded'
        await chat.send_msg(f"Here's the second tutorial story for you to listen to, from someone else:")
        await chat.send_vn(VN=f'tutorialstories/{tutorial_files[1]}')
        await chat.send_msg(f"""Again, have a think about which values seem embedded in this person's story. When you're ready to record your response, go ahead!""")
        chat.status = f'tut_story2received'
        return TUT_STORY2

Looping code (would need to be done every time we call get_voicenote):

async def tut_story1(update, context):
    '''The user is supposed to send a response at this point; if not, ask them to'''
    chat = await initialize_chat_handler(update, context)
    if update.message.text and update.message.text != '/done':
        await chat.send_msg("Please send a voicenote response 😊")
    elif update.message.voice:
        await get_voicenote(update, context, TUT_STORY2)
        # TODO: add the voicenote to a list of some sort
        # No return here, so ConversationHandler stays in the current state
    elif update.message.text == '/done':
        # TODO: combine the collected voicenotes into a single voicenote
        chat.status = f'tut_story{chat.week}responded'
        await chat.send_msg("Here's the second tutorial story for you to listen to, from someone else:")
        await chat.send_vn(VN=f'tutorialstories/{tutorial_files[1]}')
        await chat.send_msg("Again, have a think about which values seem embedded in this person's story. When you're ready to record your response, go ahead!")
        chat.status = 'tut_story2received'
        return TUT_STORY2
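
The two placeholders in the looping code (tracking the list, and deciding when the user is finished) might be filled in along these lines. This is only a sketch: handle_reply and the pending_voicenotes key are made-up names, the plain dict stands in for context.user_data, and it assumes that /done sent before any voicenote should just trigger the reminder.

```python
def handle_reply(user_data, text=None, voice_file_id=None):
    """Classify one incoming message and update the list of pending voicenotes.

    Returns (action, voicenotes), where action is 'remind', 'collected' or 'done'.
    """
    pending = user_data.setdefault("pending_voicenotes", [])
    if voice_file_id is not None:
        pending.append(voice_file_id)
        return "collected", []
    if text == "/done" and pending:
        finished = list(pending)  # hand back a copy, then reset for the next state
        pending.clear()
        return "done", finished
    # Plain text, or /done with nothing recorded yet
    return "remind", []
```

On 'collected' the state function would return nothing so the conversation stays where it is; on 'done' it would process the returned voicenotes and then return the next state.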

lydgate commented 4 months ago

The three options are:

  1. Handle looping by separating each ConversationHandler state into two stages, one of which loops and waits for voicenotes and the second which preps the next state.
  2. Handle looping by creating a new ConversationHandler state which only handles voicenotes, then enter that state and have it loop but exit into the correct "next" state.
  3. Handle looping by letting each state decide when it is done, by the user manually running a "/done" command which is handled within each state.

Option 1 is the most "explicit" but very "verbose." It will make the code much longer but could be worth it just to be super clear about where each thing is happening.

Option 2 is probably the most "concise" but also quite "abstract." It would require manual tracking of exactly where we are in the conversation.

Option 3 is a bit hacky but simple and probably the fastest to implement. It does require rewriting every call to get_voicenote (option 1 also requires this).

Gryphire commented 4 months ago

Hmm, okay. Which of the options do you think we should go with, given that we're trying to balance both stability and time? If you want to call about this, let me know.

As an aside, the multiple voice notes would not necessarily need to be concatenated into one as far as I'm concerned; it's fine if people just receive two separate voice notes.

lydgate commented 4 months ago

Test version pushed to this branch, which goes with option 3 as it was the simplest.

It took some time as well to refactor the code to send multiple voicenotes. This is working for the intros, but the same code doesn't seem to work for the later stages, so the refactor continues.

lydgate commented 4 months ago

Believe this should work now for the later steps as well; test version in branch.

lydgate commented 4 months ago

Now only allows users to send 1 or max 2 voicenotes; if they send 2, it proceeds automatically without them having to run /done.
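
The rule described above can be written down as a small decision function. This is an illustration rather than the code in the branch: next_action and MAX_VOICENOTES are assumed names, and treating /done with no voicenote as "wait" is an assumption.

```python
MAX_VOICENOTES = 2  # the limit described in the comment above


def next_action(received_count, done_requested=False):
    """Decide whether to move on, given how many voicenotes have arrived so far."""
    if received_count >= MAX_VOICENOTES:
        return "proceed"  # second voicenote: continue without waiting for /done
    if received_count >= 1 and done_requested:
        return "proceed"  # one voicenote, and the user ran /done
    return "wait"         # still waiting for a voicenote or for /done
```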