nekomeowww / insights-bot

A bot works with OpenAI GPT models to provide insights for your info flows.
MIT License
248 stars 24 forks source link

bug: empty username picked when first name and last name have more than 10 chars #50

Closed nekomeowww closed 1 year ago

nekomeowww commented 1 year ago

fix pkg/tgbot/utils.go

rafiramadhana commented 1 year ago

@nekomeowww Is this fix related to FullNameFromFirstAndLastName?

https://github.com/nekomeowww/insights-bot/blob/53ddb314179847a7d8e83159025fd464ec75dde3/pkg/bots/tgbot/utils.go#L64

In addition, would you mind to provide an example of input (i.e. the first name and last name with >10 chars) related to this case?

Lastly, what are the expectation? Trimming the output?

Thanks!

nekomeowww commented 1 year ago

Hey, welcome to this repo and contribute!

@nekomeowww Is this fix related to FullNameFromFirstAndLastName?

https://github.com/nekomeowww/insights-bot/blob/53ddb314179847a7d8e83159025fd464ec75dde3/pkg/bots/tgbot/utils.go#L64

I checked the source code and find out I miss stated the package that needs to be fixed. The correct one should be: https://github.com/nekomeowww/insights-bot/blob/53ddb314179847a7d8e83159025fd464ec75dde3/internal/models/chathistories/chat_histories.go#L159

In addition, would you mind to provide an example of input (i.e. the first name and last name with >10 chars) related to this case?

The related edge cases should be the following:

Case One

Over 10 chars of first_name but empty last name and empty username.

{
  "first_name": "blah blah | SomeSubName :> | 😄",
  "last_name": "",
  "username": ""
}

Case Two

Over 10 chars of first_name but with only empty last name.

{
  "first_name": "blah blah | SomeSubName :> | 😄",
  "last_name": "",
  "username": "example_username"
}

This function is used to match the very long name I have seen in Telegram initially, some of the people would use the blah blah | SomeSubName :> | 😄 as their first name (or with last name combined), usually they would have a shorter username that would save more tokens (which is the case 2). However, I encountered one people with simple Example Name first name but without username being set, so the function would simple return a non-exist username and break the context.

Lastly, what are the expectation? Trimming the output?

The answer to this question is: probably yes?

You might have to come up a idea about how to trim the very long first name in case 1 while still covering case 2.

rafiramadhana commented 1 year ago

@nekomeowww what if full name (first name + last name) is over 10 chars and username is longer than full name?

something like:

first_name: aaaaa
last_name: bbbbbc
username: aaaaabbbbbccccc

are we still returning the username?

usually they would have a shorter username that would save more tokens

because it looks like you prefer to return shorter string here

nekomeowww commented 1 year ago

@nekomeowww what if full name (first name + last name) is over 10 chars and username is longer than full name?

something like:

first_name: aaaaa
last_name: bbbbbc
username: aaaaabbbbbccccc

are we still returning the username?

usually they would have a shorter username that would save more tokens

because it looks like you prefer to return shorter string here

Yes, it is better to have the less one. Therefore it is not strictly to say we have to return username or fullname, it is ok to return the shorter one directly. If the returned one is still to long (such as 20 runes), we might have to trim the string off.

rafiramadhana commented 1 year ago

Therefore it is not strictly to say we have to return username or fullname

OK. I think it is clear enough.

rafiramadhana commented 1 year ago

@nekomeowww Would it be OK if we remove spaces from the fullname?

"my name has spaces" -> "mynamehasspaces"
nekomeowww commented 1 year ago

@nekomeowww Would it be OK if we remove spaces from the fullname?


"my name has spaces" -> "mynamehasspaces"

If you are talking about the space between first name and last name, pkg/bots/tgbot.FullNameFromFirstAndLastName should have done the jobs for you. If you are talking about the spaces within the full name, I think it is not ok, because the full name will eventually be displayed in messages.

nekomeowww commented 1 year ago

How about just return the full name if it exceeded the 10 char limits and username is empty for now?

There are planning features (#47, #48) to process the user info more elegantly. We can discover and come up with more better ideas to improve the processing logics for those issues.

rafiramadhana commented 1 year ago

@nekomeowww Sounds good to me. We can implement it that way.