pangeachat / client

Learn a language while texting your friends
https://krille-chan.github.io/fluffychat/
GNU Affero General Public License v3.0

Detected Language used when storing analytics data #385

Closed ggurdin closed 1 week ago

ggurdin commented 1 week ago

In the combined spaces branch, we ran into an issue where some students' analytics data was missing when teachers clicked through the different target languages in space analytics.

I think what happened is that the user had auto-igc turned off and sent messages without IGC, which caused language detection to run, and the originalSent rep's langCode for those messages was set to whichever language was detected (in this user's case, French).

With how I have analytics set up, after moving away from class-level language settings, the langCode of a given message (which is used to determine which analytics room the data is sent to) comes from the originalSent rep. If that's not available, it falls back to the user's current target language. The reason I set it up this way is that a user can send some messages in their target language, then switch their target language and send more messages, before sending new analytics events (so all the data would go to the analytics room for whichever target language the user has set at the time that the event is sent).
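As a rough illustration of the routing described above (a sketch only, written in Python for brevity rather than taken from the client code; `Message`, `original_sent_lang`, and `analytics_lang` are hypothetical names), the current lookup behaves roughly like this:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    # Hypothetical stand-in for a sent message; only the field needed here.
    # Holds the langCode of the originalSent rep, or None if there is no rep.
    original_sent_lang: Optional[str]


def analytics_lang(msg: Message, current_target_lang: str) -> str:
    """Pick the language whose analytics room receives this message's data."""
    if msg.original_sent_lang is not None:
        # The rep's langCode wins, even if it came from language detection.
        return msg.original_sent_lang
    # No rep available: fall back to the user's current target language.
    return current_target_lang


# A message sent without IGC and detected as French is routed to the French
# analytics room, even though the user's target language is Spanish.
detected_as_french = Message(original_sent_lang="fr")
no_rep = Message(original_sent_lang=None)
print(analytics_lang(detected_as_french, "es"))
print(analytics_lang(no_rep, "es"))
```

This is why data can end up in an analytics room for a language the teacher would not think to select.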

I can see a few potential solutions to this:

  1. One easy thing would be to send analytics events before the user changes their target language, so a situation like the one I described above wouldn't occur. Then all the collected messages could just be sent to the analytics room for the user's target language.
  2. Alternatively, for a fix on the sending end of the problem, the system could use the originalSent rep's langCode only if choreo is not null (a null choreo record would mean that the langCode came from the language detection endpoint), and otherwise fall back to the user's target language.
  3. Or, for a fix on the receiving end of the problem, the language dropdown in space analytics could pull its values from students' analytics rooms, rather than just using the list of available target languages.
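For comparison, option 2 could look something like the following. This is a minimal Python sketch under assumed names (`OriginalSent`, `choreo`, `analytics_lang_with_choreo_check` are all hypothetical), not the client's actual data model:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OriginalSent:
    # Hypothetical shape of the originalSent rep.
    lang_code: str
    # None is assumed to mean the lang_code came from language detection.
    choreo: Optional[dict]


def analytics_lang_with_choreo_check(
    rep: Optional[OriginalSent], target_lang: str
) -> str:
    """Trust the rep's langCode only when a choreo record backs it."""
    if rep is not None and rep.choreo is not None:
        return rep.lang_code
    # Missing rep, or langCode produced by detection: use the target language.
    return target_lang


detected = OriginalSent(lang_code="fr", choreo=None)
with_igc = OriginalSent(lang_code="fr", choreo={"steps": []})
print(analytics_lang_with_choreo_check(detected, "es"))
print(analytics_lang_with_choreo_check(with_igc, "es"))
```

With this check, a message detected as French would be counted under the user's target language instead.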

@wcjord any thoughts on this? Let me know if this explanation makes sense. It doesn't seem like the kind of thing that would happen very often, but maybe it is.

ggurdin commented 1 week ago

One benefit of pulling the values for that language dropdown in space analytics from students' analytics rooms is that it wouldn't show languages without data attached to them. Right now, most spaces won't have any Portuguese data, but it's still a choice in the dropdown menu.

ggurdin commented 1 week ago

Go with option 3, and default to the option with the highest number of analytics rooms attached to it.
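A minimal sketch of that decision, in Python with hypothetical names (`dropdown_languages`, and a mapping from student ID to the language codes of that student's analytics rooms). Breaking ties alphabetically is an assumption added here, not something decided in this thread:

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def dropdown_languages(
    student_rooms: Dict[str, List[str]],
) -> Tuple[List[str], Optional[str]]:
    """Return (dropdown options, default option) for space analytics.

    student_rooms maps a student ID to the language codes of that
    student's analytics rooms (hypothetical input shape).
    """
    # Count each language once per student who has a room for it.
    counts = Counter(
        lang for langs in student_rooms.values() for lang in set(langs)
    )
    # Most rooms first; ties broken alphabetically (an assumption).
    options = sorted(counts, key=lambda lang: (-counts[lang], lang))
    default = options[0] if options else None
    return options, default


rooms = {"alice": ["es", "fr"], "bob": ["es"], "carol": ["es"]}
options, default = dropdown_languages(rooms)
print(options, default)
```

Languages with no analytics rooms (Portuguese, in the earlier example) never appear in the options.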