scribe-org / Scribe-iOS

iOS app with keyboards for language learners
https://apps.apple.com/app/scribe-language-keyboards/id1596613886
GNU General Public License v3.0

Switch gender annotation over to reference separate lexemes in a loop #401

Open andrewtavis opened 7 months ago

andrewtavis commented 7 months ago

Description

Scribe will be switching its data process over to be based more directly on one lexeme per data entry. At this time we combine lexemes based on their individual strings, so in German the word Schild means both sign and shield, but is one entry for us. In order to simplify the data formatting process, we'll need to remove this combining step, which further means that the way we store genders will be different.

The current way is that if a string has multiple genders, then we store each of them separated by slashes, so F/M/N/C/PL and all the variants. We'll soon have a situation where we have one entry for every lexeme and its plural. What this means is that rather than checking to see if the gender string has a slash in it and then splitting it, we'll need to get the gender, check to see if the string/lexeme occurs more than once, and then append the genders from those other entries.
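
To make the change concrete, here's a rough sketch of the two lookups side by side. It's written in Python only to keep it short and runnable (the app-side version would live in the Swift code), and everything in it is an assumption for illustration: the entry shape, the field names `lexeme_id`, `noun` and `gender`, the placeholder IDs, and both helper functions. None of it is the actual Scribe-Data output or the current Scribe-iOS code.

```python
# Hypothetical per-lexeme entries: one record per lexeme instead of one per string.
# Field names and IDs are placeholders, not the real Scribe-Data schema.
entries = [
    {"lexeme_id": "L-sign", "noun": "Schild", "gender": "N"},    # das Schild (sign)
    {"lexeme_id": "L-shield", "noun": "Schild", "gender": "M"},  # der Schild (shield)
    {"lexeme_id": "L-book", "noun": "Buch", "gender": "N"},      # das Buch (book)
]


def legacy_genders(gender_form):
    """Current approach: one entry per string, genders joined by slashes."""
    return gender_form.split("/")


def genders_for(word, entries):
    """Proposed approach: loop over all lexeme entries and collect each
    distinct gender whose string matches the word, in order of appearance."""
    genders = []
    for entry in entries:
        if entry["noun"] == word and entry["gender"] not in genders:
            genders.append(entry["gender"])
    return genders


print(legacy_genders("N/M"))           # old combined-string format
print(genders_for("Schild", entries))  # new per-lexeme format
```

In this sketch both calls give the same list of genders for Schild, so the annotation shown to the user wouldn't need to change, only the way the genders are gathered. A linear scan is used just for clarity; a real implementation would likely query or index by the string instead.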

Contribution

Happy to discuss the work for this and help with implementation or work on it myself at some point!

andrewtavis commented 3 months ago

@Jag-Marcel, @henrikth93, @wkyoshida, I'm realizing that this is another one that needs to be worked on this summer. We'll be getting the current stages out without the switch of the formatting processes in https://github.com/scribe-org/Scribe-Data/issues/142, but once that is done the data updates won't match the current way the data is accessed in the iOS app. We'll need to check the new outputs and do a quick investigation into what needs to change, and then those changes can be mapped out here and worked on so that this can be released with the switch base translation language functionality (3.2) :)