Open dlsf opened 9 months ago
Pretty cool! I've felt the same way but never knew how to put it into words – regardless of the language 😁
Just realized that this approach does not work with Katakana very well. If the question contains Katakana and you enter Hiragana, it would reject the answer. I will look into it later.
Alright, I found a solution, even if it isn't as pretty as before. I took the easy route and converted all Katakana in both the answer and subject characters to Hiragana so they can be compared.
This requires the introduction of a new utility method, since the app currently only has a method for converting Hiragana to Katakana. It does the same, except in reverse and also for half-width Katakana. I tested it for every single Katakana character; it even works with half-width Katakana, odd Katakana combinations, as well as special characters such as ー (even if the latter is a bit hacky, see screenshot below).
It works in all cases, but in most cases, it even works as you would expect:
I have been doing the majority of my Wanikani reviews with this version of Smouldering Durtles for the past week, and would consider it well-tested at this point
Alright, I found another edge case, which made me go through the entirety of Wanikani's 6000 vocabulary to find more. This should be all of them.
This is good work and well done for going through and doing all that work, however. I'm reticent to merge this without an off-by-default config setting created for it.
My reasoning being:
1) This differs from the Wanikani browser behaviour, and I don't want to present a default user experience that differs from Wanikani's handling to that large of a degree 2) Personally, I always use single character entry as a 'don't know' and so I don't want to remove that ability
Alright, I found another edge case, which made me go through the entirety of Wanikani's 6000 vocabulary to find more. This should be all of them.
Just in reference to this, how resilient is your method to future vocab additions by Tofugu?
I'm reticent to merge this without an off-by-default config setting created for it.
Sure, I can look into that. In what category should it be placed, and what should the default value be?
Personally, I always use single character entry as a 'don't know' and so I don't want to remove that ability
It's the same for me, but I adjusted pretty quickly. I understand the concern though, a config option would be best.
Just in reference to this, how resilient is your method to future vocab additions by Tofugu?
I honestly can't provide a definite answer to this question, as I'm simply lacking knowledge about the Japanese language and am only level 22 on Wanikani. Considering that it works with all 6000+ vocabulary currently on Wanikani, the chances of it failing seem very slim as, to my knowledge, the number of characters that have an unusual reading are very low. 々 is considered a Kanji, ヶ being read as か in counters has been accounted for now (it only breaks if it is at the end or beginning of a word, which I should probably fix, although Wanikani currently has no such vocabulary), and 〜 as a special counter prefix in Wanikani is ignored. Unless there is a similarly special Japanese character I don't know about or Wanikani introduces another special character like 〜, I doubt that this will break.
Alright, I had to make one final adjustment. It affected exactly one vocabulary on Wanikani (2011年), and I'm very lucky that I got the review today.
Allow me to explain how I missed this. I took the data from https://www.wkstats.com/items/wanikani, which allowed me to easily copy all Wanikani vocabulary separated by a comma for testing purposes. However, since I had to remove the text between the actual vocabulary items for each level, I forgot to put a comma between the first and last vocabulary items of the levels. 2011年 just so happened to be the first item of level 4, which interfered with my test. Since the word uses Japanese numerals, it expected Japanese numerals in the input as opposed to Hiragana.
I now fixed the error in the code & test, once more went through every single character manually and programmatically to confirm that only Regex (.+
) and Hiragana remain for the check, and am now absolutely confident that this logic covers everything that's currently on Wanikani. I should have checked it programmatically from the beginning. My last remark regarding how future-proof this approach still holds true. Sorry for this mistake; the circumstances were quite unfortunate.
Hi @dlsf just to note that I'm thankful for the work you've done on this - I'm happy with the behaviour you've put in, but as previously mentioned I'd like to make it an advanced setting and off by default. I'm going to leave the PR open for the time being, just until I'm in a position to work on it and add the setting in.
@jerryhcooke Thank you for the update! I was about to add a setting myself and let you review the change, but this also works. I'm still using this feature for my personal reviews, and there haven't been any issues in the past ~10 weeks (as expected).
Description This is just a personal gripe of mine with Wanikani, but I believe that the app shouldn't accept answers if a user accidentally mistyped something that was already provided in the question.
For example: 夏休み should accept なつやすみ and fail on なつやつみ, but なつやすむ (and everything else not ending in み) should give the user the chance to retry since they clearly just misspelled the last character that has been provided by Wanikani already.
For simplicity's sake, I didn't add a config setting for this behavior, but I can modify the PR if requested. If it's not something that you believe should be added to the app (since it's not a website feature), then that's perfectly fine, but I think it would be a positive addition that could help make reviews less frustrating.
Implementation My implementation makes use of Regex, but the speed shouldn't be an issue here — in fact, you could argue that the for loop just above this addition could be replaced by
currentAnswer.matches("(?:\\p{Script=Katakana}|\\p{Script=Hiragana})+")
as well for better readability. However, that's beyond the scope of this PR.We take the word that is being asked for (Kanji + Kana) and replace all Kanji greedily with
.+
. If we match the user's answer to this new regex string, we can effectively check if everything that's not part of the Kanji's reading is still present in the expected order. What the user replaced the Kanji with is not relevant to us as long as it's not nothing.Update (more info in the comments): Since the question can contain Katakana that users answer with Hiragana and vice versa, it is necessary to convert all Katakana to Hiragana first. Their ASCII codes are fortunately aligned, so this is trivial.
Drawbacks This change prevents users from submitting a pointless answer to fail the question if the vocabulary question contains Kana. However, since there already is a “Don't know” button to submit an empty answer, this shouldn't be a big deal.
Moreover, it might confuse users who are not familiar with this feature.