Closed cocowash closed 1 year ago
Can you share the card/deck? I also study German and haven't had that issue. Maybe the card is using some special type of question mark.
Sure, you can find it at https://anonfiles.me/qaYl/s01e01.apkg . I would bet it's because of the combination of a punctuation mark + a line break, but feel free to try it yourself.
I figured out what the problem is. It doesn't have to do with line breaks. It has to do with this regex: "\b[^\s\d]+\b". It fails to split the text when a non-letter character sits between two words with no space, so it matches Das?Hat and the?locomotive as single words. The solution I want to add to Morphman would be to use \p{L}, but the re module in Python doesn't support it. The regex module does support it, but I can't figure out how to import it.
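For anyone following along, here is a minimal sketch of the behavior described above. It assumes the card text reaches the regex with the two sentences already joined without whitespace; the helper name old_tokens and the sample strings are made up for illustration and are not Morphman code.

```python
import re

# The pattern quoted in this thread: one or more characters that are
# neither whitespace nor digits, anchored at word boundaries.
OLD_PATTERN = r"\b[^\s\d]+\b"


def old_tokens(expression):
    """Hypothetical helper: lower-cased matches of the old pattern."""
    return [w.lower() for w in re.findall(OLD_PATTERN, expression, re.UNICODE)]


# No space after the punctuation mark: both words end up in one match.
print(old_tokens("unten.Und warum"))   # ['unten.und', 'warum']
print(old_tokens("runter?Weil es"))    # ['runter?weil', 'es']

# With a space after the punctuation mark the words are split as expected.
print(old_tokens("unten. Und warum"))  # ['unten', 'und', 'warum']
```

The class [^\s\d] accepts punctuation, so anything without whitespace in between is swallowed into a single match.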
@cocowash a solution for your problem is to go to the card browser and press CTRL+ALT+F. A find and replace dialogue will pop up. Type in the find section ? and in the replace section add ?
Now that I think about it, since it's easily solvable by just using find and replace, maybe we shouldn't change the regex, because there are cases where a - is put between two words.
Type in the find section ? and in the replace section add ? followed by a space (press the space key, don't type it out)
That makes no sense. Use Markdown backticks.
Thanks for the support. Sadly, find and replace won't work in all cases. In some cases the problem is solved; in others the word is added with the punctuation mark and the space.
I am going to talk to the maintainer of the new addon about whether we can import the regex module.
BTW, where did you find Inuyasha in German? I find it hard to find 90s anime with a German dub.
Thanks for the support. Sure, if you provide an email or an account where I could send a private message, I could tell you a little more about Inuyasha.
mariothrowsfieball@gmail.com
@cocowash the problem has been fixed. To apply the fix, go to your Morphman folder, open the file morphemizer.py, and change this line "word.lower() for word in re.findall(r"\b[^\s\d]+\b", expression, re.UNICODE)" to "word.lower() for word in re.findall(r"\w+", expression, re.UNICODE)"
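A rough sketch of what the replacement pattern changes, in case it helps others check their own cards. The function names new_tokens and letter_tokens and the sample strings are hypothetical. The second pattern is only an illustration of a roughly letters-only match using the standard re module (close in spirit to the \p{L} idea mentioned earlier), not something the addon is known to use.

```python
import re


def new_tokens(expression):
    """Hypothetical helper: lower-cased matches of the replacement pattern."""
    return [w.lower() for w in re.findall(r"\w+", expression, re.UNICODE)]


def letter_tokens(expression):
    """Illustrative alternative: word characters minus digits and underscore."""
    return [w.lower() for w in re.findall(r"[^\W\d_]+", expression, re.UNICODE)]


# Punctuation no longer glues two words together.
print(new_tokens("nicht runter?Weil es"))  # ['nicht', 'runter', 'weil', 'es']

# Trade-off: hyphenated words are split, and digits count as words.
print(new_tokens("E-Mail 2"))              # ['e', 'mail', '2']

# The letters-only variant drops the digit but still splits at the hyphen.
print(letter_tokens("E-Mail 2"))           # ['e', 'mail']
```

Note that \w+ splits at every punctuation mark, including the - case raised earlier in the thread, so hyphenated words come out as separate morphs.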
Thanks, I tried the line replacement and it works fine.
Describe the bug
Punctuation marks are taken as part of the morph and agglutinate multiple words into one morph. It seems to happen only on the last word of a sentence when it is followed by a line break, e.g.:
Ich denke, er ist da unten.
Und warum gehst du dann nicht runter?
Weil es mir hier unheimlich ist.
Morphs formed: unten.und, runter?weil
Expected behavior
Punctuation marks shouldn't agglutinate two different words into one morph. Morphman shouldn't identify two words as one morph if a line break is used instead of a space.
Screenshots
Environment
Anki version: 23.10
Morphman Qt 6 Alpha 4