Closed fenhl closed 6 years ago
There are different normalization rules for foreign names (where CJK character search should work as expected) and for everything else (where normalization is aggressive, because there are no CJK characters on any Magic cards).
I'm surprised Mu Yanling seems to break the rules, and I sort of wonder whether Gatherer will correct the issue by giving an official English spelling anyway.
https://mtg.wtf/artist/_ is a separate issue which the indexer should fix. I vaguely recall reporting it as an mtgjson bug ages ago.
Fixed the "??? drew 5 cards." issue at least. Let's wait for Mu Yanling before we address the rest, as right now nothing in the database cares either way. I'll probably remove CJK stripping just as you propose.
I speculatively did this https://github.com/taw/magic-search-engine/commit/6035924da1179c057548c72e0b727fa5e621ab14
Does it fix the problem?
You'll want to update the regex in setup_artists! as well.
Does it work now?
It does!
Nice.
My fork has an issue (fenhl/lore-seeker#1) with a card illustrated by 酩憲: it is grouped together with the 5 cards for which MTG JSON has no artist info. Mu Yanling's artist's name is also written in Hanzi and will have the same issue. Of course, this would be less of a problem if MTG JSON had complete artist info, but even then I still think it would be appropriate to not “normalize” CJK characters into a single underscore.
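To illustrate the grouping problem, here is a minimal sketch of the difference between an ASCII-only slug and one that keeps CJK characters. Both `ascii_only_slug` and `artist_slug` are hypothetical helpers written for this comment, not the actual code in magic-search-engine or in `setup_artists!`, and the exact character classes are an assumption.

```ruby
# Hypothetical helper: aggressive ASCII-only normalization.
# Any run of characters outside a-z / 0-9 becomes one underscore,
# so a name written entirely in Hanzi collapses to "_".
def ascii_only_slug(name)
  name.downcase.gsub(/[^a-z0-9]+/, "_")
end

# Hypothetical helper: same idea, but CJK scripts are kept as-is.
def artist_slug(name)
  name
    .unicode_normalize(:nfd)        # split accented Latin letters into base + mark
    .gsub(/[\u0300-\u036f]/, "")    # drop only the Latin combining diacritics
    .downcase
    .gsub(/[^a-z0-9\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}]+/, "_")
    .gsub(/\A_+|_+\z/, "")          # trim leading/trailing underscores
    .unicode_normalize(:nfc)        # recompose anything NFD split apart
end

puts ascii_only_slug("酩憲")        # "_"
puts artist_slug("酩憲")            # "酩憲"
puts artist_slug("Rebecca Guay")    # "rebecca_guay"
```

With the first helper, 酩憲 ends up with the same slug as any other name that has no ASCII letters or digits, which is how it gets lumped in with unrelated entries under `_`. The second keeps such names distinct while still normalizing Latin names aggressively.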