FooSoft / zero-epwing

Sane data exporter for an insane dictionary format.
https://foosoft.net/projects/zero-epwing/
MIT License

An entry from Kenkyusha WAEI 5 is cut, maybe related to the entry size #2

nonodesu closed 6 years ago

nonodesu commented 6 years ago

The dictionary in question: 研究社 新和英大辞典 第5版
The entry: する6 (do; perform; ...)
When it's parsed by zero-epwing, it gets cut at

することなすこと皆うまく行かなかった. Everything I did went wrong. | I fail

while the entry in EBWin continues:

することなすこと皆うまく行かなかった. Everything I did went wrong. | I failed in every attempt.
…にすれば, …にしたら, …にしてみれば, …にしたって 〔…にとっては〕 from one's point of view; as far as one is concerned; to…; for….
君にすればただのペンかもしれないが, 僕には貴重なものなんだ. 返してくれよ. It may be just a pen to you, but to me it's a really precious thing. I want you to give it back.
・タヌキにしたっていい迷惑だよな. そこは自分たちのすみ家だったんだから. I'm afraid we've caused the raccoon dogs a lot of trouble. After all, it was their home.

The problem shows up as a truncated entry when the dictionary is imported using yomichan-import. I also know someone who is making a Discord bot using zero-epwing and has the same issue, which is where we actually discovered it.

I assume zero-epwing just can't handle entries this big, but maybe there's some other issue.

I don't know if there are more entries like this; this is just one example where zero-epwing breaks.

nonodesu commented 6 years ago

More info:

oh I think I managed to find a way to detect this error
in the resulting json, normal entries end with \n where anomalies end with something else
and I've found tons of other examples
上がる, 上げる, 味, ...
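
The check described above (normal entries end with \n, truncated ones do not) can be sketched in a few lines of Python. This is only an illustration: the key names "subbooks", "entries", "heading" and "text" are assumptions about the layout of the zero-epwing JSON output, and the sample data is made up.

```python
import json

def find_truncated_entries(book, heading_key="heading", text_key="text"):
    """Return headings of entries whose text does not end with a newline.

    The key names are assumptions about the JSON layout; adjust them
    to match the actual output.
    """
    suspects = []
    for subbook in book.get("subbooks", []):
        for entry in subbook.get("entries", []):
            text = entry.get(text_key, "")
            # A non-empty text without a trailing newline is a likely cut.
            if text and not text.endswith("\n"):
                suspects.append(entry.get(heading_key, ""))
    return suspects

# Hypothetical sample data standing in for real zero-epwing output.
sample = json.loads("""
{
  "subbooks": [
    {
      "entries": [
        {"heading": "complete", "text": "a complete entry\\n"},
        {"heading": "truncated", "text": "Everything I did went wrong. | I fail"}
      ]
    }
  ]
}
""")

suspects = find_truncated_entries(sample)
print(suspects)
```

For a real dump, load the file with json.load and pass the result to find_truncated_entries in place of the sample.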

FooSoft commented 6 years ago

I see what this is... I was making the assumption that 10KB would be enough for any definition... the glossary for する proved me wrong. Should be a simple fix.
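
The failure mode described here, a fixed size limit silently cutting long input, can be illustrated with a minimal Python sketch. It is not the zero-epwing code and not the actual fix, neither of which is shown in this thread; the function names and the 25KB sample are hypothetical, and only the 10KB figure comes from the comment above.

```python
import io

FIXED_LIMIT = 10 * 1024  # the 10KB assumption mentioned above

def read_fixed(stream, limit=FIXED_LIMIT):
    """Read at most `limit` bytes in one call; longer input is silently cut."""
    return stream.read(limit)

def read_all(stream, chunk_size=FIXED_LIMIT):
    """Read chunk by chunk until the stream is exhausted."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

# Hypothetical 25KB definition ending in a newline, like a normal entry.
definition = b"x" * (25 * 1024) + b"\n"

truncated = read_fixed(io.BytesIO(definition))
complete = read_all(io.BytesIO(definition))

print(len(truncated), truncated.endswith(b"\n"))
print(len(complete), complete.endswith(b"\n"))
```

The fixed-size read loses the trailing newline along with the rest of the data, which matches the symptom reported earlier in the thread.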

FooSoft commented 6 years ago

Binaries have been updated on foosoft.net

nonodesu commented 6 years ago

I've just tested, and yomichan-import still bundles the old zero-epwing binaries. (But if I manually replace the binaries, it works fine.)

FooSoft commented 6 years ago

Ah yeah, I need to update the version of zero-epwing bundled with yomichan-import!