BLKSerene / Wordless

An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
GNU General Public License v3.0

Frequent crashes for some operations (N-gram, etc) #21

Closed yanqianglu closed 1 year ago

yanqianglu commented 2 years ago

Hi,

I have a few documents containing a mix of Tibetan and Chinese text. Wordless crashes repeatedly on them, especially when running the n-gram and collocation extractors.

I'm not sure whether this is caused by the size of the corpus. According to the profiler, it has 10909 paragraphs, 172885 tokens, and 581964 characters. As far as I remember, I also tried small files, and the app crashed on those too.

I'm on macOS 12.3.1 with an M1 Pro chip. I also tried another MacBook with an Intel chip and had the same experience. Wordless version: 2.2.0.

The path to Wordless doesn't contain any non-ASCII characters, though the file names are in Tibetan and Chinese.

I do have the crash report that was generated by the system, but I'm not sure whether it's helpful. Please let me know what other information is needed for the investigation.

Thanks

BLKSerene commented 2 years ago

There is a bug in the macOS version, which should be fixed in the next version. In the meantime, you can use the Windows version.

BLKSerene commented 1 year ago

Fixed in 2.3.0. Please give it a try.