BLKSerene / Wordless

An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
GNU General Public License v3.0

unexpected breakdown #8

Closed Ciakhuen closed 5 years ago

Ciakhuen commented 5 years ago

Every time I finish importing, or almost finish, the software crashes. Please check!

BLKSerene commented 5 years ago

Hi, what version of Wordless are you using? How do you reproduce the problem? And could you please send me a sample text file by email?

Ciakhuen commented 5 years ago

Hi,

Thanks for your reply. I am using the latest version you posted on GitHub, which I downloaded via BaiduNetdisk. It should be version 1.1.0, I guess. By the way, this problem also happened when I first used version 1.0.0 of the tool. Every time I open the folder, the software crashes while importing. I am on macOS 10.14.3, and the data we use in this research consists of about 25,000 texts encoded in UTF-8. I tried more than ten times, and the software successfully imported all the data only twice. Even then, an unexpected problem occurred when I ran the “general” function to explore overall statistics for these texts, including types, tokens, TTR, etc. The interface showed a notice that two files are not encoded in UTF-8 and will not be included. Then the software crashed completely, and I could do nothing but force-quit it. Attached is part of the data I used.

Wordless is such a well-integrated tool for our corpus research, and we were very excited when I first learned about it from a friend’s message. We would be grateful if you could help us overcome this problem so that we can use it in our future research. I am also happy to help you and your team make this great software even better. Thanks, and I look forward to your help!

Jiekun Linguistic Lab, Northeastern University

BLKSerene commented 5 years ago

The “Open folders” problem should already have been fixed in 1.1.0. You can check the version number under "Help" -> "About Wordless".

If there's an error while processing one of the files that you have imported, the whole process will be aborted and no results will be shown. If two of your files have encoding problems, you can first change their encodings using a professional text editor such as Sublime Text or EmEditor. If you know the correct encoding but Wordless did not detect it, you can change it manually in the File Table. Since you have a large number of texts, you may also try adjusting "Settings -> Auto-detection -> Detection Settings" to make the detection of file encodings more accurate (although import may then take more time).
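For a corpus of this size, it may be easier to locate the problem files with a short script than by opening them one at a time. The following is a minimal sketch in plain Python that does not use any Wordless code: the function names `find_non_utf8_files` and `reencode_to_utf8` are hypothetical, the `*.txt` pattern is an assumption about the corpus, and the strict decode check is not the detection method Wordless itself uses.

```python
from pathlib import Path


def find_non_utf8_files(folder, pattern="*.txt"):
    """Return the paths under `folder` whose bytes do not decode as strict UTF-8."""
    bad = []
    for path in sorted(Path(folder).rglob(pattern)):
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path)
    return bad


def reencode_to_utf8(path, source_encoding):
    """Rewrite `path` in place as UTF-8.

    `source_encoding` must be the file's real encoding; this function
    does not try to guess it.
    """
    path = Path(path)
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
```

Calling `find_non_utf8_files("path/to/corpus")` lists the files to inspect. Re-encoding is only safe once the real source encoding is known, so it is worth keeping a backup of the originals before rewriting anything in place.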

Also, I can't see the attachment. Could you please check it again?

Ciakhuen commented 5 years ago

I've checked my version via "check update", and the result shows I am using the latest version, 1.1.0. I retried importing the folder, or all the files, with all auto-detection options deselected, and it still takes several minutes to process. Could you kindly tell me how long it usually takes to process 5,000 short texts of 300 English words each? I have attached the file again.

This screenshot shows a crash after 6173 files had been imported successfully.

Thanks for your reply and I am still looking forward to hearing from you.

BLKSerene commented 5 years ago

I haven't tested the case of a very large number of text files, but it shouldn't take too long if you disable all auto-detection settings, since the analysis of the text and the number crunching don't happen at import time. (You may try importing 50 or 100 files to see how long that takes and estimate the total time needed.) I'll need some time to optimize the speed of importing files. For now, if the speed is unbearably slow for you, you could consider merging the text files.
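Merging can be done outside Wordless. The following is a minimal sketch in plain Python, assuming the texts are UTF-8 `.txt` files; the function name `merge_text_files`, the batch size of 500, and the `merged_0001.txt` naming scheme are all hypothetical choices for illustration.

```python
from pathlib import Path


def merge_text_files(src_folder, dst_folder, files_per_batch=500, pattern="*.txt"):
    """Concatenate small UTF-8 text files into larger batch files.

    Returns the list of merged files written to `dst_folder`.
    """
    # Collect the source files before writing, so merged output is never re-read.
    paths = sorted(Path(src_folder).rglob(pattern))
    dst = Path(dst_folder)
    dst.mkdir(parents=True, exist_ok=True)

    merged = []
    for start in range(0, len(paths), files_per_batch):
        batch = paths[start:start + files_per_batch]
        out_path = dst / f"merged_{start // files_per_batch + 1:04d}.txt"
        with out_path.open("w", encoding="utf-8", newline="\n") as out:
            for path in batch:
                # Each source text ends with exactly one newline in the output.
                out.write(path.read_text(encoding="utf-8").rstrip("\n") + "\n")
        merged.append(out_path)
    return merged
```

With `files_per_batch=500`, about 25,000 texts would become about 50 files. Note that merging changes the unit of analysis: any per-file figures are then computed per merged file rather than per original text.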

I could not see either the screenshot or the attached file. Could you try replying to me on a PC (rather than by email)?

Ciakhuen commented 5 years ago

May I contact you by other means, if that’s OK?

BLKSerene commented 5 years ago

Email is okay (you can find my email address on the GitHub homepage or through "Help -> Need Help?" in Wordless).