Closed scottkleinman closed 9 years ago
What about Chinese and Old English? Just to see whether the problem is caused by encoding.
Old English worked fine (I did set the special characters rule set to DOE SGML). I tried the Modern Chinese collection, and it worked the first time, set to 1-grams by characters with proportional counts. I think I tried 1-grams by words next, and my browser hung. I next tried 1-grams by characters with raw counts. The same thing appeared to happen: I got the dreaded "Not responding" from Firefox. But I saw in the Windows Task Manager that the process was continuing to take up RAM, so I left it, and after a few minutes the table did render. So, at least for Chinese, encoding is not the issue. More likely the problem is the size of the corpus (the table had 174 pages). We'll have to see if there is any more optimisation that can be done to speed up the process. As for Middle English, I'll do some more testing. The main issue seems to be that, from the user's point of view, the browser appears to hang (the animation in the blue spinner icon stops as well).
One other weird thing is that in the raw counts, characters like 东 can occur "144.0" times. I'm not sure what the decimal point is for, since we're never going to find 东 occurring 144.5 times! :-)
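For what it's worth, here is a minimal sketch of how the display side could drop the ".0" on raw counts while leaving proportional counts as decimals. It assumes the counts arrive as floats; `format_count` and its parameters are hypothetical names for illustration, not anything in the current code base.

```python
def format_count(value: float, proportional: bool, places: int = 4) -> str:
    """Format one table cell for display.

    Hypothetical helper: raw counts are whole numbers stored as floats,
    so they are shown as integers; proportional counts keep decimals.
    """
    if proportional:
        # Fixed number of decimal places for proportions.
        return f"{value:.{places}f}"
    # Raw counts: round first to guard against float noise, then drop ".0".
    return str(int(round(value)))
```

With this, a raw count of 144.0 would be shown as "144", and proportional values would be unaffected.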
I don't think the floating-point values are that serious, though. We have a lot of known issues that we need to trap.
I have been able to recreate the browser hang. The back-end only takes 1 second to reply to the POST, and the browser takes half a second to receive the reply. All the rest of the time is spent applying JS and CSS. I think we can solve this by only showing 25 rows on the first page.
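A rough sketch of the "25 rows per page" idea, assuming the full table is available as a list of rows. `get_page` and `page_count` are hypothetical helpers, not existing functions; the point is only that the browser is handed one small slice at a time rather than the whole table.

```python
import math
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def page_count(total_rows: int, page_size: int = 25) -> int:
    """Number of pages needed to show total_rows rows."""
    return math.ceil(total_rows / page_size)


def get_page(rows: Sequence[T], page: int = 0, page_size: int = 25) -> List[T]:
    """Return only the rows for one zero-based page."""
    start = page * page_size
    # Slicing past the end is safe; the last page is simply shorter.
    return list(rows[start:start + page_size])
```

The first render would then call `get_page(rows, 0)` and only fetch later pages when the user asks for them.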
@akuisara
Did we apply the Scroller extension in this table? The 50,000 row client-side example might help, and the 500,000 row server-side example would be even better. This assumes that the JS and CSS time Moses detected relates to generating the table.
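To make the server-side option concrete, here is a small sketch of the reply DataTables expects when server-side processing is switched on. The field names (`draw`, `start`, `length` in the request; `draw`, `recordsTotal`, `recordsFiltered`, `data` in the response) come from the DataTables server-side protocol. The function name and the in-memory `rows` list are assumptions for illustration, and searching and sorting are left out.

```python
from typing import Any, Dict, List, Sequence


def datatables_response(
    rows: Sequence[List[Any]], draw: int, start: int, length: int
) -> Dict[str, Any]:
    """Build the JSON-serialisable reply for one DataTables draw request.

    Hypothetical helper: `rows` is the whole table held in memory.
    """
    # DataTables sends length == -1 when the user asks for all rows.
    end = len(rows) if length < 0 else start + length
    return {
        "draw": int(draw),              # echoed back so replies can be matched
        "recordsTotal": len(rows),      # rows before filtering
        "recordsFiltered": len(rows),   # same here: no search in this sketch
        "data": [list(r) for r in rows[start:end]],
    }
```

Only the requested window of rows is sent, so the browser never has to build the DOM for the whole table at once.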
We did not apply the Scroller. I have tested the DataTable before, and the results turned out to be the same with and without Scroller being implemented. Besides that, client-side and server-side processing took the same time as well.
OK, let's hope that the pagination solves this problem. I have made the fix, and it will go out with my next push.
I uploaded the four Middle English texts (so not a huge amount of data), and it took quite a long time to render the table. When I tried doing Raw Counts instead of Proportional Counts, the browser hung and I had to kill the process. Is anyone else encountering such problems?