sillsdev / web-languageforge

Language Forge: Online Collaborative Dictionary Building on the Web and Phone.
https://languageforge.org
MIT License

Viewing project page throws a 500 #1773

Closed hahn-kev closed 1 year ago

hahn-kev commented 1 year ago

Opening this page returns a 500 error: https://languageforge.org/projects/smk-flex

rmunn commented 1 year ago

This project is actually a good example of why the "reset repo history" feature in lexbox is useful. The English reversal index is an XML file 8.3 megabytes in size, not all that large in absolute terms. But the project history has over 6,500 commits (the most recent one being today), and so many of them touch the English reversal index that its Mercurial history file, .hg/store/data/_linguistics/_reversals/en/en.reversal.d, is 512 megabytes. When doing an hg clone, Mercurial spends an exceedingly long time maxed out at 100% CPU reconstructing that file's history (over 30 minutes so far, and it's estimating 52 minutes remaining at the moment). I suspect that running the history-reset process will greatly reduce the time it takes to clone the project.

Once I finally get the project cloned on my dev machine, I'll be able to run tests duplicating the timeout, and then I'll be able to verify whether upgrading Svelte-Kit will fix the timeout issue. But it looks like that will have to wait for tomorrow, because I won't be done with this hg clone before the end of the work day at this rate.

rmunn commented 1 year ago

Update: I copied the repository with rsync and then ran a purely local hg clone, with no network involved, so the only time spent in the clone was reconstructing the files from Mercurial history. It took 19 minutes and 15 seconds, adding 6567 changesets with 56114 changes to 279 files. That's... a lot. So I've created a project called test-rmunn-smk-short with Mercurial history truncated down to a single commit. I'm cloning that now into the live LF server and will test to see how the dashboard page behaves with the same number of dictionary entries but no Mercurial history.

rmunn commented 1 year ago

The answer appears to be that the shortened-history project still throws the 500 Internal Server Error on the dashboard page, which means I'll be able to use the shortened-history project to repro the issue locally.

rmunn commented 1 year ago

I was able to get the shortened-history project to clone locally, and I saw something I hadn't been able to see when I tested this on the live server. On my dev machine, the PHP stack trace is visible, and the error is:

Allowed memory size of 268435456 bytes exhausted (tried to allocate 20480 bytes) in /var/www/html/Api/Model/Shared/Mapper/MongoMapper.php on line 122

That's happening when it calls MongoMapper->readListAsModels in the LexDbeDto::encode call on Sf.php line 521:

https://github.com/sillsdev/web-languageforge/blob/11f21e7c1006764adfec8ec0ab006aaea872753a/src/Api/Service/Sf.php#L521

So it's unrelated to the undici bug after all. It's because next-app is calling lex_stats, which returns all the entries. What next-app is actually trying to do is get a count of how many entries are in the project, and how many have audio or a picture:

https://github.com/sillsdev/web-languageforge/blob/11f21e7c1006764adfec8ec0ab006aaea872753a/next-app/src/routes/projects/[project_code]/meta/+server.ts#L31-L43

But the PHP DTO code just returns the whole list of entries.

Solving this is going to involve rewriting the PHP lex_stats handler to do the counting (including the "how many pictures?" and "how many audio?") on the server, and sending just the counts to the frontend instead of the entire entry list.
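
To illustrate the idea, here is a minimal sketch of the counting logic. It is written in TypeScript only to show the shape of the fix; the real handler is PHP, and the entry shape (`lexemeAudioFileName`, `senses`, `pictures`, `examples`, `audioFileName`) and the names `LexStats` and `tallyEntry` are all hypothetical, not the actual Language Forge model or API. The tally is updated one entry at a time, so the server never needs to hold the full entry list in memory, and only three numbers go over the wire.

```typescript
// Hypothetical, simplified entry shape (an assumption for illustration;
// the real Language Forge entry model is richer and may differ).
interface Sense {
  pictures?: { fileName: string }[];
  examples?: { audioFileName?: string }[];
}

interface Entry {
  lexemeAudioFileName?: string;
  senses?: Sense[];
}

// The small payload the frontend would receive instead of every entry.
interface LexStats {
  entries: number;
  entriesWithPicture: number;
  entriesWithAudio: number;
}

// True if any sense of the entry carries at least one picture.
function hasPicture(entry: Entry): boolean {
  return (entry.senses ?? []).some((s) => (s.pictures ?? []).length > 0);
}

// True if the entry has audio on the lexeme or on any example.
function hasAudio(entry: Entry): boolean {
  if (entry.lexemeAudioFileName) return true;
  return (entry.senses ?? []).some((s) =>
    (s.examples ?? []).some((e) => Boolean(e.audioFileName))
  );
}

// Fold a single entry into the running tally. Calling this once per
// entry (e.g. while iterating a database cursor) keeps memory use
// constant regardless of how many entries the project has.
function tallyEntry(stats: LexStats, entry: Entry): LexStats {
  stats.entries += 1;
  if (hasPicture(entry)) stats.entriesWithPicture += 1;
  if (hasAudio(entry)) stats.entriesWithAudio += 1;
  return stats;
}

// Usage with a few made-up entries.
const sample: Entry[] = [
  { senses: [{ pictures: [{ fileName: "a.jpg" }] }] },
  { lexemeAudioFileName: "b.mp3" },
  { senses: [{ examples: [{ audioFileName: "c.mp3" }] }] },
  {},
];

const stats: LexStats = { entries: 0, entriesWithPicture: 0, entriesWithAudio: 0 };
sample.forEach((entry) => tallyEntry(stats, entry));
console.log(JSON.stringify(stats));
```

In practice the counting could probably be pushed further down into database queries, so that PHP never materializes the entries at all, but that depends on how pictures and audio are actually stored.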

megahirt commented 1 year ago

The decision to fetch all entries in order to count them client-side was neither good nor performant. It was the cheapest way forward, and now it looks like we need to rewrite both the client and the server side, since it seems we didn't really test with large enough datasets.