openedx / edx-platform

The Open edX LMS & Studio, powering education sites around the world!
https://openedx.org
GNU Affero General Public License v3.0

Examine Django-countries[pyuca] Memory Usage and lexical sorting #35220

Open ormsbee opened 1 month ago

ormsbee commented 1 month ago

About 4-5% of LMS memory usage comes from pyuca, which as far as I can tell is used in our system to give django-countries a better sort order (it's an optional dependency of that package). It takes up that much memory because Unicode collation is quite complex, involving multiple layers of lookup tables and character mappings to handle many language-specific quirks.
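For anyone who wants to reproduce a rough number, here is a minimal sketch of one way to measure what pyuca allocates when its collator is built. It assumes pyuca's `Collator` class and uses the standard-library `tracemalloc` module; the helper name `measure_collator_memory` is made up for illustration. `tracemalloc` only counts allocations made through Python's allocator, so the result will not match process-level figures exactly, and this is not necessarily how the 4-5% figure above was obtained.

```python
import tracemalloc


def measure_collator_memory():
    """Return the bytes allocated while importing pyuca and building a Collator.

    Returns None when pyuca is not installed. If pyuca was already imported
    elsewhere, only the cost of building the Collator is counted.
    """
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    try:
        from pyuca import Collator  # optional dependency of django-countries
    except ImportError:
        tracemalloc.stop()
        return None
    # Building the collator loads the Unicode collation table into memory.
    collator = Collator()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del collator  # held until here so its memory is included in `after`
    return after - before


if __name__ == "__main__":
    used = measure_collator_memory()
    if used is None:
        print("pyuca is not installed")
    else:
        print(f"pyuca Collator: about {used / 1024 / 1024:.1f} MiB traced")
```

Running this in a fresh interpreter gives the cleanest reading, since modules that are already imported are not counted again.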

How different is the sort order without pyuca installed, and how much does it matter? It must have some level of importance for them to bundle it with django-countries the way they did, but I'm not clear on how critical it is.

(Making a ticket for this for later followup since I don't have time to dig into it now.)

ormsbee commented 1 month ago

Okay, this was just bugging me this weekend, so I wrote a quick script to see what the difference in ordering actually is for the django-countries languages we support in edx-platform:

with-pyuca.zip without-pyuca.zip
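A comparison along these lines could be sketched as follows. This is not the script that produced the attached files, just an illustration under stated assumptions: the sample names are hand-picked rather than pulled from django-countries, the helper names are hypothetical, and when pyuca is not installed the code falls back to a crude accent-stripping key that only approximates real Unicode collation.

```python
import unicodedata


def codepoint_sorted(names):
    """Sort by raw code point, i.e. plain sorted() with no collation key."""
    return sorted(names)


def approx_key(name):
    """Stdlib-only stand-in for a collation key (an approximation, not UCA)."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks so that "Å" sorts with "A" and "ô" with "o".
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def collated_sorted(names):
    """Sort with pyuca when it is installed, else with the approximation."""
    try:
        from pyuca import Collator
    except ImportError:
        return sorted(names, key=approx_key)
    return sorted(names, key=Collator().sort_key)


def moved_entries(names):
    """List (position, codepoint_name, collated_name) where the orders differ."""
    plain = codepoint_sorted(names)
    collated = collated_sorted(names)
    return [
        (index, a, b)
        for index, (a, b) in enumerate(zip(plain, collated))
        if a != b
    ]


# Hand-picked sample; a real check would use each supported language's list.
sample = ["Zimbabwe", "Åland Islands", "Albania", "Côte d'Ivoire", "Cuba"]

if __name__ == "__main__":
    for index, plain_name, collated_name in moved_entries(sample):
        print(f"{index}: {plain_name!r} -> {collated_name!r}")
```

Accented names are where the two orders diverge: code point order puts "Åland Islands" after "Zimbabwe", while a collation-aware sort places it next to "Albania".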

While we could bundle pre-created lists of countries by language, it's also possible to do this sorting entirely on the client side, now that all major browsers support locale-aware sorting in JavaScript. Since we're moving to MFEs for everything, that's probably our best bet in the longer term anyhow.