Open SnortsAlot opened 3 years ago
Right, it sorts names using `ngx_strcmp` (a wrapper around the standard `strcmp`), just like nginx's built-in autoindex module does. It's fairly common for web servers (and UNIX systems generally) to sort directory indexes in this manner.
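To illustrate the ordering described above, here is a minimal sketch in plain C that sorts names with `strcmp` through `qsort`. It is not the module's actual code; the helper names `cmp_bytes` and `sort_names_bytes` are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: plain byte-wise comparison, which is what strcmp does.
 * (Hypothetical helper, not taken from nginx or ngx_fancyindex.) */
static int cmp_bytes(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    return strcmp(sa, sb);
}

/* Sort an array of file names in place using byte order. */
static void sort_names_bytes(const char **names, size_t n)
{
    qsort(names, n, sizeof names[0], cmp_bytes);
}
```

Because every ASCII uppercase letter (65 to 90) has a smaller value than every lowercase letter (97 to 122), sorting `banana`, `Zebra`, `apple`, `Cherry` this way gives `Cherry`, `Zebra`, `apple`, `banana`.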
To perform a case-insensitive sort like you suggest, it would have to use `strcasecmp`. Sorting which is aware of non-ASCII characters is more complex. Sorting rules (collations) can even change depending on the locale. That would mean that a locale-aware sort function like `strcoll` would need to be used, and it would need to be possible to configure the server as to what locale to use.
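A case-insensitive variant could look roughly like the sketch below. It assumes a POSIX system (`strcasecmp` comes from `<strings.h>`, not ISO C), and the names `cmp_nocase` and `sort_names_nocase` are hypothetical. The `strcmp` tie-break is a choice made for this example so that names differing only in case still get a stable, predictable order.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* qsort comparator: compare ignoring case first; if the names are equal
 * apart from case, fall back to byte order so the result is deterministic. */
static int cmp_nocase(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    int r = strcasecmp(sa, sb);
    return r != 0 ? r : strcmp(sa, sb);
}

/* Sort an array of file names in place, ignoring ASCII case. */
static void sort_names_nocase(const char **names, size_t n)
{
    qsort(names, n, sizeof names[0], cmp_nocase);
}
```

With the same four names as before, this yields `apple`, `banana`, `Cherry`, `Zebra`. Note that `strcasecmp` only folds case according to the current locale's rules for single bytes, so it does nothing useful for multi-byte UTF-8 letters such as `É`.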
I don't see any wrappers around `strcasecmp` or `strcoll` in nginx's utility API documentation. Perhaps you could suggest to the developers of nginx that they add the ability to do case-insensitive sorting and/or locale-aware sorting to their autoindex module. Perhaps in the process of adding that feature, they will add `ngx_strcasecmp` and `ngx_strcoll` functions which ngx_fancyindex could then use to implement the same feature.
Locale-aware sorting has been mentioned previously in #60.
One potential can of worms of using locale-aware collation is that we don't know which locale should be used: `C` or `C.UTF-8`, which would still result in letters with diacritics being sorted in an unexpected way for some people. Using plain ASCII sorting (what `strcmp` and `ngx_strcmp` do) is the only reasonable option. I think we can consider adding case-insensitive sorting, though. Maybe even switching to case-insensitive sorting by default 🤔
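The point about the `C` locale can be checked with a small sketch like the following. It assumes a typical libc where `strcoll` under the `C` (POSIX) locale collates by byte value; the helper names `sign` and `collation_matches_bytes` are invented for this example.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Reduce a comparison result to -1, 0, or 1 so two results can be compared. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/* Returns 1 if strcoll and strcmp order a and b the same way under the
 * given LC_COLLATE locale, 0 if they disagree, and -1 if the locale
 * could not be selected. Note: this changes the process-wide locale. */
static int collation_matches_bytes(const char *locale,
                                   const char *a, const char *b)
{
    if (setlocale(LC_COLLATE, locale) == NULL)
        return -1;
    return sign(strcoll(a, b)) == sign(strcmp(a, b));
}
```

Under `C`, the two functions should agree for any pair of names, so switching the module to `strcoll` alone would change nothing on a server left at its default locale. A language-specific locale would be needed to get a different order, and which locales are installed varies from system to system.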
I could see a use case where someone runs a server serving files whose names are primarily in one language and wants the sort order to reflect that language. Think about an internal server at a small company serving files only for the employees of that company.
Allowing the web site visitor to influence the locale of the sort order is probably beyond the scope of what a web server module could be expected to do. Allowing the returned content to vary based on a header is bad for caching too.
Allowing the server administrator to select case-insensitive sorting (#78, #124) is great, but again it's probably out of scope to allow the web site visitor to select that. To remain consistent with what Apache and nginx server administrators expect, I would recommend keeping case-sensitive as the default.
Returning the results from directory listing.
I'm guessing the ASCII values are being compared, so while a listing in more traditional alphabetical ordering would look like
the index returns
apparently prioritizing uppercase in resulting listings.
If this is intended, please turn this into a feature request to allow results to be sorted as if the names had been passed through `.lower()`, and then to look at how other languages' special characters are handled in that ordering.
I've not done extensive testing, but it appears that characters not native to English are displayed/ordered after z in most cases,
i.e. a, b, c ... z, then all other characters or diacritical marks (umlaut, cedilla, acute accent, circumflex, tilde, grave, etc.):
à, è, ì, ò, ù - À, È, Ì, Ò, Ù á, é, í, ó, ú, ý - Á, É, Í, Ó, Ú, Ý ą, ł, ż, ß, ä, ö, ü, ç, ã, õ,
versus the more expected
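On the observation that accented letters land after z: assuming the file names are UTF-8 encoded, this follows directly from byte-wise comparison, since every byte of a non-ASCII character is 0x80 or higher while `z` is 0x7A. A minimal sketch, with hypothetical helper names:

```c
#include <string.h>

/* Returns 1 if the first byte of s lies outside 7-bit ASCII (>= 0x80),
 * which is true for every multi-byte UTF-8 sequence. */
static int starts_with_non_ascii(const char *s)
{
    return (unsigned char)s[0] >= 0x80;
}

/* Returns 1 if strcmp orders a strictly before b.
 * strcmp compares bytes as unsigned char, so 0xC3 sorts after 'z' (0x7A). */
static int sorts_before(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}
```

For instance, `sorts_before("zebra", "\xc3\xa9" "clair")` is true, where `"\xc3\xa9"` is the UTF-8 encoding of `é`. The string literal is split in two so that the hex escape does not swallow the following `c`.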