gohugoio / hugo

The world’s fastest framework for building websites.
https://gohugo.io
Apache License 2.0

Sorting strings gives surprising behavior when some of them are convertible to numerical #10685

Open AlexisDerumigny opened 1 year ago

AlexisDerumigny commented 1 year ago

What version of Hugo are you using (hugo version)?

$ hugo version
hugo v0.110.0-e32a493b7826d02763c3b79623952e625402b168+extended windows/amd64 BuildDate=2023-01-17T12:16:09Z VendorInfo=gohugoio

Does this issue reproduce with the latest release?

Yes

Description

This issue looks like #10389 but is slightly different.

When sorting a slice of strings, some of which can be parsed as numbers, the parseable ones are converted to floating-point values and sorted after all the others.

For example, this code

{{ slice "1A2" "1A12" "1F2" "aaa" "bbbb" "1" "1E2" "1E12" | sort }}

produces the output:

[1A12 1A2 1F2 aaa bbbb 1 1E2 1E12]

The strings 1E2 and 1E12 are apparently read as scientific notation (1E2 becomes 100, and similarly for 1E12). My guess is that 1A12, 1A2, 1F2, aaa and bbbb are sorted as strings, while 1, 1E2 and 1E12 are sorted as floating-point numbers and placed at the end.
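The guess above can be checked against Go's standard library. This is a minimal sketch, assuming Hugo's conversion behaves like `strconv.ParseFloat` (the issue does not confirm which function is used); `parsesAsFloat` is a hypothetical helper name, not a Hugo function.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsesAsFloat reports whether s is accepted by strconv.ParseFloat.
// Hypothetical helper for illustration only; not part of Hugo.
func parsesAsFloat(s string) bool {
	_, err := strconv.ParseFloat(s, 64)
	return err == nil
}

func main() {
	// The same strings as in the template example above.
	for _, s := range []string{"1A2", "1A12", "1F2", "aaa", "bbbb", "1", "1E2", "1E12"} {
		fmt.Printf("%-5s numeric=%v\n", s, parsesAsFloat(s))
	}
}
```

Under that assumption, only 1, 1E2 and 1E12 are reported as numeric, which matches the three values that end up at the back of the sorted output.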

Is this intended behavior or a bug?

I was quite surprised to discover this, since https://gohugo.io/functions/sort/ does not document that strings are treated differently depending on whether they can be converted to floating-point numbers. If this is not a bug, I would be happy to contribute a PR to the documentation.

By the way, as a workaround, I obtained the usual string ordering [1 1A12 1A2 1E12 1E2 1F2 aaa bbbb] by appending a whitespace character to each string before sorting.
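A plausible reason the workaround helps, sketched in Go: a string with a trailing space is rejected by `strconv.ParseFloat`, so every padded element stays a string. Whether Hugo uses that exact function is an assumption here, and `sortPadded` is a hypothetical helper that also strips the padding again, which the workaround above does not necessarily do.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// sortPadded mimics the workaround: append a space to each string,
// sort lexically, then strip the padding again.
// Hypothetical helper for illustration only; not part of Hugo.
func sortPadded(in []string) []string {
	padded := make([]string, len(in))
	for i, s := range in {
		padded[i] = s + " "
	}
	sort.Strings(padded)
	for i, s := range padded {
		padded[i] = strings.TrimSuffix(s, " ")
	}
	return padded
}

func main() {
	// The trailing space makes the numeric parse fail.
	_, err := strconv.ParseFloat("1E2 ", 64)
	fmt.Println(err != nil) // true

	fmt.Println(sortPadded([]string{"1A2", "1A12", "1F2", "aaa", "bbbb", "1", "1E2", "1E12"}))
	// [1 1A12 1A2 1E12 1E2 1F2 aaa bbbb]
}
```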

khayyamsaleem commented 1 year ago

The workaround is clever! Basically, everything coming into the sort function has to be parsed and interpreted on a best-effort basis. Infinity and NaN were accommodated as edge cases in #10389, but I can imagine that some sites in the wild depend on Hugo accepting scientific notation here when sorting slices of numbers.

khayyamsaleem commented 1 year ago

If we did want to fix this, we could add it as another case to parse as a string, as was done in #10389. If not, something more involved might be required to unify the type of the slice being sorted before yielding the result.
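One way to read "unify the type of the slice" is sketched below in Go. This is only an illustration under stated assumptions, not Hugo's code or a proposed patch: `sortUnified` is a hypothetical helper that sorts numerically only when every element parses as a float, and falls back to plain string ordering otherwise.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// sortUnified decides on a single element type for the whole slice
// before sorting. Hypothetical helper; not part of Hugo.
func sortUnified(in []string) []string {
	out := append([]string(nil), in...) // leave the input untouched
	nums := make(map[string]float64, len(out))
	allNumeric := true
	for _, s := range out {
		f, err := strconv.ParseFloat(s, 64)
		if err != nil {
			allNumeric = false
			break
		}
		nums[s] = f
	}
	if allNumeric {
		// Every element is a number: compare parsed values.
		sort.SliceStable(out, func(i, j int) bool {
			return nums[out[i]] < nums[out[j]]
		})
	} else {
		// At least one element is not a number: treat all as strings.
		sort.Strings(out)
	}
	return out
}

func main() {
	fmt.Println(sortUnified([]string{"1A2", "1A12", "1F2", "aaa", "bbbb", "1", "1E2", "1E12"}))
	// [1 1A12 1A2 1E12 1E2 1F2 aaa bbbb]
	fmt.Println(sortUnified([]string{"10", "9", "1E2"}))
	// [9 10 1E2]
}
```

With this approach, slices made up entirely of numbers would still accept scientific notation, while mixed slices would sort consistently as strings.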