Open johnrs opened 9 years ago
I think the tagged change solves the problem(s). However, regarding the suggestion to replace
return s[i] < s[j]
With:
return len(s[i]) < len(s[j])
I'm not sure in which cases that code is hit, or how the change would affect the result.
I think the tagged change solves the problem(s).
I am seeing just one problem.
[a1 a#1 a_1 aa] ==> [a1 a#1 a_1 aa] // Correct
But
[1 #1 _1 a] ==> [#1 1 _1 a] // Wrong
The second case is the same as the first, minus the leading "a".
John
John Souvestre - New Orleans LA
Ah, oops, I think I punched that test case in incorrectly. Reopening to track.
Alright. Reviewed the problem. I now set all non-numeric symbols to be greater than their numeric friends.
Unfortunately, it seems that the fix had a side effect.
["1", " ", "0"] ==> ["1", " ", "0"] - The space is in the middle.
The problem is sensitive to the input order. It fails for space, but other characters seem to work.
John
John Souvestre - New Orleans LA
In reply to: Saruhan Karademir, 2015 April 25, Sat 18:58, Re: [naturalsort] 2 more cases which don't seem to sort correctly (#10), https://github.com/skarademir/naturalsort/issues/10#issuecomment-96296413
Solved that last problem you found as well. Looks like the bottom-most equality clause was being hit when a string of only space characters was being compared. This meant I had to remove the len(left) < len(right) optimization you had suggested earlier.
Thanks again!
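The interaction between the space-only string and the length-based fallback can be sketched in isolation. This assumes that a space-only string contributes nothing to the chunk-by-chunk comparison, so its position is decided entirely by the final fallback clause. The function names are hypothetical and are not taken from the naturalsort source.

```go
package main

import "fmt"

// tieByLen models the suggested fallback: the shorter string sorts first.
func tieByLen(a, b string) bool { return len(a) < len(b) }

// tieByString models the original fallback: plain byte-wise comparison.
func tieByString(a, b string) bool { return a < b }

// decides reports whether a tie-break puts a and b in a definite order,
// i.e. exactly one of tie(a, b) and tie(b, a) is true.
func decides(tie func(a, b string) bool, a, b string) bool {
	return tie(a, b) != tie(b, a)
}

func main() {
	// " " and "0" differ but have the same length, so the length-based
	// fallback cannot order them and the result can depend on input order.
	fmt.Println(decides(tieByLen, " ", "0"))    // false
	fmt.Println(decides(tieByString, " ", "0")) // true
}
```

Under that assumption, the length-only fallback treats " ", "0" and "1" as equal, which would be consistent with the input-order sensitivity reported earlier in the thread.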
I think that I'm still seeing a problem with a space-only string. It seems to sort first. For example:
["1", " ", "#"] results in [" ", "1", "#"] rather than ["1", " ", "#"]
John
If I understood you correctly, you want the space character to be handled like all other non-numeric characters. However, since we explicitly filter this character out, it takes a different precedence. This behavior matches Mac OS X Finder sorting.
I think the change would not be too hard, but I'm reluctant to break away from established norms. Do you have any examples of the space character being deferred in other natural sorting implementations?
It seems that the space character is sorted ahead of the numbers, but all of the other non-numeric characters are sorted after the numbers. This seems illogical to me. I believe that the space character should be treated like all of the other non-numeric characters. Also, please note that currently a "space" sorts before a number, but a "space, letter" sorts after a number
I don't know of any references for natural sorting. As an example, vbom.ml/util/sortorder sorts the way I describe.
Here is a sample which shows a few variations on the theme.
Input: [" 0", "1", "2", " ", " b", "#", "_", "a"]
Result: [" ", " 0", "1", "2", "#", "_", "a", " b"] aka [20 2030 31 32 23 5F 61 2062]
I suggest: ["1", "2", " ", " 0", " b", "#", "_", "a"] aka [31 32 20 2030 2062 23 5F 61]
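For reference, here is a minimal sketch of a comparator that produces the suggested order. It is not the naturalsort implementation. It assumes three rules: digit runs compare by numeric value, a digit run sorts before any non-digit run, and non-digit runs (spaces included) compare byte-wise. All function names are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isDigit reports whether b is an ASCII digit.
func isDigit(b byte) bool { return b >= '0' && b <= '9' }

// chunks splits s into maximal runs of digits and maximal runs of non-digits.
func chunks(s string) []string {
	var out []string
	start := 0
	for i := 1; i <= len(s); i++ {
		if i == len(s) || isDigit(s[i]) != isDigit(s[i-1]) {
			out = append(out, s[start:i])
			start = i
		}
	}
	return out
}

// compareDigits compares two digit runs by value without parsing them.
func compareDigits(a, b string) int {
	a, b = strings.TrimLeft(a, "0"), strings.TrimLeft(b, "0")
	if len(a) != len(b) {
		if len(a) < len(b) {
			return -1
		}
		return 1
	}
	return strings.Compare(a, b)
}

// less implements the assumed rules: numbers first, everything else byte-wise.
func less(a, b string) bool {
	ca, cb := chunks(a), chunks(b)
	for i := 0; i < len(ca) && i < len(cb); i++ {
		x, y := ca[i], cb[i]
		if x == y {
			continue
		}
		xNum, yNum := isDigit(x[0]), isDigit(y[0])
		switch {
		case xNum && yNum:
			if c := compareDigits(x, y); c != 0 {
				return c < 0
			}
		case xNum != yNum:
			return xNum // a number sorts before any non-number
		default:
			return x < y // the space is treated like any other character
		}
	}
	if len(ca) != len(cb) {
		return len(ca) < len(cb) // the string with fewer chunks sorts first
	}
	return a < b
}

// sortSuggested returns a sorted copy of in.
func sortSuggested(in []string) []string {
	out := append([]string(nil), in...)
	sort.Slice(out, func(i, j int) bool { return less(out[i], out[j]) })
	return out
}

func main() {
	in := []string{" 0", "1", "2", " ", " b", "#", "_", "a"}
	fmt.Printf("%q\n", sortSuggested(in)) // ["1" "2" " " " 0" " b" "#" "_" "a"]
}
```

With these rules a lone space and a "space, letter" string both land after the numbers, which removes the inconsistency pointed out above.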
You got me with
Also, please note that currently a "space" sorts before a number, but a "space, letter" sorts after a number
Reopening
[]string{"1", "#1", "_1", "a"} ==> [#1 1 _1 a]
[]string{"111111111111111111112", "111111111111111111113", "1111111111111111111120"} ==> [111111111111111111112 1111111111111111111120 111111111111111111113]
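One plausible explanation for the second case, offered as an assumption because the thread does not show the parsing code: the notes below mention error values `ei` and `ej`, which suggests the numeric chunks are parsed into integers. Values this large exceed the 64-bit range, so parsing would fail and the comparison would fall back to plain string order, which matches the reported result. The sketch below compares digit runs without converting them; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// compareDigitRuns compares two strings of ASCII digits by numeric value
// without converting them to integers, so their length is unbounded.
// It returns -1, 0, or +1.
func compareDigitRuns(a, b string) int {
	a, b = strings.TrimLeft(a, "0"), strings.TrimLeft(b, "0")
	if len(a) != len(b) {
		// With leading zeros removed, the longer run is the larger number.
		if len(a) < len(b) {
			return -1
		}
		return 1
	}
	// Equal length: byte-wise order equals numeric order.
	return strings.Compare(a, b)
}

// sortDigitStrings returns a copy of in sorted by numeric value.
func sortDigitStrings(in []string) []string {
	out := append([]string(nil), in...)
	sort.Slice(out, func(i, j int) bool {
		return compareDigitRuns(out[i], out[j]) < 0
	})
	return out
}

func main() {
	in := []string{"111111111111111111112", "111111111111111111113", "1111111111111111111120"}

	// Integer parsing rejects the value as out of range.
	_, err := strconv.Atoi(in[0])
	fmt.Println(err != nil) // true

	// Prints the three values in increasing numeric order.
	fmt.Println(sortDigitStrings(in))
}
```

Comparing by length first and then byte-wise avoids any dependency on integer width, at the cost of trimming leading zeros on each comparison.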
Notes: You can replace regexp.MustCompilePOSIX with regexp.MustCompile. From what I understand, not many use the POSIX version.
You can replace:
    splitshortest := len(spliti)
    if len(spliti) > len(splitj) {
        splitshortest = len(splitj)
    }
    for index := 0; index < splitshortest; index++ {
With:
    for index := 0; index < len(spliti) && index < len(splitj); index++ {
You can replace:
    if ei == nil && ej == nil { //if number
With:
    // Handle numbers case.
    if ei == nil { // Only need to test ei since ej is the same
Before:
    return spliti[index] < splitj[index]
I added:
    // Handle non-numbers case
You can replace the code:
    return s[i] < s[j]
With:
    return len(s[i]) < len(s[j])
One of these changes solves one of the problems above.
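On the note about regexp.MustCompilePOSIX: in Go, the POSIX variant uses leftmost-longest matching and restricts the syntax to POSIX ERE, while regexp.MustCompile uses leftmost-first matching with Perl-style syntax. The sketch below uses an assumed chunking pattern, since the thread does not show the real expression, and checks that both variants split a sample string the same way.

```go
package main

import (
	"fmt"
	"regexp"
)

// chunkPattern is an assumption, not the pattern used by naturalsort.
// The classes are spelled out because POSIX ERE has no Perl classes like \d.
const chunkPattern = `[0-9]+|[^0-9]+`

var (
	perlRE  = regexp.MustCompile(chunkPattern)      // leftmost-first
	posixRE = regexp.MustCompilePOSIX(chunkPattern) // leftmost-longest
)

// chunksPerl splits s into digit and non-digit runs using MustCompile.
func chunksPerl(s string) []string { return perlRE.FindAllString(s, -1) }

// chunksPOSIX does the same using MustCompilePOSIX.
func chunksPOSIX(s string) []string { return posixRE.FindAllString(s, -1) }

func main() {
	fmt.Printf("%q\n", chunksPerl("a1 b#22"))  // ["a" "1" " b#" "22"]
	fmt.Printf("%q\n", chunksPOSIX("a1 b#22")) // ["a" "1" " b#" "22"]
}
```

For a pattern of this shape the two alternatives never compete at the same position, so the switch should not change the chunks. A different pattern would need its own check.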