microsoft / vscode

editor.action.sortLinesAscending has confusing sort order for symbols #48123

Open ayrtonmassey opened 6 years ago

ayrtonmassey commented 6 years ago

When sorting lines using editor.action.sortLinesAscending ("Sort Lines Ascending" via the Command Palette) the lines are sorted in a very strange order when the comparison involves a symbol character.

I would expect that . (period) would be sorted before _ (underscore) because in ASCII, period is 46 and underscore is 95.

However, vscode uses localeCompare as its sorting method (see sortLinesCommand.ts:76), which results in underscore being sorted before period. See the repro steps below for an example.

I saw a previous issue regarding this (#15516) but it was closed because the ASCII ordering was in fact correct. In the case I've outlined below, the ordering is not correct.

I'm not sure what the correct solution would be but I think that for ASCII symbols, ASCII ordering should be obeyed. I do realise that "Sort Lines Ascending" is an ambiguous term - ascending according to what criteria? - so perhaps the command could be renamed to something more specific, or you could provide different default sorting options.

Steps to Reproduce:

  1. Copy the following into a new file:
a_b.txt
a_b_c.txt
  2. Highlight the file contents.

  3. Open the Command Palette and select "Sort Lines Ascending".

Expected:

The file is sorted in the following order:

a_b.txt
a_b_c.txt

Actual:

The file is sorted in the following order:

a_b_c.txt
a_b.txt

Does this issue occur when all extensions are disabled?: Yes
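The difference can be reproduced outside the editor. The following is a minimal sketch, assuming a JavaScript engine with ICU collation data (as in recent Node.js builds). The default `Array.prototype.sort` compares UTF-16 code units, which matches ASCII order for ASCII text, while `localeCompare` uses locale collation rules. The variable names are illustrative and the `"en"` locale is an assumption; the collation result may vary by engine and locale.

```typescript
const input: string[] = ["a_b.txt", "a_b_c.txt"];

// Default sort compares UTF-16 code units: "." (46) sorts before "_" (95).
const byCodeUnit: string[] = input.slice().sort();

// Locale-aware collation: punctuation is ordered by collation rules rather
// than by code point, so "_" typically sorts before "." here.
const byLocale: string[] = input.slice().sort((a, b) => a.localeCompare(b, "en"));

console.log(byCodeUnit); // [ 'a_b.txt', 'a_b_c.txt' ]
console.log(byLocale); // engine- and locale-dependent
```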

alexdima commented 6 years ago

I don't have a strong opinion either way. However, the current usage of localeCompare is consistent with our use of localeCompare in other places, e.g. in the explorer:

image

ayrtonmassey commented 6 years ago

Thanks for your response - interesting that the editor uses the same sort as the UI!

I use the sort function most often to sort lines in code, which usually contain only ASCII characters. The issue I (personally) have with the current vscode sort function is that when I pipe the file content into other utilities, those utilities expect lines to be in ASCII order.

Regarding the sorting of files in vscode's explorer, Windows explorer sorts the files as I would expect:

image

Sorting in ASCII order is beneficial as shorter filenames are placed before longer ones.

Since this is supposed to be a general sort & is also applied to UI elements, it's difficult to come up with a one-size-fits-all solution. Personally, I would introduce two settings:

Perhaps these settings could be separate from the sorting in the explorer, e.g. editor.sorting.method vs. explorer.sorting.method.

benkimball commented 6 years ago

I'm currently sorting a very long YAML file in order for it to pass yamllint configured to enforce lexical sorting of keys. This issue is of interest to me because currently Code's idea of a lexical sort and yamllint's idea differ in the area of underscores:

After Code editor.action.sortLinesAscending:

---
underscore_underscore: second
underscore: first

yamllint output:

3:1       error    wrong ordering of key "underscore" in mapping  (key-ordering)

sathya-beeline commented 6 years ago

Yeah, I've run across this issue with the sort method as well.

One problem is that if you use Unicode integer representations to sort, it will always be confusing. And you sort of have to use Unicode to sort, since that's what we use in our editors nowadays.

For example: ' is greater than - according to VSCode.

So, the following would be in VSCode sort order:

gem 'graphql-example'
gem 'graphql'

And so would the following:

graphql
graphql-example

I believe this is because it uses the SMALL HYPHEN-MINUS character, and compares that to one of the single apostrophes, which have a higher integer representation.

Citation: https://www.ssec.wisc.edu/~tomw/java/unicode.html

hamstergene commented 5 years ago

I find it wrong that, even though 99.999% of files in the world of coding use only ASCII in names and content, VSCode defaults to sorting in an unexpected way just so that the 0.001% of Unicode cases can be satisfied. Unicode collation is for natural languages, but a programmer's editor/IDE more often than not deals with computer languages, so is defaulting to localeCompare really justified?

I wonder if there can be some compromise sorting option that satisfies both needs. Similar to how natural order sorting puts file10 after file2 by splitting the string into spans of numbers and non-numbers. Or how Windows Explorer supports Unicode file names and yet sorts ASCII names in a usual way.
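One possible shape for such a compromise, sketched under assumptions: split each string into digit and non-digit spans, compare digit spans numerically, and compare everything else by UTF-16 code unit (ASCII order for ASCII text). The names `splitSpans` and `compareNaturalCodeUnit` are hypothetical, and this is not how VS Code or Windows Explorer actually sort.

```typescript
// Split a string into alternating runs of digits and non-digits.
function splitSpans(s: string): string[] {
  return s.match(/\d+|\D+/g) ?? [];
}

// Hypothetical comparator: numeric spans by value, other spans by code unit.
function compareNaturalCodeUnit(a: string, b: string): number {
  const xs = splitSpans(a);
  const ys = splitSpans(b);
  const n = Math.min(xs.length, ys.length);
  for (let i = 0; i < n; i++) {
    const x = xs[i];
    const y = ys[i];
    if (/^\d/.test(x) && /^\d/.test(y)) {
      // Note: very long digit runs lose precision when converted to Number.
      const diff = Number(x) - Number(y);
      if (diff !== 0) {
        return diff;
      }
      continue;
    }
    if (x !== y) {
      return x < y ? -1 : 1;
    }
  }
  if (xs.length !== ys.length) {
    return xs.length - ys.length;
  }
  // Tie-break (e.g. "007" vs "7") by plain code unit order.
  return a < b ? -1 : a > b ? 1 : 0;
}

const names = ["file10", "a_b_c.txt", "file2", "a_b.txt"];
console.log(names.slice().sort(compareNaturalCodeUnit));
// [ 'a_b.txt', 'a_b_c.txt', 'file2', 'file10' ]
```

Because non-digit spans are compared as whole runs, the result can differ from strict ASCII order where a digit sits next to a symbol, so this is only one of several reasonable definitions.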

cospin commented 4 years ago

For someone looking to fix this in the Explorer, this config can help in many cases: "explorer.sortOrder": "type"

rcdailey commented 3 years ago

Three years and 17 upvotes later and not a peep from the developers. I get that there's a lot going on, but this can't really be that difficult to address, right?

hediet commented 2 years ago

I think this issue has two aspects: 1) How can lines be sorted in ASCII order? 2) What should be the default?

(1) can easily be implemented by an extension that offers some kind of "Sort Lines ASCII" command.

As for (2), changing the default can be risky if some users have gotten used to the current default. We would need to figure out whether more users want ASCII or localeCompare order. If the pros outweigh the cons, we could just change it for Insiders and see how many users complain.
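For point (1), the core of such a command could be quite small. A minimal sketch, assuming the extension passes the selected text through a pure helper; `sortLinesByCodeUnit` is a hypothetical name, not VS Code API, and the wiring into an actual command (for example via `vscode.commands.registerTextEditorCommand`) is left out so the snippet stays self-contained.

```typescript
// Hypothetical helper for a "Sort Lines ASCII" style command; not VS Code API.
function sortLinesByCodeUnit(text: string, descending: boolean = false): string {
  // Preserve the line ending style of the input.
  const eol = text.indexOf("\r\n") >= 0 ? "\r\n" : "\n";
  const lines = text.split(/\r?\n/);
  // Relational operators on strings compare UTF-16 code units,
  // which is the same as ASCII order for ASCII-only text.
  lines.sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
  if (descending) {
    lines.reverse();
  }
  return lines.join(eol);
}

console.log(sortLinesByCodeUnit("a_b_c.txt\na_b.txt"));
// a_b.txt
// a_b_c.txt
```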

jensbodal commented 2 years ago

Interesting that a similar issue was closed 6 years ago since this is "as designed": https://github.com/microsoft/vscode/issues/15516

They used this example

['margin: 0;', 'margin-bottom: auto;'].sort();
// results in
["margin-bottom: auto;", "margin: 0;"]

However

['react-foo', 'react', 'apple'].sort()
// results in
['apple', 'react', 'react-foo']

// vscode sort lines ascending results in
['apple', 'react-foo', 'react']

brewster1134 commented 1 year ago

Similar issue here. This is what I consider correctly sorted:

A: 'a',
A_B: 'ab'

when it is auto sorted, it switches the order

A_B: 'ab'
A: 'a',

I want to keep the shorter key first, followed by the more specific keys. I'm sure this has been discussed before, but I was unable to find anything. I have played around with various eslintrc settings, but no luck.

I get an "Expected object keys to be in natural insensitive ascending order" error.

alex-ander-is commented 1 year ago

Looks like such a big corporation does not care about what ants want.

sshishov commented 1 year ago

Can we at least have the ability to override the sorting in the settings? It would solve the issue for almost 99% of people.

I am pretty sure a lot of people use the built-in sorting abilities of the IDE, and this issue is frustrating. We also have this issue with dot (.) and underscore (_).

piotrekwitkowski commented 1 year ago

Can I ask for the same for dot (.) and hyphen (-)?

Sorting like

- parent-dir
  - item.ext
  - item-detail.ext

would be great; there is no option to achieve this today.

jimeh commented 11 months ago

How about special characters vs alphabetic characters (lower and upper)?

For example:

- parent-dir
  - installation
  - install_name
  - INSTALLATION
  - INSTALL_NAME

roshal commented 2 months ago

editor.action.sortLines has frustrating behavior

Z
z
A
a

a
A
z
Z

roshal commented 2 months ago

sort from gnu coreutils has encouraging behavior

sort <<< '
Z
z
A
a
'

A
Z
a
z

roshal commented 2 months ago

#97272 added a sorting option for the explorer

"explorer.sortOrderLexicographicOptions": "unicode",

it would be nice to have a similar option for editor

"editor.sortOrderLexicographicOptions": "unicode",

some option should be applied in following location

https://github.com/microsoft/vscode/blob/0656d21d11910e1b241d7d6c52761d89c0ba23a4/src/vs/editor/contrib/linesOperations/browser/sortLinesCommand.ts#L87
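A rough sketch of how such an option might select the comparator. Everything here is an assumption for illustration: the editor setting does not exist today, the `LineSortOrder` type and `makeLineComparator` function are hypothetical, and only two values are modeled.

```typescript
// Hypothetical values for a hypothetical editor setting, modeled on the
// explorer setting quoted above. Not an existing VS Code editor option.
type LineSortOrder = "default" | "unicode";

function makeLineComparator(order: LineSortOrder): (a: string, b: string) => number {
  if (order === "unicode") {
    // Compares UTF-16 code units; identical to ASCII order for ASCII text.
    return (a, b) => (a < b ? -1 : a > b ? 1 : 0);
  }
  // Locale-aware collation, as the thread describes the current behavior.
  return (a, b) => a.localeCompare(b);
}

const sample = ["Z", "z", "A", "a"];
console.log(sample.slice().sort(makeLineComparator("unicode")));
// [ 'A', 'Z', 'a', 'z' ]
```

With `"unicode"` the result matches the GNU sort output shown earlier; with `"default"` the order depends on the engine's collation data.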