scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org

scaladoc sorts in ASCII order #12149

Open martijnhoekstra opened 4 years ago

martijnhoekstra commented 4 years ago

reproduction steps

https://www.scala-lang.org/api/current/scala/collection/mutable/Stack.html sorts top after toVector as reported on https://contributors.scala-lang.org/t/scaladoc-ordering/4503

problem

This is sorted in ASCII order rather than alphabetical order. I suspect a quick fix is possible at https://github.com/scala/scala/blob/26dd17aebc988ba84c243e6f23796680df0d4a26/src/scaladoc/scala/tools/nsc/doc/model/Entity.scala#L79 with a toLowerCase.
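To make the difference concrete, here is a minimal sketch of the two orderings applied to the three member names from the report. It does not reproduce the scaladoc code at the linked line; the object name `ScaladocOrderSketch` and its fields are made up for illustration, and lower-casing the sort key is only assumed to be the kind of change the "quick fix" refers to.

```scala
import java.util.Locale

object ScaladocOrderSketch {
  // Member names taken from the report (hypothetical stand-in for the real member list).
  val members: List[String] = List("top", "toVector", "toList")

  // The default String ordering compares UTF-16 code units, so every
  // uppercase ASCII letter sorts before every lowercase one.
  val asciibetical: List[String] = members.sorted

  // Lower-casing the sort key gives a case-insensitive, alphabetical order.
  val caseInsensitive: List[String] = members.sortBy(_.toLowerCase(Locale.ROOT))
}
```

Names that differ only in case would get equal keys here, so a real change might want a tie-breaker on the original name.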

nafg commented 4 years ago

Can someone explain to me why case-insensitive sort is better? Strings are almost always sorted this way.

SethTisue commented 4 years ago

fwiw I tried javadoc on:

package foo;

public interface J {
  void toList();
  void toVector();
  void top();
}

and the output has the case-insensitive sort order:

Modifier and Type    Method and Description
void                 toList()
void                 top()
void                 toVector()

martijnhoekstra commented 4 years ago

I replied with the "I'm confused now" confused smiley, not the "I don't think this is a good idea but I'm not about to write that out in words" confused smiley. It can be difficult to spot the difference.

NthPortal commented 4 years ago

personally, I like toList and toVector being grouped, as they're semantically closer? and my intuition says that that holds for methods in general, but I can't actually prove that

dwijnand commented 4 years ago

I think it's good practice to make human-facing text alphabetical rather than, what I've found out is called, "ASCIIbetical" order. But when the human is a developer and the case is significant, then ASCIIbetical might be the better choice.

martijnhoekstra commented 4 years ago

If toList and toVector should be grouped, shouldn't we just use a @group for it?

NthPortal commented 4 years ago

I just mean that semantically, toList and toVector are closer than either is to top

eed3si9n commented 4 years ago

sortWith
sorted

might be an interesting case to ponder.

Also suppose we have:

sortForList
sortForVector
sortForall
sortForeach
sortWith
sorted

ASCII ordering + camel casing creates an interesting effect as if the methods are sorted by camel-cased words, but it could require multiple scanning of A-Z, a-z trying to look for sortForeach if you can't remember the exact casing (after all you're in the Scaladoc).

Case insensitive sorting would do:

sorted
sortForall
sortForeach
sortForList
sortForVector
sortWith

NthPortal commented 4 years ago

but it could require multiple scanning of A-Z, a-z trying to look for [...] if you can't remember the exact casing

that's a very good point, and I think a more compelling one

nafg commented 4 years ago

Maybe instead of sorting by the name as a single word, it should sort by words, case-insensitive. Something like:

def symbolToWords(sym: String): Seq[String] = ...
symbols.sortBy(symbolToWords)(Ordering.Iterable(Ordering.comparatorToOrdering(String.CASE_INSENSITIVE_ORDER)))
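One possible reading of this suggestion, as a rough sketch: the body of `symbolToWords` is left open in the comment above, so the camelCase splitter below is a hypothetical stand-in, and the word-by-word comparison is spelled out by hand rather than relying on a particular standard library helper. The object name `WordOrderSketch` is likewise made up.

```scala
object WordOrderSketch {
  // Hypothetical splitter: breaks a camelCase name before each uppercase letter.
  // "sortForList" becomes List("sort", "For", "List"); "sorted" stays List("sorted").
  def symbolToWords(sym: String): Seq[String] =
    sym.split("(?=[A-Z])").toList

  // Compares two names word by word, ignoring case. When one name's words are
  // a prefix of the other's, the name with fewer words sorts first.
  val byWords: Ordering[String] = new Ordering[String] {
    def compare(a: String, b: String): Int = {
      val wa = symbolToWords(a)
      val wb = symbolToWords(b)
      wa.zip(wb).iterator
        .map { case (x, y) => x.compareToIgnoreCase(y) }
        .find(_ != 0)
        .getOrElse(Integer.compare(wa.length, wb.length))
    }
  }

  // The method names from the example earlier in the thread, in scrambled order.
  val names: List[String] =
    List("sorted", "sortWith", "sortForeach", "sortForall", "sortForVector", "sortForList")

  val sortedByWords: List[String] = names.sorted(byWords)
}
```

With this splitter a run of capitals such as `toURL` is broken into one-letter words, which a real implementation would probably want to handle differently.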



dwijnand commented 4 years ago

but it could require multiple scanning of A-Z, a-z trying to look for [...] if you can't remember the exact casing

that's a very good point, and I think a more compelling one

I agree it makes a compelling argument but

multiple scanning of A-Z, a-z trying to look for sortForeach if you can't remember the exact casing

implies (to me) a lack of naming convention or at least a lack of consistency. When a convention is applied consistently, as in the top/toList/toVector case, ASCII order gives better results.

So I think I still favour ASCIIbetical ordering: if you're looking specifically for "sortforeach", the search functionality should be case-insensitive and find sortForeach for you.

som-snytt commented 7 months ago

MLA citation style says alpha order letter by letter, but I had a notion that

sortBy
sortWith
sorted

is a desired ordering because sort precedes sorted (not that B precedes e).

If there were a method sortaImmutable, it would fall between sortX and sorted.

Similarly for grouping all toX before tokenize, or what have you.

I don't seem to have a book that demonstrates citations with shared prefixes, but this hymnal sorts everything with "O" ("O Beautiful for Spacious Skies") before everything starting "Of" or "On", and so on. (But it is not citation order, which drops initial articles, such as in "The First Nowell".) There must be a word for "sort by complete word prefix".

Just noticed the sort order

scala
scalaDoc
scalac
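Indexing guides usually call this kind of ordering word-by-word alphabetizing, as opposed to letter-by-letter. A minimal sketch of "sort by complete word prefix", assuming that a word boundary is simply any uppercase letter: build a key that puts a space before each uppercase letter and lower-cases everything, then compare keys as plain strings. Because a space sorts below every letter, a complete word always precedes a longer word that merely starts with it. The names `WordPrefixSketch`, `wordPrefixKey`, and `sortByWordPrefix` are hypothetical.

```scala
object WordPrefixSketch {
  // Hypothetical sort key: "sortBy" becomes "sort by", "sorted" stays "sorted".
  // The inserted space compares less than any letter, so "sort by" < "sorted".
  def wordPrefixKey(name: String): String = {
    val sb = new StringBuilder
    name.foreach { c =>
      if (c.isUpper) sb.append(' ')
      sb.append(c.toLower)
    }
    sb.toString
  }

  // Sorts names by their word-prefix key using the default String ordering.
  def sortByWordPrefix(names: Seq[String]): Seq[String] =
    names.sortBy(wordPrefixKey)
}
```

Under these assumptions, `sortByWordPrefix(Seq("sorted", "sortaImmutable", "sortWith", "sortBy"))` places `sortBy` and `sortWith` first, then `sortaImmutable`, then `sorted`, and `scala`, `scalaDoc`, `scalac` keep the order shown above.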