dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License

String.last, first #9314

Open alan-knight opened 11 years ago

alan-knight commented 11 years ago

For the same reasons it's useful to have Iterable.last it would be nice to have String.last. And for symmetry, String.first.

lrhn commented 11 years ago

Should that return a string containing the last code unit or the last rune?

If it is similar to Iterable/List, it should return string[string.length - 1]. That would be just the code unit, not the rune, so it's likely to not be very interesting. I'll assume you want the last rune as a string.

Anything based on runes should require you to go through the runes getter. You can do:

(string.runes.iterator..reset(string.length)..movePrevious()).currentAsString

for the last rune, but I can see that isn't very succinct. The first one is shorter:

(string.runes.iterator..moveNext()).currentAsString
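
A runnable sketch of the two expressions above, assuming current Dart, where reset, movePrevious and currentAsString live on the RuneIterator returned by string.runes.iterator. The helper names firstRune and lastRune and the sample string are made up for illustration, and both helpers assume a non-empty string.

```dart
// Hypothetical helpers; these names are illustrative, not SDK API.

/// First rune of [s] as a string: step forward once from the start.
String firstRune(String s) => (s.runes.iterator..moveNext()).currentAsString;

/// Last rune of [s] as a string: start past the end, then step back once.
String lastRune(String s) =>
    (s.runes.iterator..reset(s.length)..movePrevious()).currentAsString;

void main() {
  // Sample ending in U+1F600, which takes two UTF-16 code units.
  const s = 'ab\u{1F600}';

  print(firstRune(s)); // a
  print(lastRune(s) == '\u{1F600}'); // true

  // Indexing returns one code unit, here only the low surrogate of the pair.
  print(s[s.length - 1].codeUnitAt(0).toRadixString(16)); // de00
}
```

The last line shows why the distinction matters: for a character outside the basic multilingual plane, string[string.length - 1] yields half of a surrogate pair rather than the whole character.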

Why is string.first and string.last important? If you just want to check if the string ends with something, it's better to do

string.endsWith(">")

than

string.last == ">"
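
A small sketch of the endsWith suggestion. The function name endsWithAngle is hypothetical. One practical difference, as far as I can tell, is that endsWith needs no length guard on an empty string, whereas an index-based check would.

```dart
/// True when [s] ends with '>'. Safe to call on an empty string.
bool endsWithAngle(String s) => s.endsWith('>');

void main() {
  print(endsWithAngle('<tag>')); // true
  print(endsWithAngle('')); // false

  // endsWith also handles suffixes longer than one character.
  print('<br/>'.endsWith('/>')); // true
}
```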

floitschG commented 11 years ago

There is also new String.fromCharCode(string.runes.last)

alan-knight commented 11 years ago

Returning something based on the rune would be inconsistent with the index operation, which would be bad. I think it should just return string[string.length - 1] as you initially suggest. Which might only be part of a character, but you managed to convince me earlier that that was fine in most circumstances. And if it isn't, then you should use runes.
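
For illustration, the behaviour described here can be sketched today with an extension. This assumes a Dart version with extension methods. The extension name StringEnds is hypothetical, and first and last are not members of String in the SDK.

```dart
// Hypothetical extension; not part of the Dart SDK.
extension StringEnds on String {
  /// The first UTF-16 code unit as a one-character string.
  String get first => this[0];

  /// The last UTF-16 code unit as a one-character string,
  /// consistent with string[string.length - 1].
  String get last => this[length - 1];
}

void main() {
  print('<tag>'.first); // <
  print('<tag>'.last == '>'); // true
}
```

Both getters go through the index operator, so an empty string would throw and callers would need to guard against it. As noted above, the result may be only part of a character.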

endsWith(">") is useful for the case that prompted me to file this, so that's handy. But I didn't know about it, so I suspect users may not always find it.

The other suggestions seem so verbose as to be entirely unusable, other than in the implementation of one's own helper functions to do these things.

In general, it would be useful to operate on strings like lists of characters. This can be done via string.split("") but that's rather obscure and probably even less efficient than most of the alternatives. So, for example, I've been working on numeric formatting code. This wants to parse a formatting specification, e.g. "##.##0E00". This iterates through the string and has a bunch of case statement clauses:

switch (ch) {
  case _PATTERN_DIGIT: <stuff>
  case _PATTERN_ZERO_DIGIT: <stuff>
  case _PATTERN_GROUPING_SEPARATOR: <stuff>
  etc.

The original version maintains its own index and moves it around in the string. An iterator would be much more pleasant. I can get one via split("") or I can just write my own in a few lines, but this seems generally useful and omitting it will just mean there are N different buggy versions of it. I could iterate over code units or runes and define the constants as numbers rather than strings, but that would make the code even more painful to debug.
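
A minimal sketch of the kind of iterator meant here, assuming one-character strings per UTF-16 code unit (the same unit the index operator uses). The names chars, countPatternChars, patternDigit and patternZeroDigit are hypothetical. The constant values are guesses modeled on the pattern above, not the real formatting code.

```dart
// Hypothetical symbolic constants; the values are assumptions.
const patternDigit = '#';
const patternZeroDigit = '0';

/// Yields each UTF-16 code unit of [s] as a one-character string.
Iterable<String> chars(String s) sync* {
  for (var i = 0; i < s.length; i++) {
    yield s[i];
  }
}

/// Counts the characters of [pattern] by kind, switching on string constants.
Map<String, int> countPatternChars(String pattern) {
  final counts = {'digit': 0, 'zero': 0, 'other': 0};
  for (final ch in chars(pattern)) {
    switch (ch) {
      case patternDigit:
        counts['digit'] = counts['digit']! + 1;
        break;
      case patternZeroDigit:
        counts['zero'] = counts['zero']! + 1;
        break;
      default:
        counts['other'] = counts['other']! + 1;
    }
  }
  return counts;
}

void main() {
  print(countPatternChars('##.00')); // {digit: 2, zero: 2, other: 1}
}
```

The loop body never touches an index, which is the convenience being asked for. The cost is one short string per code unit.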

lrhn commented 11 years ago

It seems like a large and unnecessary overhead to create new one-char strings for each character, when its numerical value is equivalent. Of course, that is based on the understanding that strings are expensive (relatively) and small integers are cheap.

Why do you use strings of length 1 instead of code units? Is it easier to read/write? If so, would character code constants change that? (e.g., instead of (ch == "a") you would write something like (c == #"a")).

alan-knight commented 11 years ago

Character constants would make it shorter to write some kinds of code that worked with code units. It wouldn't help in my case, where what I'm comparing to are long symbolic constant names. And character constants don't help with debugging comprehensibility. It's much easier for me to see that it's looking for ';' and finding '0' than it is to see that it's looking for 59 and found 48.
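
To make the debugging point concrete, here is a sketch that compares code units but renders them as characters only when reporting a mismatch. The function name describeMismatch is hypothetical. This is one possible middle ground, not something proposed in the thread.

```dart
/// Describes a code-unit mismatch with both the character and its number.
String describeMismatch(int expected, int found) =>
    'expected ${String.fromCharCode(expected)} ($expected), '
    'found ${String.fromCharCode(found)} ($found)';

void main() {
  final expected = ';'.codeUnitAt(0); // 59
  final found = '0'.codeUnitAt(0); // 48
  if (expected != found) {
    print(describeMismatch(expected, found));
    // expected ; (59), found 0 (48)
  }
}
```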

It's clearly less efficient to allocate the additional strings, but it's a great deal more convenient to work with strings rather than integers. Are we planning to remove the large and unnecessary expense of the [] operator for strings?

If people don't have an iterator, it's not that they'll fall back to using integers. They'll fall back to using a for loop with an index. Or using split(""). Or writing their own iterator. So it's not a matter of whether people will incur the expense of creating strings. They will. It's a matter of whether we provide reasonable operations, or end up with worse user code and multiple private implementations of these operations.