golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.99k stars 17.54k forks source link

proposal: spec: remove string(int) #3939

Open robpike opened 12 years ago

robpike commented 12 years ago
The conversion of an integer to a string was put in very early to bootstrap formatted
printing, if I remember correctly. It's odd, though, and no longer necessary, if it ever
was. It also causes inconsistencies, since string(0xD800) cannot produce the UTF-8
encoding for that code point (by definition, surrogates are not legal in UTF-8) so must
produce something else. We chose the "\uFFFD" since that's the only reasonable
option, but that means:

1) the result isn't obvious
2) string(0xD800) and "\uD800" do different things: the former produces the
UTF-8 for U+FFFD while the latter is statically rejected.

I propose that, in some remote future, we eliminate this conversion from the language.
adg commented 11 years ago

Comment 1:

I guess we're bound by the Go 1 compatibility promise to keep this until Go 2.
rsc commented 11 years ago

Comment 2:

Labels changed: added go2.

rsc commented 10 years ago

Comment 3:

Labels changed: added repo-main.

rsc commented 10 years ago

Comment 4:

Adding Release=None to all Priority=Someday bugs.

Labels changed: added release-none.

dolmen commented 7 years ago

I've been beaten by this "feature" in production code today. My code was equivalent to this (but of course not so obvious: type, const and cast were in 3 different packages, and the cast result was an argument to a function, so that made the bad use of the cast quite hidden):

   type XType int16
   const x XType = 0x0001
   var y = string(x)

I think that go vet should raise an alert at least for my case (conversion to string of a value of a named integer type).

robpike commented 7 years ago

It is valid code, so vet will not complain about it. Perhaps golint should, but not vet.

urandom commented 6 years ago

Would it be possible to make string(int), return the string representation of the int. More users will probably expect this behaviour:

a := 42
fmt.Print(string(a)) // "42"

EDIT: got it, bad idea

dominikh commented 6 years ago

@urandom I would not expect an innocuously looking conversion to invoke an expensive number formatting routine. The solution to fixing a (supposedly) confusing construct isn't to replace it with another, equally confusing one.

rsc commented 6 years ago

No, that is not possible. It will silently break existing programs in a subtle way, for no real reason. If you want to convert an int (or a float or a bool) to a string, use package strconv. In particular, using strconv allows you more control over the exact formatting.

griesemer commented 6 years ago

Another argument in favor of abandoning this feature is that it behaves differently when shifts are involved. See #26096 .

kindlyfire commented 6 years ago

I wonder how to achieve what string(int) does with strconv ?

Doing string(65) returns "A", but I can't find a function in strconv that does that.

cznic commented 6 years ago

fmt.Sprintf("%c", 65)

https://play.golang.org/p/xFVd63hT4JA

griesemer commented 6 years ago

string(int) is really quite a complex operation as it needs to know utf8 encoding. @robpike 's original comment in this issue explains the rationale for removing it. Now that we have the proper libraries (uniccode, utf8) we can write this code explicitly. See e.g. runeToString here: https://play.golang.org/p/ZnUF0Oc_dAG .

magical commented 6 years ago

@griesemer You don't even have to use a library. This works: string([]rune{65}) https://play.golang.org/p/I6aX3c6fG71. Unless we're proposing removing string([]rune) as well.

griesemer commented 6 years ago

@magical Good point! I don't know about removing string([]rune). String(int/rune) is particularly annoying because the argument may be shift operation which opens a can of worms (see #26096 and various related issues).

mimoo commented 5 years ago

Note that python allows you to do str(42) -> "42" so this might bring confusion also.

ianlancetaylor commented 5 years ago

I noticed that some existing code uses string(os.PathSeparator), which would become invalid under this proposal. Not a reason not to do it, just a note about consequences.

networkimprov commented 5 years ago

This would break code that uses character constants as switches/flags/arguments, which a) can be included in a string (e.g. filename, URL), and b) are meaningful when printed in a log.

griesemer commented 5 years ago

@networkimprov I'm not sure I follow your comment. Can you be a bit more specific (example)?

One place where string(x) cannot simply be replaced with a corresponding library call is where x is a constant and we rely on the fact that string(x) also is a constant:

const s = string(c)

This won't be possible anymore. In many cases, const can be replaced with var; in others it may be necessary to do the string(c) manually rather than have it done by the compiler.

networkimprov commented 5 years ago

I use string(char) like this:

const ( eThis = 'I'; eThat = 'A' ) // alternative to iota
...
aFileName := aDir +"/"+ string(eThis) + aName
...
func F(iFlag byte, iData T) {
   log.Println("F with ", string(iFlag))

   switch iFlag {
   case eThis: ...
   case eThat: ...
   }
}
rsc commented 5 years ago

In all the discussion of this proposal I'd been focusing in my head on

i := 3
s := string(i)

which is clearly wrong and not going to hurt to remove. But I hadn't thought much about

c := 'x'
s := string(c)

which is clearly fine and does happen and will hurt more to remove. Worse, it happens in constants like "abc"+string(filepath.Separator).

30614 proposes adding filepath.SeparatorString, but having to update all that code seems unfortunate.

We could make this issue about rejecting string(int) but not string(rune), but today string(rune) and string(int32) are indistinguishable, so string(int32) would slip in too.

Or perhaps we should put this issue (#3939) off until we decide #29012 (separate rune and int32).

zigo101 commented 5 years ago

Is it ok to add a builtin char function to return one-rune string?

griesemer commented 5 years ago

I suppose one could add a new built-in, but char seems an unfortunate name for a function returning a string.

ianlancetaylor commented 5 years ago

I don't see any meaningful difference between a char builtin and the current string conversion.

zigo101 commented 5 years ago

It has the benefit of avoid mistaking stirng(aInt) as the string representation of aInt, but not more others.

How about allow concentrating constant strings and runes?

rooklift commented 5 years ago

If you'll accept a piece of anecdata, I had in my code a bit of string generation morally equivalent to this:

Go Playground:

foo := "字符串"
bar := ""

for i := 0; i < len(foo); i++ {
    bar += string(foo[i])
}

I had been treating string(n) as a way to convert bytes to strings of length 1 -- which works perfectly well with ASCII -- not realising it would generate something bigger if the byte was > 127. My silly mistake of course, but a real example of this biting someone...

qrpnxz commented 5 years ago

It's not totally clear to me if this proposal also removes string(rune). I hope not since it's very useful for debugging and constants as griesemer noted.

I don't think removing it totally is the solution. If string(rune) using utf8.RuneError replacement isn't obvious to someone, it makes more sense to me that they accept that is the case rather than remove it. It is a stone written fact that strings are valid utf8. It behaves just like string([]rune) and string([]byte), which behave just like utf8.EncodeRune, except that for the latter you have to add an import and write way more code.

It's totally fair to make string(int) have to be string(rune) however.

ianlancetaylor commented 5 years ago

I think it's clear that we have to continue to support string(r) where r has type rune. And it's likely that we have to continue to support string(b) where b has type byte or, equivalently, uint8. So what we are talking about removing is the conversions of other integer types to string.

Treating rune and byte/uint8 as a special case is not as bad as it may sound; we already do exactly that for converting values of type []rune and []byte (but not slices of other integer types) to string.

gopherbot commented 4 years ago

Change https://golang.org/cl/212919 mentions this issue: go/analysis: add pass to check string(int) conversions

alanfo commented 4 years ago

I am beginning to wonder whether removing the string(int) conversion is going to be more trouble than it's worth, even for integer types other than rune and byte.

I see from #32479 that the intermediate step of adding a check for this to go vet or other code analyzers presents difficulties if the suggested fix breaks existing code when the integer value being converted overflows rune.

Although one wouldn't expect this to occur in practice, people will no doubt have used these conversions in weird, unexpected but (to them) reasonable ways knowing precisely what the results would be.

Incidentally, just to reinforce the point that the string(byte) conversion should continue to be allowed, my attention was recently drawn (by a link posted on reddit) to this book about writing an interpreter in Go. In the free sample chapter, on page 7, I noticed this function:

func newToken(tokenType token.TokenType, ch byte) token.Token {
    return token.Token{Type: tokenType, Literal: string(ch)}
}

The intended audience for this book is people who know enough Go to be able to make sense of the code and I don't think those people are going to be impressed if the code in the book fails to compile (or vet).

ianlancetaylor commented 4 years ago

It's clearly essential to find how much working code that is correctly using the conversion will break. We don't know that yet.

ianlancetaylor commented 4 years ago

In 1.15 we have a vet warning for this case (#32479). Putting this issue on hold until we have more experience with that. Also waiting until we can assume that everyone is using Go modules with a language version.