vlang / v

Simple, fast, safe, compiled language for developing maintainable software. Compiles itself in <1s with zero library dependencies. Supports automatic C => V translation. https://vlang.io
MIT License

V should use unsigned integer for sizes #427

Closed ntrel closed 5 years ago

ntrel commented 5 years ago

We should use unsigned types when a value has no sign. Using a runtime check that n >= 0 is inefficient and is very likely to be forgotten. Forcing signed types encourages programmers to wrongly use signed types in their own code, making it bug-prone if they forget to check i >= 0. It also prevents a programmer from allocating and indexing more than int.max bytes of memory.

medvednikov commented 5 years ago

Go uses len int in strings and slices.

divad1196 commented 5 years ago

I don't know about Go, but in C, using int for sizes is bad practice. I learned to use size_t, which is an unsigned integer type. Not only is it better for resources, it also fits the logic of a "size".

By the way, I am really keen on your project, carry on that way. (I just can't compile V on my computer because of the issue already mentioned.)

ntrel commented 5 years ago

We shouldn't be limited by Go's decisions. Go is not marketed as a systems language, V is. We need to do the right thing. (If we're using 32-bit int for sizes e.g. builtin.malloc argument, that's another thing we need to fix).

i7tsov commented 5 years ago

Please, don't.

I can't find the link to the video right now, but at one of the C++ conferences Bjarne Stroustrup himself admitted that introducing an unsigned size_t was a mistake, one that is too late to revert now. He suggests using unsigned types only if you really need them (for example, to store and manipulate bits) and using signed integers by default.

divad1196 commented 5 years ago

@i7tsov: before saying that it was a mistake because someone said so, could you explain why it was a mistake?

Because I personally remember hearing that they should have used unsigned types from the beginning:

At the beginning it was not unsigned, and due to a change that I can't remember (maybe when systems went from 32-bit to 64-bit?) they got a lot of errors in code that had worked well.

Besides, a size isn't signed by concept. In Python, mylist[-1] is shorthand for mylist[len(mylist)-1], but the size itself is still never negative.

i7tsov commented 5 years ago

A size is never negative, right. But once you begin doing arithmetic with the value (and there are a lot of cases where you subtract from a value of the same type as a container's size), you're screwed. Unsigned hurts more than it helps here.
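To make the subtraction problem concrete, here is a minimal C++ sketch. The helper names are made up for illustration, and using std::ptrdiff_t as the signed counterpart is an assumption, not something the thread prescribes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the last element, computed the naive way.
// size() returns an unsigned type, so on an empty vector 0 - 1 wraps around
// to the largest value of that type instead of becoming -1.
std::size_t last_index_unsigned(const std::vector<int>& v) {
    return v.size() - 1;
}

// Same calculation after converting the size to a signed type first.
// An empty vector now yields -1, which a caller can detect with `< 0`.
std::ptrdiff_t last_index_signed(const std::vector<int>& v) {
    return static_cast<std::ptrdiff_t>(v.size()) - 1;
}
```

For a non-empty vector both helpers agree; they only diverge on the empty case, which is exactly the case that tends to be forgotten.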

i7tsov commented 5 years ago

Here's a link to the video with the timestamp (42:41): https://youtu.be/Puio5dly9N8?t=2561 Both Bjarne Stroustrup and Herb Sutter said it's not a good idea to use unsigneds just because you think that this specific value is never negative.

divad1196 commented 5 years ago

@i7tsov: Thanks for the video. "Because people do not know what they do" is the answer I got from watching it. I was hoping for something else.

I can only say:

The problem is that signed, unsigned AND floating point ARE different, and mixing them without knowing what will be returned is simply a programmer error. If you want to access a value according to the length of an array (maybe the one in the middle, len/2), you will need a positive value anyway.

I have never done a really big project, so I won't speak on that point, but I never encountered this problem when "playing" with arrays (C arrays as well as C++).

Yes, mixing signed and unsigned is more common when dealing with arrays than mixing with floating point, but is convenience really a good reason?

My opinion is that you should know what you are doing.

erdian718 commented 5 years ago

@medvednikov There is a difference between V and Go: int is platform-dependent in Go, but not in V. I think int should be platform-dependent, with len int used as in Go.

If int is always a 32-bit integer, there is no difference between int and i32. Why do we need two ways to do the same thing?

i7tsov commented 5 years ago

"Because people do not know what they do."

This argument cuts both ways: the theoretical possibility of getting a negative number isn't a problem for people who know what they are doing. That matters especially when the possibility becomes practical, for example when you iterate backwards or calculate the difference between indices. If you use only iterators and never calculate an index, then it's completely irrelevant what type size_t is.

I'm speaking as a longtime C++ dev (18 years of experience), for the larger part of which I had to stick with the C++03 standard. From the early days I was taught to use signed integers unless you need just bit ops, and I've seen that wisdom confirmed in practice. A lot of times I had to write (int) v.size() casts.

Having my opinion backed by the language's creators makes me confident in my point. Although, I admit, this can be a holy-war topic for people accustomed to using size_t as an index counter. V is derived from Go, which uses signed integers for representing indices. I strongly believe this should be left as it is.
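The backward-iteration case mentioned above can be sketched in C++ as follows. Both function names are hypothetical and exist only for this illustration; the point is that the signed loop can use the obvious `i >= 0` condition, while the unsigned loop needs a different idiom because `i >= 0` is always true for an unsigned value.

```cpp
#include <cstddef>
#include <vector>

// Reverse traversal with a signed index: the loop ends when i drops to -1.
std::vector<int> reversed_signed(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::ptrdiff_t i = static_cast<std::ptrdiff_t>(v.size()) - 1; i >= 0; --i) {
        out.push_back(v[static_cast<std::size_t>(i)]);
    }
    return out;
}

// Reverse traversal with an unsigned index. The condition compares first
// and decrements afterwards, so the body sees size-1 down to 0 and the
// loop stops before the wrapped value is ever used as an index.
std::vector<int> reversed_unsigned(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::size_t i = v.size(); i-- > 0;) {
        out.push_back(v[i]);
    }
    return out;
}
```

Both versions are correct; the disagreement in the thread is about which one a maintainer is more likely to get right without thinking about it.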

divad1196 commented 5 years ago

@i7tsov I still can't see the gain from doing it that way. I don't think you pointed out what you learned from this experience. Also, I don't use years of practice to evaluate someone's level: I've seen senior developers' code that was badly done, and junior developers' code that was done correctly.

Yes, it has been a holy war from the beginning, but I am not convinced by arguments from convenience. Go is said to be "designed to be easy to learn", so maybe it is better for it to use the simplest way.

Since the problem is the use of unsigned int, maybe the type itself should not even exist? That's probably what we get by going to extremes.

I think there's no need for further discussion, but thanks for giving me another point of view.

i7tsov commented 5 years ago

The gain is that you get correct results when you do arithmetic operations on indices.

The existence of an unsigned type is OK in itself. The problem is that the container library forces its use, dragging unsigned into the rest of your code. That's unpleasant if you want to use a signed integer as the default counting type (which is a good idea, because you do arithmetic on integers all the time).

divad1196 commented 5 years ago

Don't do arithmetic between mixed types, that's all. If you want to compute with a signed int, keep the value cast in a new variable of that type.

"Existence of unsigned type itself is ok": apparently not.

And again: in V you apparently won't be able to use operators on different types.

i7tsov commented 5 years ago

> If you want to compute with a signed int, keep the value cast in a new variable of that type.

That's what I was doing all the time: casting the unsigned vector::size() to signed when comparing or performing operations with a signed index counter, to suppress the compiler warning. That cast could be avoided if the size were signed in the first place.

> in V you apparently won't be able to use operators on different types.

C/C++ compilers can also emit a warning when you mix signed and unsigned types in an expression. It's better to make the conversion explicit to avoid unpleasant errors.
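A rough C++ sketch of the cast being discussed. The function names are invented for this example, and the first helper assumes the vector's size fits in an int.

```cpp
#include <vector>

// Hypothetical helper: is the signed index i a valid position in v?
// The explicit cast keeps both operands of `<` signed. It assumes that
// v.size() fits in an int.
bool in_range(const std::vector<int>& v, int i) {
    return i >= 0 && i < static_cast<int>(v.size());
}

// What a mixed comparison `i < n` effectively does once the usual
// arithmetic conversions are applied: i is converted to unsigned first,
// so a negative i turns into a very large value.
bool less_after_conversion(int i, unsigned n) {
    return static_cast<unsigned>(i) < n;
}
```

With these definitions, `less_after_conversion(-1, 3u)` is false even though -1 is mathematically less than 3, which is the kind of surprise the explicit cast is meant to avoid.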

medvednikov commented 5 years ago

> He suggests using unsigned types only if you really need them (for example, to store and manipulate bits) and using signed integers by default.

That's the advice I often heard.

medvednikov commented 5 years ago

> V is derived from Go, which uses signed integers for representing indices. I strongly believe this should be left as it is.

That's a very good point. One of V's goals is to make the transition from Go easy.

Also, just like Go, the operators require the same types on both sides, so using an additional unsigned type is going to result in a lot of verbose casts, just like in the example @i7tsov gave.

len int is going to stay. Thanks for your input.

ntrel commented 5 years ago

> operations with a signed index counter

Index variables should be unsigned.

i7tsov commented 5 years ago

> Index variables should be unsigned.

Following that logic, what else should be unsigned?

When you have integer values, you make calculations with them. When a calculation with unsigned values involves subtraction, you may get a result that compares incorrectly. These errors may not be obvious. Even if they are obvious to you as an experienced programmer, they may not be to the person who will maintain your code.

divad1196 commented 5 years ago

@i7tsov That's still not a problem, for the reasons already given. The same could be said for int and double.

And what I said before was NOT to cast every time you need the value, but to write int s = myvector.size() once and then use s. Having to compute with the size while being uncertain of the result type probably means that the size represents something more.

ntrel commented 5 years ago

> When a calculation with unsigned values involves subtraction, you may get a result that compares incorrectly

By that logic, pointers should be signed [due to pointer arithmetic].

V has bounds checks, unlike C and C++. So when indexing a container with a wrapped around unsigned, the container should panic, the same as with a signed negative index.
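As a rough illustration of that point in C++: std::vector::at stands in for V's checked indexing here. That substitution is an assumption made for the example (at() throws std::out_of_range, whereas the comment above says V panics), and the helper name is invented.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if a bounds-checked access at `index` is rejected.
// A wrapped-around unsigned index is just a very large number, so the
// same `index < size` check that guards ordinary overruns catches it.
bool access_is_rejected(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Note that this only covers the indexing step. It does not help when the wrapped value is used in a comparison before it ever reaches the container.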

ntrel commented 5 years ago

> salary

I get paid an hourly rate with decimals ;-)

I really don't buy the argument that indexes wrap around often and cause bugs; that's probably rare.

Thanks for the YouTube link, I watched that section. I accept it's rare you need the extra bit. Also the compiler can optimize bounds checking for >= 0 by using an unsigned cast before checking < length. Does the current compiler do this?

A more compelling argument against unsigned is that the programmer may use casts wrongly without putting an assert.
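The bounds-check optimization asked about above can be sketched in C++ like this. It is only an illustration of the trick with hypothetical function names, says nothing about what the V compiler actually does, and assumes len is never negative.

```cpp
#include <cstdint>

// Two-comparison bounds check on a signed index.
bool in_bounds_two_checks(std::int32_t i, std::int32_t len) {
    return i >= 0 && i < len;
}

// One-comparison version: converting a negative i to unsigned yields a
// value of at least 2^31, which can never be below a non-negative len.
// Assumes len >= 0, as it would be for a container length.
bool in_bounds_one_check(std::int32_t i, std::int32_t len) {
    return static_cast<std::uint32_t>(i) < static_cast<std::uint32_t>(len);
}
```

Under the stated assumption both functions return the same answer for every index, so a signed length does not have to cost an extra comparison at run time.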

i7tsov commented 5 years ago

> I get paid an hourly rate with decimals ;-)

I'd rather use cents anyway. Floats are not exactly precise. :)

> I really don't buy the argument that indexes wrap around often and cause bugs; that's probably rare.

That's not only about wraparound. Whenever you calculate the difference between two indices, the subtraction may overflow, which inverts the meaning of a less-than/greater-than comparison.
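A small C++ sketch of a comparison on an index difference. The "window" scenario and both function names are invented for illustration; the two functions differ only in the signedness of their parameters.

```cpp
#include <cstddef>

// Hypothetical check: is position a at most `window` steps past position b?
// Signed version: when a is before b the difference is negative, so the
// check passes, as intended.
bool within_window_signed(std::ptrdiff_t a, std::ptrdiff_t b, std::ptrdiff_t window) {
    return a - b <= window;
}

// Unsigned version of the same expression: when a is before b the
// difference wraps to a huge positive value, so the check fails.
bool within_window_unsigned(std::size_t a, std::size_t b, std::size_t window) {
    return a - b <= window;
}
```

For a = 2, b = 5 and a window of 1, the signed version answers true and the unsigned version answers false, although the source text of the two expressions is identical.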

lobotony commented 5 years ago

Doesn't Python use signed indices, "-1" meaning the last element of an array?

Delta456 commented 5 years ago

@lobotony It doesn't allow that, and I think this is the wrong place to ask.

radare commented 5 years ago

that would be confusing because in C negative indexes are perfectly fine (and dangerous)


lobotony commented 5 years ago

> @lobotony It doesn't allow that, and I think this is the wrong place to ask.

Wrong.

Python 2.7.10 (default, Feb 22 2019, 21:55:15) 
[GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> l = [1,2,3]
>>> l[0]
1
>>> l[-1]
3
>>> 

But I agree that negative indices in V could be confusing or dangerous, because in some cases they could work as intended (e.g. as an argument to a method that returns the last element for -1), whereas in other cases they would clobber your memory (e.g. as an index into a raw buffer).
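One way to get the "works as intended" behaviour without the memory-clobbering one is an explicit accessor. The C++ sketch below is purely hypothetical (at_wrapped is not a V or standard-library function) and mimics the Python session above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical accessor that resolves a Python-style index: negative
// values count from the end, so -1 means the last element. Returns
// nullptr when the resolved position falls outside the container,
// instead of reading out of bounds as a raw buffer index would.
const int* at_wrapped(const std::vector<int>& v, std::ptrdiff_t i) {
    const std::ptrdiff_t len = static_cast<std::ptrdiff_t>(v.size());
    const std::ptrdiff_t pos = i < 0 ? i + len : i;
    if (pos < 0 || pos >= len) {
        return nullptr;
    }
    return &v[static_cast<std::size_t>(pos)];
}
```

Because the negative-index meaning lives in a named function, plain indexing can keep treating a negative value as an error.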

radare commented 5 years ago

they can be useful too :) as well as slices and the spread operator


Delta456 commented 5 years ago

@lobotony Oops I meant V not Python. Sorry :sweat_smile: