nspcc-dev / .github

Organization-wide documentation and things
2 stars 1 forks source link

Pointer vs Value receiver in GO #29

Open carpawell opened 3 months ago

carpawell commented 3 months ago

Is your feature request related to a problem? Please describe.

I'm always frustrated when I want to start a thread on a review more than once if it ends every time with no conclusion about what we (as a team) think about it. Now I want to fix whether we think using mixed receivers for the same struct is acceptable or not.

Describe the solution you'd like

Write a rule in this repo about our code style with receivers: every struct always has either a pointer or a value receiver for every method.

Describe alternatives you've considered

Do not have any rules about it.

Additional context

Go developers do not recommend using different methods set for values and pointers: https://go.dev/doc/faq#methods_on_values_or_pointers. I also do not like code like thisFunsUsesPointerInterface(&someVar) because it can lead to errors if a dev is not careful enough. In fact, I do not think it is required to use & and * in code at all, not sure why it is important to use pointers in some cases and values in others. If a type is acquired from some external code, it should use it as is, and do not convert it to a pointer/value every time trying to predict whether it is can/should be changed or not.

It is not a problem that I want to fix and change when I see mixed receivers, but I do not understand when there are additional commits that do change a receiver and when there are new threads in PRs that do request to change a receiver.

roman-khimov commented 3 months ago
  1. State-changing methods are inevitably pointer-based.
  2. Most of the time structs are sufficiently bigger than a slice in byte size and values need to be copied.
  3. Compiler can avoid copying in many cases, but it's "can avoid" vs "won't do for sure".
  4. I've done quite some C, sorry.
  5. So I naturally prefer pointers unless the type is so small that pointers incur more overhead (some byte-sized integer with enumeration). To me it's more predictable and unified.
  6. But I'm not sure we can solve it once and for all. Value-based code is safer in some cases.
carpawell commented 3 months ago

@roman-khimov, I understand your points. Just to be sure we are on the same wave: I do not want every struct to have a pointer receiver or a value receiver. I do not want to have disputes about what is "better". I only want every struct to have all its methods to be defined on either pointer or value. So not mix them. Mostly because I think such code is less error-prone (different interface sets and copying that leads to no-op struct changes by a mistake).

AnnaShaleva commented 3 months ago

I only want every struct to have all its methods to be defined on either pointer or value.

It may be impossible, because we have a lot of places where we need to define method on a struct and at the same time this struct has to implement unmarshaller/decoder/any other state-changing method.

carpawell commented 3 months ago

we need to define method on a struct

Can you explain it? Does it mean you have some methods that need value receivers?

AnnaShaleva commented 3 months ago

Does it mean you have some methods that need value receivers?

Like Roman said, > Value-based code is safer in some cases.

Sometimes we need value receivers, not pointers.

carpawell commented 3 months ago

If you are not gonna change receiver, it is OK to be a value. If you have to change a receiver it has to be pointer. In what case is it safer to be a value? I would say it is even more dangerous to rely on value receivers (as you said "we need value receivers", i know no such cases) when it also has pointer methods, cause you can pass it as a copy and try to change a copy without any error/warning (but i hope our linter can see such cases).

cthulhu-rider commented 2 weeks ago

ive seen this go faq article many times, and did not get proper explanation why should all method receivers be the same

consistency. Some type can have all value-based methods, and now i need to intro state-changing method. So what? Change receivers of all existing methods? The types gonna be inconsistent otherwise? ridiculous

efficiency section talking bout some "big" structs. But any efficiency guide says dont analyze the code - run benchmarks. If (!) code will show proper performance improvement due to pointer method transition - do it. There is no big or mid or light types to me

i see following methods accepting values pros:

  1. safety: when caller sees method accepting value, he can be calm about the instance safety and may not make a safe-copy. Methods on pointers may require blind precopying, and information about whether the instance is changing may be hidden and changeable
  2. chain calls (https://go.dev/play/p/mCySyVt39ke)
  3. read-only interfaces (https://go.dev/play/p/xtGgQ7Et0zW)
  4. potential NPEs are thrown by the caller side, not from within the method

i was caught once by https://pkg.go.dev/github.com/nspcc-dev/neo-go@v0.106.3/pkg/core/transaction#Transaction.Hash. From my expectations, it's read-only. However, strictly speaking, the method is on a pointer, so I had to precopy the instance, although I easily forgot about it due to expected semantics

almost hundred years of code development, and pointer mutations are still some of the most insidious. The simplest rule "if you don't change it, don't take the pointer" saves almost always

carpawell commented 2 days ago

Some type can have all value-based methods, and now i need to intro state-changing method.

Strange situation. I cannot imagine this. Almost all types are ok to be pointer-based, value usage is so specific that I think almost all the cases are not your example ("was sure it would not ever be changed but now wow surprise, it should be changeable").

  1. safety

You should never rely on it. We have slices, we have maps, we have nested pointers in structs. If you need to use a method, you should use it (what else alternatives can be?), read the docs, see the implementation if you are a dev and are debugging some bugs. We are doing and expecting strange things too often to rely on some pointer/values rules. All the other expectations about safety are false. You may follow such rules and any different dev in your own team or some external lib may not follow.

  1. chain calls
  2. read-only interfaces

Not sure i understood the example. Pros, cons, why yes, why no, etc.

  1. potential NPEs

How does it relate to this issue? You will always have some pointers-based methods, and some of them may panic. A caller just has to use structs correctly. I am not sure there is a certain answer to whether an incorrect value is better than an incorrect (nil) pointer.

roman-khimov commented 2 days ago

i was caught once by https://pkg.go.dev/github.com/nspcc-dev/neo-go@v0.106.3/pkg/core/transaction#Transaction.Hash

BTW, I have to admit that in this particular case documentation totally sucks. It's been that way for a long long time so we've got used to this behavior, but it was never properly explained. Time to improve, https://github.com/nspcc-dev/neo-go/pull/3683

cthulhu-rider commented 2 days ago

You should never rely on it.

I dont believe there is a normal case when function accepts complex struct by value to mutate its reference-type field, always a bug to me. 90% info about instance safety can be got from presence of an asterisk. In rare cases, when method is safe but accepts pointer as an optimization (proven by both benchmem and GC analysis), doc remark can free concerned user from deep-copying

We are doing and expecting strange things too often to rely on some pointer/values rules.

i dont agree expecting pointer accuracy and transparency with pointers is a strange thing. Being a caller, u expect smth natural 1st of all

chain calls read-only interfaces Not sure i understood the example. Pros, cons, why yes, why no, etc.

i left code snippets, they dont compile. Dropping pointers from method definitions make them work. So, sometimes pointers can prevent something from compiling that essentially should

potential NPEs How does it relate to this issue? You will always have some pointers-based methods, and some of them may panic.

these are panics on lib side, while calling ro methods on nil throws NPE on caller's side. In the latter case, i can insta to exclude the lib problem and that seems pretty good