Closed schomatis closed 6 years ago
I do not necessary agree with this. What do others think?
Renaming a long package name that is used frequently import can lead to shorter and easier to read code.
I also don't see why this is a big deal. Standard tools work well with renamed imports and it is easy to determine an import by just looking at the import list.
I do not necessary agree with this. What do others think?
@kevina Thanks for your feedback, yes I'd love to get more opinions to see if I'm jumping the gun here.
My main motivation is to avoid using different names in different files for the same package which makes the code harder to read (see https://github.com/ipfs/go-ipfs/issues/5055) but I would like to know if there are any other motivations to renaming packages than just making them shorter. I'm not sure if shorter names means easier to read code, at least not any name and not any variation of it, maybe we could start standardizing some of those shorter names.
@schomatis the length of the import really depends on if the import name is an important part of the name, if it is not I see it as noise and hence prefer shorter names.
I am not against some cleaning up of the code so we use the same alias for an important, especially within the same (or similar) packages.
I also agree the protobuf code is particularly bad and I won't object to cleaning it up.
I disagree with the go convention here. Since go package names arent required to match their paths, Its far clearer to explicitly name them.
Which of these is more clear?
package main
import (
"github.com/foo/bar/baz"
)
func main() {
cats.Meow()
}
or
package main
import (
cats "github.com/foo/bar/baz"
)
func main() {
cats.Meow()
}
this is a pretty severe example, but i have encountered some pretty annoying mismatches here. Even just the difference between go-foo
in the path and foo.X
in the code adds non-zero cognitive load.
In any case, it seems like your desire here is to make sure we use the same names everywhere, and that I 100% agree with. we should definitely always use the same name for every package (assuming no collisions)
In any case, it seems like your desire here is to make sure we use the same names everywhere, and that I 100% agree with. we should definitely always use the same name for every package (assuming no collisions)
Yes, that's the main objective here, to have homogeneous names we could add to the documentation so the reader quickly grasps what the code is talking about when there is something like an ft.
prefix lying around.
Since go package names arent required to match their paths,
Oh, I wasn't aware of that, then I'm all for explicitly naming the package, but let's try to use the package name, not a shortened version of it, e.g., the prefix ft.
in place of the package name unixfs
is totally meaningless to a new reader.
FWIW, I definitely found a need to use different names from the ones actually used in the package when adding an example to go-ipns
(Using crypto
instead of ic
).
I think it also makes for super confusing generated docs (e.g. https://godoc.org/github.com/ipfs/go-ipns), where the import
statement isn’t included (yes, they become links on godoc.org
, but you still have click them to see what they are, and they don’t become links [yet — would love help] on our docs site).
Anyway, I just want to agree with @schomatis’s comment above. Totally get explicitly naming them, but we should try and use the same names everywhere and also try not to abbreviate them too much. In my experience, readability counts for a lot more than easy typeability, especially in an open source project when you want to bring in new contributors or your docs aren’t yet perfect :)
Yeah, +1 on not using shortened names
Totally get explicitly naming them, but we should try and use the same names everywhere and also try not to abbreviate them too much. In my experience, readability counts for a lot more than easy typeability
Agree, though I believe a good consistent abbreviation also makes the code more readable, i.e. multiaddr
-> ma
. But another aspect is all the multiaddr
and libp2p
packages that mirror stdlib packages and how go
interprets them, especially when they have -
s in the name, for example:
package main
import (
ma "github.com/multiformats/go-multiaddr"
madns "github.com/multiformats/go-multiaddr-dns"
)
func main() {
addr := ma.NewMultiaddr(...)
resolvedAddr, err := madns.Resolve(...)
}
vs.
package main
import (
"github.com/multiformats/go-multiaddr"
"github.com/multiformats/go-multiaddr-dns" // go will interpret the word after the last hyphen as the package name
)
func main() {
addr := multiaddr.NewMultiaddr(...)
resolvedAddr, err := dns.Resolve(...)
}
So I find the first example to be more readable because I know that it is performing a multiaddr-dns
Resolve
and though we could follow this advice: https://rakyll.org/style-packages/#renames-should-follow-the-same-rules, we aren't actually 'replacing' stdlib dns
in this case, it is a 'different' form of dns
.
I agree with @lanzafame and would like to add that not abbreviating packages is good advice for the standard go packages as they are carefully named and designed to be part of the identifier when referring to symbols inside the package. The same can not be said for other packages. For example multiaddr.NewMultiaddr
is just noise. In general overlong identifiers can hinder code readability not enhance it.
Agreed on short identifiers, and agreed on keeping identifiers explicit.
Agree, though I believe a good consistent abbreviation also makes the code more readable, i.e.
multiaddr
->ma
.
I have to say that I strongly disagree with this (consistency good, but extra-short abbreviations not so much). In fact, I believe there is some literature on this — I found this paper in a very quick search this morning: https://www.academia.edu/13557160/Whats_in_a_name_A_study_of_identifiers . It indicates that short 1-2 letter abbreviations reliably perform very poorly on comprehension against both full words and longer abbreviations (e.g. crypto
might be a useful abbreviation for cryptography
, but not c
or cr
). Moreover, it shows full words perform significantly better than any kind of abbreviation among groups that are traditionally less represented or who tend to report lower confidence in their work (even if it is objectively measured by others to be just as good) like women and minorities. You also won’t have to look very far to find conference talks and blog posts about how contributing to open-source can be intimidating. Most things that reduce that barrier should be considered effort well-spent.
That should be a big deal for a project that is: a) open source, valuing contributors of varied familiarity and expertise with the codebase and b) attempting to encourage more people who are not traditionally well represented in CS to contribute (at least I know this is the case for Protocol Labs, and I think it is the case for this project). Less abbreviation (especially avoiding extra short abbreviations like ma
) increases people’s confidence in their understanding and ability to contribute, and that is really important. I know a lot of people feel differently about this, but longer, fuller names really do make a project (especially a big one with many packages and components) easier to understand and contribute to.
Consistency is important too, but I’d also love to point out that enforcing consistency in very short abbreviations (which tend to be relatively arbitrary) is pretty tough in practice (in my experience, at least), while simply using the package name tends to be an easy rule for everyone to follow when creating new files or updating existing ones.
But another aspect is all the
multiaddr
andlibp2p
packages that mirror stdlib packages and howgo
interprets them, especially when they have-s
in the name
Totally agree with the need to be explicit on naming these. It feels like, after the issues were pointed out there’s general agreement on this one :)
The way I have typically suggested/encouraged/enforced on naming (depending on my role in a project) is this (from another project I work on):
Avoid abbreviations. Abbreviations tend to overlap a lot between domains, so they can be very ambiguous and confusing. Things that are primarily referred to by abbreviation in common language are ok to abbreviate, e.g. “URL,” “FAQ,” “ID,” “crypto,” but avoid “utils,” “idx,” etc. Single-letter names are ok in contexts where their value is totally arbitrary, e.g.
function compare (a, b)
, or in the rare places where their use is normal, e.g.x/y/z
in math functions referring to points.
short 1-2 letter abbreviations reliably perform very poorly on comprehension against both full words and longer abbreviations
I'll also just add that this lack of comprehension can lead to serious code problems, including security issues. If you dig through the Heartbleed fix in OpenSSL you'll notice that the bug was due to someone using one two letter macro when they actually meant to use a different two letter macro :(
😱 I had no idea that was part of the problem!
So, I'm closing this issue as we're getting a bit off-topic from its original premise, as it turns out we do need to name the packages because of how Go handles import paths. From what I've read here I think we all agree on:
go-
or go-ipfs-
prefixes can also be included there.The major difference I see is when (or whether) use shorter package names than the originals, but I think that that can be studied on a case by case basis in different issues where we can analyze the circumstances of each case in particular.
(Feel free to re-open this if you don't agree with any of the previous points and want to continue the discussion.)
Regarding the Heartbleed vulnerability (AFAIK) the main cause was a missing check in the server that accepted any payload length delivered by the client. Yes, there are a lot of awfully named macros like n2s
(and related) that definitely don't help, but in this case they didn't cause the bug.
Just adding a thought here which I think hasn't been brought up yet. There's a subtle difference between importing:
go-
, including hyphens, an illegal character for package names, therefore inherently producing a mismatch between the last import token and the package name).For 1-3, explicitly naming import aliases reduces ambiguity, surprise factor, and cognitive load.
For 4, for some go devs it increases cognitive load, produces misaligned import blocks (aesthetics), appears anti-customary, causes surprise, and plays badly with tooling (many tools will remove redundant aliases, or warn you).
Where this note is coming from: in the libp2p project, we recently consolidated interfaces into https://github.com/libp2p/go-libp2p-core/ and we took the opportunity to align package names with import paths so that we can drop redundant import aliases.
Compare:
with
The former is much neater than the latter, while preserving all fidelity.
No one is disagreeing with those points @raulk, it just isn't the case for the ipfs go packages. So the convention still stands to keep all the named imports of ipfs packages to aid in clarity. Granted it also doesn't apply to libp2p packages outside the scope of the go-libp2p-core package...
As suggested here.
The most problematic thing is that the aliases used are not homogeneous. I'll be eliminating some of those aliases during the work of https://github.com/ipfs/go-ipfs/issues/5050, although I don't think it's relevant to eliminate all of them now from the code it's something that should be taken in consideration for future contributors (maybe this could be added to the Go contribution guidelines).
I understand that it is useful to rename packages in order to remove redundant prefixes like
go-
, but if that was the original purpose we should document it as a special case, was there any other reason to rename imports?