Avoid renaming imports - Githubissues

schomatis commented 6 years ago

As suggested here.

The most problematic thing is that the aliases used are not homogeneous. I'll be eliminating some of those aliases during the work of https://github.com/ipfs/go-ipfs/issues/5050, although I don't think it's relevant to eliminate all of them now from the code it's something that should be taken in consideration for future contributors (maybe this could be added to the Go contribution guidelines).

I understand that it is useful to rename packages in order to remove redundant prefixes like go-, but if that was the original purpose we should document it as a special case, was there any other reason to rename imports?

kevina commented 6 years ago

I do not necessary agree with this. What do others think?

Renaming a long package name that is used frequently import can lead to shorter and easier to read code.

I also don't see why this is a big deal. Standard tools work well with renamed imports and it is easy to determine an import by just looking at the import list.

schomatis commented 6 years ago

I do not necessary agree with this. What do others think?

@kevina Thanks for your feedback, yes I'd love to get more opinions to see if I'm jumping the gun here.

My main motivation is to avoid using different names in different files for the same package which makes the code harder to read (see https://github.com/ipfs/go-ipfs/issues/5055) but I would like to know if there are any other motivations to renaming packages than just making them shorter. I'm not sure if shorter names means easier to read code, at least not any name and not any variation of it, maybe we could start standardizing some of those shorter names.

kevina commented 6 years ago

@schomatis the length of the import really depends on if the import name is an important part of the name, if it is not I see it as noise and hence prefer shorter names.

I am not against some cleaning up of the code so we use the same alias for an important, especially within the same (or similar) packages.

I also agree the protobuf code is particularly bad and I won't object to cleaning it up.

whyrusleeping commented 6 years ago

I disagree with the go convention here. Since go package names arent required to match their paths, Its far clearer to explicitly name them.

Which of these is more clear?

package main

import (
    "github.com/foo/bar/baz"
)

func main() {
    cats.Meow()
}

or

package main

import (
    cats "github.com/foo/bar/baz"
)

func main() {
    cats.Meow()
}

this is a pretty severe example, but i have encountered some pretty annoying mismatches here. Even just the difference between go-foo in the path and foo.X in the code adds non-zero cognitive load.

In any case, it seems like your desire here is to make sure we use the same names everywhere, and that I 100% agree with. we should definitely always use the same name for every package (assuming no collisions)

schomatis commented 6 years ago

In any case, it seems like your desire here is to make sure we use the same names everywhere, and that I 100% agree with. we should definitely always use the same name for every package (assuming no collisions)

Yes, that's the main objective here, to have homogeneous names we could add to the documentation so the reader quickly grasps what the code is talking about when there is something like an ft. prefix lying around.

Since go package names arent required to match their paths,

Oh, I wasn't aware of that, then I'm all for explicitly naming the package, but let's try to use the package name, not a shortened version of it, e.g., the prefix ft. in place of the package name unixfs is totally meaningless to a new reader.

Mr0grog commented 6 years ago

FWIW, I definitely found a need to use different names from the ones actually used in the package when adding an example to go-ipns (Using crypto instead of ic).

I think it also makes for super confusing generated docs (e.g. https://godoc.org/github.com/ipfs/go-ipns), where the import statement isn’t included (yes, they become links on godoc.org, but you still have click them to see what they are, and they don’t become links [yet — would love help] on our docs site).

Anyway, I just want to agree with @schomatis’s comment above. Totally get explicitly naming them, but we should try and use the same names everywhere and also try not to abbreviate them too much. In my experience, readability counts for a lot more than easy typeability, especially in an open source project when you want to bring in new contributors or your docs aren’t yet perfect :)

whyrusleeping commented 6 years ago

Yeah, +1 on not using shortened names

lanzafame commented 6 years ago

Totally get explicitly naming them, but we should try and use the same names everywhere and also try not to abbreviate them too much. In my experience, readability counts for a lot more than easy typeability

Agree, though I believe a good consistent abbreviation also makes the code more readable, i.e. multiaddr -> ma. But another aspect is all the multiaddr and libp2p packages that mirror stdlib packages and how go interprets them, especially when they have -s in the name, for example:

package main

import (
    ma "github.com/multiformats/go-multiaddr"
    madns "github.com/multiformats/go-multiaddr-dns"
)

func main() {
    addr := ma.NewMultiaddr(...)
    resolvedAddr, err := madns.Resolve(...)
}

vs.

package main

import (
    "github.com/multiformats/go-multiaddr"
    "github.com/multiformats/go-multiaddr-dns" // go will interpret the word after the last hyphen as the package name
)

func main() {
     addr := multiaddr.NewMultiaddr(...)
    resolvedAddr, err := dns.Resolve(...)
}

So I find the first example to be more readable because I know that it is performing a multiaddr-dns Resolve and though we could follow this advice: https://rakyll.org/style-packages/#renames-should-follow-the-same-rules, we aren't actually 'replacing' stdlib dns in this case, it is a 'different' form of dns.

kevina commented 6 years ago

I agree with @lanzafame and would like to add that not abbreviating packages is good advice for the standard go packages as they are carefully named and designed to be part of the identifier when referring to symbols inside the package. The same can not be said for other packages. For example multiaddr.NewMultiaddr is just noise. In general overlong identifiers can hinder code readability not enhance it.

ghost commented 6 years ago

Agreed on short identifiers, and agreed on keeping identifiers explicit.

Mr0grog commented 6 years ago

Agree, though I believe a good consistent abbreviation also makes the code more readable, i.e. multiaddr -> ma.

I have to say that I strongly disagree with this (consistency good, but extra-short abbreviations not so much). In fact, I believe there is some literature on this — I found this paper in a very quick search this morning: https://www.academia.edu/13557160/Whats_in_a_name_A_study_of_identifiers . It indicates that short 1-2 letter abbreviations reliably perform very poorly on comprehension against both full words and longer abbreviations (e.g. crypto might be a useful abbreviation for cryptography, but not c or cr). Moreover, it shows full words perform significantly better than any kind of abbreviation among groups that are traditionally less represented or who tend to report lower confidence in their work (even if it is objectively measured by others to be just as good) like women and minorities. You also won’t have to look very far to find conference talks and blog posts about how contributing to open-source can be intimidating. Most things that reduce that barrier should be considered effort well-spent.

That should be a big deal for a project that is: a) open source, valuing contributors of varied familiarity and expertise with the codebase and b) attempting to encourage more people who are not traditionally well represented in CS to contribute (at least I know this is the case for Protocol Labs, and I think it is the case for this project). Less abbreviation (especially avoiding extra short abbreviations like ma) increases people’s confidence in their understanding and ability to contribute, and that is really important. I know a lot of people feel differently about this, but longer, fuller names really do make a project (especially a big one with many packages and components) easier to understand and contribute to.

Consistency is important too, but I’d also love to point out that enforcing consistency in very short abbreviations (which tend to be relatively arbitrary) is pretty tough in practice (in my experience, at least), while simply using the package name tends to be an easy rule for everyone to follow when creating new files or updating existing ones.

But another aspect is all the multiaddr and libp2p packages that mirror stdlib packages and how go interprets them, especially when they have -s in the name

Totally agree with the need to be explicit on naming these. It feels like, after the issues were pointed out there’s general agreement on this one :)

Mr0grog commented 6 years ago

The way I have typically suggested/encouraged/enforced on naming (depending on my role in a project) is this (from another project I work on):

Avoid abbreviations. Abbreviations tend to overlap a lot between domains, so they can be very ambiguous and confusing. Things that are primarily referred to by abbreviation in common language are ok to abbreviate, e.g. “URL,” “FAQ,” “ID,” “crypto,” but avoid “utils,” “idx,” etc. Single-letter names are ok in contexts where their value is totally arbitrary, e.g. function compare (a, b), or in the rare places where their use is normal, e.g. x/y/z in math functions referring to points.

mikeal commented 6 years ago

short 1-2 letter abbreviations reliably perform very poorly on comprehension against both full words and longer abbreviations

I'll also just add that this lack of comprehension can lead to serious code problems, including security issues. If you dig through the Heartbleed fix in OpenSSL you'll notice that the bug was due to someone using one two letter macro when they actually meant to use a different two letter macro :(

Mr0grog commented 6 years ago

😱 I had no idea that was part of the problem!

schomatis commented 6 years ago

So, I'm closing this issue as we're getting a bit off-topic from its original premise, as it turns out we do need to name the packages because of how Go handles import paths. From what I've read here I think we all agree on:

Explicitly name all packages when importing (even if to repeat the import path).
Always use the same package name throughout the code (except when there's a possible name clash).
Document in the contribution guidelines or the coding style any package that will have a different name from the import path so there's a clear reference. Some general guidelines like removing the go- or go-ipfs- prefixes can also be included there.

The major difference I see is when (or whether) use shorter package names than the originals, but I think that that can be studied on a case by case basis in different issues where we can analyze the circumstances of each case in particular.

(Feel free to re-open this if you don't agree with any of the previous points and want to continue the discussion.)

Regarding the Heartbleed vulnerability (AFAIK) the main cause was a missing check in the server that accepted any payload length delivered by the client. Yes, there are a lot of awfully named macros like n2s (and related) that definitely don't help, but in this case they didn't cause the bug.

raulk commented 5 years ago

Just adding a thought here which I think hasn't been brought up yet. There's a subtle difference between importing:

the root of a repo (especially if it follows the custom of being prefixed with go-, including hyphens, an illegal character for package names, therefore inherently producing a mismatch between the last import token and the package name).
a package whose name does not match the last token of the import path.
a package whose name collides with something else.
a package whose name matches the last token of the import path.

For 1-3, explicitly naming import aliases reduces ambiguity, surprise factor, and cognitive load.

For 4, for some go devs it increases cognitive load, produces misaligned import blocks (aesthetics), appears anti-customary, causes surprise, and plays badly with tooling (many tools will remove redundant aliases, or warn you).

Where this note is coming from: in the libp2p project, we recently consolidated interfaces into https://github.com/libp2p/go-libp2p-core/ and we took the opportunity to align package names with import paths so that we can drop redundant import aliases.

Compare:

with

The former is much neater than the latter, while preserving all fidelity.

lanzafame commented 5 years ago

No one is disagreeing with those points @raulk, it just isn't the case for the ipfs go packages. So the convention still stands to keep all the named imports of ipfs packages to aid in clarity. Granted it also doesn't apply to libp2p packages outside the scope of the go-libp2p-core package...

ipfs / kubo

Avoid renaming imports #5056