JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

consistent underscore naming policy #1539

Closed StefanKarpinski closed 11 years ago

StefanKarpinski commented 11 years ago

This is kind of a blanket issue that intersects with API cleanup and naming consistency, but we need to figure our shit out regarding when to underscore and when not to underscore. One possibly controversial proposal (that I'm not really sure I favor) would be using underscores consistently, which turns predicates in particular, like isempty, into the much more annoying is_empty, but that could be addressed by adding support for names that end in ? and writing empty? instead. Another proposal would be that names in base, as the basic "grammar" of Julia, don't use underscores, whereas things in packages and user code do. Or we just go fully Germanic and don't use underscores ever. Or we can just keep on being completely inconsistent.

johnmyleswhite commented 11 years ago

I actually like having underscores everywhere, but also think that ? would be a great idea.

While we're on the topic of consistency in names, did we ever standardize on push! and similar examples?

StefanKarpinski commented 11 years ago

No, but I think we should. I'll make an issue for that.

StefanKarpinski commented 11 years ago

Oh, wait, we already have one: #907.

staticfloat commented 11 years ago

I'm ambivalent on underscores; I have no great love or hate of them, I could go either way.

For the question-mark syntax, since we need parens to call a function, are you suggesting the following:

if empty?( a )
  println( "a is empty!" )
end

StefanKarpinski commented 11 years ago

Yeah, that's the idea. It would make empty? a valid identifier for all purposes, but the convention would be only to use it for functions that are predicates. Not entirely sold on the idea, but it's a possibility.

JeffBezanson commented 11 years ago

I'd rather not, since we already allow the syntax a?1:0. I would also personally find is_empty very annoying, as would matlab users. I believe there would be too many other functions affected as well. Would we say flip_lr, cum_sum, fft_shift, etc.? Being 100% consistent here is not realistic IMO.
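
For reference, a minimal sketch of the conflict, assuming a current Julia release (1.0 or later), where the ternary operator has to be written with spaces around ? and :. The compact a?1:0 spelling quoted above was accepted at the time of this thread. The variable names x and y are only for illustration.

```julia
a = true

# Ternary operator, written with the spaces that Julia 1.0+ requires.
x = a ? 1 : 0

# If `a?` were a legal identifier, the compact spelling `a?1:0` would be
# ambiguous: either `a ? 1 : 0`, or the name `a?` followed by the range `1:0`.
y = isempty(Int[]) ? "empty" : "not empty"
```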

johnmyleswhite commented 11 years ago

I agree that isempty and cumsum aren't bad, although I think flip_lr and fft_shift are actually much better.

Maybe there's an intermediate compromise where there's an accepted set of prefixes in the way that ! is a suffix? in and cum seem like legitimate prefixes to me, while fft doesn't.

JeffBezanson commented 11 years ago

I like the idea of standard prefixes, so you can predict whether there is an underscore. The only wrinkle is that it's confounded by established names like strerror, getenv, and in the case of matlab fftshift.

jacg commented 11 years ago

I would like to register my support for allowing '?' as a constituent character of identifiers. Someone else suggested that we be obliged to separate any preceding name from the '?' in the ternary operator. Unlike the proposer, I'm not convinced that this is, in and of itself, a good thing to impose, but I do think it is a price worth paying for the ability to name predicates with a trailing '?'.

StefanKarpinski commented 11 years ago

Another fact that pushes me in favor of using trailing ? for predicates: we are currently pretty inconsistent about the is prefix, probably because it's linguistically awkward a lot of the time. For example, contains is a predicate, but doesn't begin with is; we could write iscontained, but that would be super awkward, imo. The same goes for has which is also a predicate. The list goes on. First off, contains and has should probably be merged into in, but then that still doesn't have an is prefix. I guess we could do iselem, but that's pretty awkward too. I'd rather call it in? or elem?
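
The inconsistency described here can be sketched with predicates that exist in Julia 1.x Base. This is only an illustration of the naming patterns, not a record of what was decided in this thread; the variables xs, d and the result names are made up for the example.

```julia
xs = [1, 2, 3]
d = Dict("a" => 1)

r_is     = isempty(xs)             # `is` prefix, built from an adjective
r_in     = in(2, xs)               # no prefix; also usable infix as `2 in xs`
r_has    = haskey(d, "a")          # `has` prefix, built from a verb
r_occurs = occursin("lo", "hello") # verb form, no prefix at all
```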

johnmyleswhite commented 11 years ago

Since we're invoking intuitions from natural language, it's pretty clear that English predominantly makes questions by using tone at the end of sentences, which is what the ? gets at. That makes its use natural for an English speaker, whereas the is_ prefix (which I nevertheless like) is going to fail in every case where the English sentence would contain "does" instead of "is".

Another perspective is that predicates derived from adjectives will work with is, but predicates from verbs need does. Stefan's contains example is one case of this problem.

JeffBezanson commented 11 years ago

I don't think it's necessary for every boolean function to be named consistently.

sth4nth commented 11 years ago

Why not camel case? Functions begin with lowercase letters, types begin with uppercase letters: isEmpty vs Matrix. Underscores can produce really long names. I don't want to name my "linear regression with automatic relevance determination prior, using empirical Bayes, fitted by EM" function lin_reg_ard_eb_em()

diegozea commented 11 years ago

An exclamation mark ! when you are going to modify the data and a ? for functions that ask things looks very easy to understand and use, even for Spanish speakers like me. (And I like the idea of something consistent, but it can be awkward to introduce an extra rule for the ternary operator.)

In Spanish, we grammatically use two signs, one to open ( ¿ ¡ ) and one to close ( ! ? ). But English is so popular, and using only a trailing sign is faster and more comfortable, so we tend to use the English notation in colloquial writing now. Having a prefix instead of a suffix has a clear usability benefit when you are using tab for auto-completion or if you forget the function name (but you know it is some kind of question about the data) [ this is common for me, because I switch between programming languages and because English is not my native language ] [ for ! the end is a good place, because some functions can have a version with and without the ! ]

Using the ¿ ¡ symbols at the beginning is not going to collide with anything... But could it be awkward for native English speakers?

fill!()
empty?()

?empty()

¿empty()

The last one looks good to me, because I'm a Spanish speaker. But I'm comfortable with the first two.

P.S.: I think that the dequeue functions need to be consistent with the ! rules [ https://groups.google.com/forum/?hl=es&fromgroups=#!topic/julia-dev/wKIlMaSj0YE ]

johnmyleswhite commented 11 years ago

As you said, opening ¿ ¡ are clearly on their way out of the layperson's Spanish language. So I don't think adding them would help Julia.

I do agree that we should strive for greater consistency in naming functions that modify their inputs.

I would like to add ?, but that interferes with the ternary operator.

-- John

On Jan 6, 2013, at 5:23 PM, diegozea notifications@github.com wrote:

StefanKarpinski commented 11 years ago

Trailing ? can be dealt with pretty easily – an identifier immediately followed by ? can just be parsed as an identifier. Putting the ? in the ternary operator right after an identifier is kind of a readability snafu anyway (although we do it a few times in base, that can easily be fixed); technically, it would be a breaking change, however.
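
A toy sketch of that rule, assuming a regex-based tokenizer that is far simpler than Julia's real parser. The names TOKEN and tokenize are hypothetical and exist only for this illustration; the point is that an identifier absorbs a directly trailing ?, which leaves the spaced ternary alone but changes the meaning of a?1:0.

```julia
# Hypothetical lexer rule: an identifier may absorb one directly trailing `?`.
# Alternatives, in order: identifier (optionally with `?`), integer, any other
# single non-space character.
const TOKEN = r"[A-Za-z_][A-Za-z0-9_!]*\??|\d+|\S"

tokenize(src) = [String(m.match) for m in eachmatch(TOKEN, src)]

predicate_call = tokenize("empty?(a)")  # `empty?` becomes a single identifier
spaced_ternary = tokenize("a ? 1 : 0")  # unaffected: `?` stays its own token
tight_ternary  = tokenize("a?1:0")      # the breaking case: `a?` is one name
```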

Keno commented 11 years ago

Do we have an issue open for ? at the end of function names?

johnmyleswhite commented 11 years ago

I don't believe we do. We should add one.

vtjnash commented 11 years ago

What was the decision on this? I recall that we would continue being inconsistent, preferring underscores for most names, but dropping them for short, common names like isempty and established names like getenv. If so, this issue can presumably be closed.

StefanKarpinski commented 11 years ago

The general thinking, discussed with Jeff the other day, is that names without spaces are in some sense "atomic" or "indivisible": they are the words from which sentences are formed. Functions with spaces (i.e. underscores) are conceptually sentences, e.g. frob_the_foo, a function that frobs a foo. Base should only provide atomic functionality, i.e. words, not full sentences, so rather than providing a frob_the_foo function, it should provide a generic frob verb which can be applied to all kinds of things. So the basic takeaway is that base should generally not have underscores, but in user code they are fine, since wrapping a "sentence" up as a reusable function is a good thing.
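
A minimal sketch of the "words vs. sentences" idea using multiple dispatch. Everything here is hypothetical: the types Foo and Bar, the verb frob, and the helper frob_everything are invented to mirror the frob_the_foo example and are not part of Base.

```julia
# Hypothetical types standing in for "all kinds of things".
struct Foo end
struct Bar end

# One generic "word": a single verb with a method per type, instead of
# separate frob_the_foo / frob_the_bar functions.
frob(::Foo) = "frobbed a Foo"
frob(::Bar) = "frobbed a Bar"

# A "sentence" composed in user code, where an underscore is fine.
frob_everything(items) = [frob(x) for x in items]

results = frob_everything([Foo(), Bar()])
```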

StefanKarpinski commented 11 years ago

To that end, we have been endeavoring to eradicate underscores from base, either by breaking them down into more basic pieces or by not exporting them at all.

toivoh commented 11 years ago

How about e.g. Base.show_unquoted? I've been thinking that it should be exported from Base, but I'm not sure how to do that in an underscore-free way in accordance with the above.

StefanKarpinski commented 11 years ago

This doesn't have to be set in stone, but I've been finding that looking for underscores is an awfully good way to find the smellier parts of base.

StefanKarpinski commented 11 years ago

I think we've done a fair enough job of this for 0.1 so I'm retagging this as 0.2.

wlbksy commented 11 years ago

As there aren't many Boolean functions in Julia, I think it's a good idea to make the question mark a prefix.

This would benefit people who rely on autocomplete to help them remember which Boolean function they should use, e.g. ?contains or ?empty.

The parser should treat ? as the "ternary operator" only when it has a space both before and after it.

pao commented 11 years ago

@wlbksy that was #1910. We decided not to do that.

JeffBezanson commented 11 years ago

each_line and each_match have been de-underscored.

Another idea: parse_int to parseint, and remove parse_bin, parse_hex, and parse_oct. No need for special names for 3 bases.
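
The "no special names per base" idea can be illustrated with the spelling used in Julia 1.x, which is parse with a base keyword rather than the parseint name proposed here. The result names below are only for the example.

```julia
dec = parse(Int, "42")               # decimal by default
hex = parse(Int, "ff", base = 16)    # replaces a dedicated hex parser
oct = parse(Int, "777", base = 8)    # replaces a dedicated octal parser
bin = parse(Int, "1010", base = 2)   # replaces a dedicated binary parser
```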

ViralBShah commented 11 years ago

How about renaming findn_nzs to findnz? I came up with the original name and I have hated it every single time I have to use it.
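
For context, a short sketch of findnz as it exists in the SparseArrays standard library of Julia 1.x; at the time of this thread the function lived in Base. The matrix and variable names are made up for the example.

```julia
using SparseArrays

# 2×2 sparse matrix with stored entries at (1,1) and (2,2).
A = sparse([1, 2], [1, 2], [10, 20])

# findnz returns the row indices, column indices, and values of stored entries.
rows, cols, vals = findnz(A)
```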

JeffBezanson commented 11 years ago

Go ahead. Any underscore removal is usually good.

JeffBezanson commented 11 years ago

I think the "policy" is resolved and now this is just normal renaming over time.

pauljurczak commented 11 years ago

This is probably way too late, but I'm new to Julia: https://groups.google.com/forum/#!topic/julia-users/IvdFNaaV1qA