JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.73k stars 5.49k forks source link

API consistency review #20402

Closed StefanKarpinski closed 6 years ago

StefanKarpinski commented 7 years ago

I'm starting this as a place to leave notes about things to make sure to consider when checking for API consistency in Julia 1.0.

ararslan commented 7 years ago

Apologies if this isn't the appropriate place to mention this, but it would be nice to be more consistent with underscores in function names going forward.

StefanKarpinski commented 7 years ago

No, this is a good place for that. And yes, we should strive to eliminate all names where underscores are necessary :)

tkelman commented 7 years ago
ararslan commented 7 years ago

For @tkelman's second point, see https://github.com/JuliaLang/julia/issues/19150

ararslan commented 7 years ago

There was also a recent Julep regarding the API for find and related functions: https://github.com/JuliaLang/Juleps/blob/master/Find.md

shashi commented 7 years ago

Should we deprecate put! and take! on channels (and maybe do the same for futures) since we have push! and shift! on them? Just suggesting removing 2 redundant words in the API.

I am suspicious of shift! being user friendly. A candidate is fetch! we already have fetch which is the non-mutating version of take!

ref #13538 #12469

@amitmurthy @malmaud

Edit: It would even make sense to reuse send and recv on channels. (I'm surprised that these are only used for UDPSockets at the moment)

amitmurthy commented 7 years ago

+1 for replacing put!/take! with push!/fetch!

nalimilan commented 7 years ago

I'll add renaming @inferred to @test_inferred.

martinholters commented 7 years ago

Double-check that specializations are consistent with the more generic functions, i.e. not something like #20233.

dpsanders commented 7 years ago

Review all exported functions to check if any can be eliminated by replacing them with multiple dispatch, e.g. print_with_color

StefanKarpinski commented 7 years ago

The typical pairing is push! and shift! when working with a queue-like data structure.

StefanKarpinski commented 7 years ago

If we're not going to use the typical name pairing for this kind of data structure because we're worried that the operation entails communication overhead that isn't adequately conveyed by those names, then I don't think push! makes sense either. send and recv really might be better.

malmaud commented 7 years ago

Maybe double-check that there is general consistency between whether functions take a tuple as the last argument or a vararg.

simonbyrne commented 7 years ago

Perhaps too big for this issue, but it would be good to have consistent rules on when functions should throw errors, and when they should return Nullables or Unions (e.g. parse/tryparse, match, etc.)

StefanKarpinski commented 7 years ago

No issue too big, @simonbyrne – this is the laundry list.

StefanKarpinski commented 7 years ago

Btw: this isn't really for specific changes (e.g. renaming specific functions) – it's more about kinds of things we can review. For specific proposed changes, just open an issue proposing that change.

bramtayl commented 7 years ago

We have a lot of tools like @code_xxx that are paired with underlying functions like code_xxx

Not sure if this is what you're talking about, but see CreateMacrosFrom.jl

tkelman commented 7 years ago
dpsanders commented 7 years ago
pkofod commented 7 years ago

Document all exported functions (including doctests)

if this is part of this, then maybe also: remember to label your tests with the issue/pr number. It makes it a lot easier to understand why that test is there. I know how git blame works, but when adding testsets (just to give an example) it's sometimes a bit of a mystery what is being tested, and it would be great if the issue/pr number was always there.

stevengj commented 7 years ago

@dpsanders: and exported macros! e.g. @fastmath has no docstring.

amellnik commented 7 years ago

This is very minor, but the string and Symbol functions do almost the same thing and have different capitalization. I think symbol would make more sense.

ararslan commented 7 years ago

@amellnik The difference is that Symbol is a type constructor and string is a regular function. IIRC we used to have symbol but it was deprecated in favor of the type constructor. I'm not convinced a change is necessary for this, but if anything I think we should use the String constructor in place of string.

yuyichao commented 7 years ago

if anything I think we should use the String constructor in place of string.

No, they are different functions and shouldn't be merged

julia> String(UInt8[])
""

julia> string(UInt8[])
"UInt8[]"
jrevels commented 7 years ago

No, they are different functions and shouldn't be merged

This looks like a situation where string(args...) should just be deprecated in favor of sprint(print, args...), then - having both string and String is confusing. We could specialize on sprint(::typeof(print), args...) to recover any lost performance. Along these lines, it might also make sense to deprecate repr(x) for sprint(showall, args...).

yuyichao commented 7 years ago

That sounds ok although calling string to turn something into a string seems pretty standard....

ararslan commented 7 years ago

calling string to turn something into a string seems pretty standard

Yes, but that's where the disconnect between String and string comes in.

TotalVerb commented 7 years ago

sprint(print, ...) feels redundant. If we get rid of string, we can rename sprint to string so we get string(print, foo) and string(showall, foo) which reads well in my opinion.

JeffBezanson commented 7 years ago

This might be a case where consistency is overrated. I think it's fine to have string(x) for "just give me a string representation of x". If it's going to be more complicated than that, e.g. requiring you to specify which printing function to use, then using another name like sprint makes sense.

It would also be ok with me to rename String(UInt8[]) to something else, and use String instead of string. string gives us a bit more flexibility in the future to change what type of string we return, but that doesn't seem likely to happen.

TotalVerb commented 7 years ago

Does reinterpret(String, ::Vector{UInt8} make sense at all, or is this a pun on reinterpret?

JeffBezanson commented 7 years ago

That does seem to make sense.

TotalVerb commented 7 years ago

An issue is that this function is sometimes copying, so that name is somewhat misleading.

JeffBezanson commented 7 years ago

True, but strings are supposed to be immutable, so we can probably get away with that.

There is also a String(::IOBuffer) method, but it looks like that could be deprecated to readstring.

StefanKarpinski commented 7 years ago

I've thought about your proposed API change as well, but the interface of string(a, b...) is that it stringifies and concatenates its arguments, and this would make an annoying gotcha exception for callable first arguments. If we remove concatenation from string then it could be made to work.

TotalVerb commented 7 years ago

Yes, agreed; consistency and avoiding gotchas is most important.

JeffBezanson commented 7 years ago

Noting issues #18326 and #3893 in the "dimension arguments" category.

JaredCrean2 commented 7 years ago

If I can tack on another item: making sure the behavior of containers of mutables is both documented and consistent.

StefanKarpinski commented 7 years ago

@JaredCrean2: can you elaborate on what you mean by that?

JeffBezanson commented 7 years ago

I certainly hope it doesn't involve making lots of "defensive copies".

JaredCrean2 commented 7 years ago

For example, if I have an array of mutable types and I call sort on it, does the returned array point to the same objects as the input array, or does it copy the objects and make the returned array point to them?

stevengj commented 7 years ago

The same objects. I'm pretty sure all our collection sorting, getindex, filtering, searching, etc. methods follow this rule, no?

StefanKarpinski commented 7 years ago

I don't think there's any lack of clarity or consistency on that point – it's always the same objects.

StefanKarpinski commented 7 years ago

In fact, I think the only standard function where that's not the case is deepcopy where the whole point is that you get all new objects.

JaredCrean2 commented 7 years ago

Is that documented somewhere?

StefanKarpinski commented 7 years ago

No – we could but I'm not sure where it would be best to document it. Why would functions make copies unnecessarily? Where did you get the impression that they might?

o314 commented 7 years ago

Hello. I have not seen i believe any remarks about data serialization.

Soon or later julia programs will be written and run publicly, data will start to stratify sometimes, for years. Data serialization eg. the chain : object to bytes driven by type (maybe over json or ...) has to be built to be time resilient. Thinking about semantic versioning and web api may count too.

Could we expect the serialization for user data to stay close to https://github.com/JuliaLang/julia/blob/v0.5.1/base/serialize.jl ?

JaredCrean2 commented 7 years ago

Why would functions make copies unnecessarily? Where did you get the impression that they might?

I don't know whether they do or not. As far as I can tell, the behavior is undefined. From @JeffBezanson 's comment, there are people who advocate making defensive copies, which he opposes. So the documentation should address the question of defensive copies somewhere.

You seem to be implying some kind of least-action principle, but depending on the details of the algorithm, what is the "least-action" gets ambiguous. In order to get consistency across the API, I think more specific guidance is required.

StefanKarpinski commented 7 years ago

@o314: this is an API consistency review issue, I'm not sure how serialization relates.

StefanKarpinski commented 7 years ago

@JaredCrean2: whether the top-level object is copied or not does certainly need to be documented. What I'm saying is that deeper objects are never copied, except by deepcopy (obviously).

andreasnoack commented 7 years ago

What I'm saying is that deeper objects are never copied, except by deepcopy (obviously).

There was a recent discussion about this in the context of copy for some of the array wrappers, e.g. SubArray and SparseMatrixCSC but also Symmetric, LowerTriangular. It seems to me that under the above mentioned policy, copy would be a noop for such wrapper types. Is the policy you mention the right level of abstraction here? E.g. I think it implies that if Arrays were implemented in Julia (wrapping a buffer), the behavior of copy on Arrays should then change to a noop.