Closed StefanKarpinski closed 6 years ago
Apologies if this isn't the appropriate place to mention this, but it would be nice to be more consistent with underscores in function names going forward.
No, this is a good place for that. And yes, we should strive to eliminate all names where underscores are necessary :)
For @tkelman's second point, see https://github.com/JuliaLang/julia/issues/19150
There was also a recent Julep regarding the API for find and related functions: https://github.com/JuliaLang/Juleps/blob/master/Find.md
Should we deprecate put! and take! on channels (and maybe do the same for futures) since we have push! and shift! on them? Just suggesting removing 2 redundant words in the API.
I am suspicious of shift! being user friendly. A candidate is fetch!; we already have fetch, which is the non-mutating version of take!.
ref #13538 #12469
@amitmurthy @malmaud
Edit: It would even make sense to reuse send and recv on channels. (I'm surprised that these are only used for UDPSockets at the moment.)
+1 for replacing put!/take! with push!/fetch!
I'll add renaming @inferred to @test_inferred.
Double-check that specializations are consistent with the more generic functions, i.e. not something like #20233.
Review all exported functions to check if any can be eliminated by replacing them with multiple dispatch, e.g. print_with_color
The typical pairing is push! and shift! when working with a queue-like data structure.
If we're not going to use the typical name pairing for this kind of data structure because we're worried that the operation entails communication overhead that isn't adequately conveyed by those names, then I don't think push! makes sense either. send and recv really might be better.
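For readers comparing the two naming schemes, here is a minimal sketch of both pairings. It assumes a current Julia 1.x release, where shift! has since been renamed popfirst!; the thread itself predates that rename, so the exact names below are not the ones under discussion at the time.

```julia
# Channel API: put!/take! mutate the channel, fetch only peeks.
ch = Channel{Int}(2)          # buffered channel with room for two items
put!(ch, 1)
put!(ch, 2)
peeked = fetch(ch)            # returns the first item without removing it
taken = take!(ch)             # removes and returns the first item

# Queue-like use of a Vector: push! at the back, popfirst! at the front.
# (popfirst! is the Julia 1.x name for what the thread calls shift!.)
queue = Int[]
push!(queue, 1)
push!(queue, 2)
first_out = popfirst!(queue)
```

Both snippets express first-in, first-out behavior; the open question in the thread is only whether a channel, which may involve communication, should share vocabulary with an in-memory queue.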
Maybe double-check that there is general consistency between whether functions take a tuple as the last argument or a vararg.
Perhaps too big for this issue, but it would be good to have consistent rules on when functions should throw errors, and when they should return Nullables or Unions (e.g. parse/tryparse, match, etc.)
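To make the three styles concrete, here is a small sketch written against a current Julia 1.x release. Note the assumption: at the time of this thread tryparse returned a Nullable, whereas in Julia 1.x it returns either the value or nothing, so this shows the later behavior rather than the one being discussed.

```julia
# Throwing style: parse raises an ArgumentError on bad input.
threw_argument_error = try
    parse(Int, "abc")
    false
catch err
    err isa ArgumentError
end

# "Maybe a value" style: tryparse returns nothing on failure (Julia 1.x).
good = tryparse(Int, "42")
bad = tryparse(Int, "abc")

# match follows the same style: nothing when the pattern is not found.
no_match = match(r"\d+", "abc")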
No issue too big, @simonbyrne – this is the laundry list.
Btw: this isn't really for specific changes (e.g. renaming specific functions) – it's more about kinds of things we can review. For specific proposed changes, just open an issue proposing that change.
We have a lot of tools like @code_xxx that are paired with underlying functions like code_xxx
Not sure if this is what you're talking about, but see CreateMacrosFrom.jl
Document all exported functions (including doctests)
if this is part of this, then maybe also: remember to label your tests with the issue/pr number. It makes it a lot easier to understand why that test is there. I know how git blame works, but when adding testsets (just to give an example) it's sometimes a bit of a mystery what is being tested, and it would be great if the issue/pr number was always there.
@dpsanders: and exported macros! e.g. @fastmath has no docstring.
This is very minor, but the string and Symbol functions do almost the same thing and have different capitalization. I think symbol would make more sense.
@amellnik The difference is that Symbol is a type constructor and string is a regular function. IIRC we used to have symbol but it was deprecated in favor of the type constructor. I'm not convinced a change is necessary for this, but if anything I think we should use the String constructor in place of string.
if anything I think we should use the String constructor in place of string.
No, they are different functions and shouldn't be merged
julia> String(UInt8[])
""
julia> string(UInt8[])
"UInt8[]"
No, they are different functions and shouldn't be merged
This looks like a situation where string(args...) should just be deprecated in favor of sprint(print, args...), then - having both string and String is confusing. We could specialize on sprint(::typeof(print), args...) to recover any lost performance. Along these lines, it might also make sense to deprecate repr(x) for sprint(showall, args...).
That sounds ok although calling string to turn something into a string seems pretty standard....
calling string to turn something into a string seems pretty standard
Yes, but that's where the disconnect between String and string comes in.
sprint(print, ...) feels redundant. If we get rid of string, we can rename sprint to string so we get string(print, foo) and string(showall, foo), which reads well in my opinion.
This might be a case where consistency is overrated. I think it's fine to have string(x) for "just give me a string representation of x". If it's going to be more complicated than that, e.g. requiring you to specify which printing function to use, then using another name like sprint makes sense.
It would also be ok with me to rename String(UInt8[]) to something else, and use String instead of string. string gives us a bit more flexibility in the future to change what type of string we return, but that doesn't seem likely to happen.
Does reinterpret(String, ::Vector{UInt8}) make sense at all, or is this a pun on reinterpret?
That does seem to make sense.
An issue is that this function is sometimes copying, so that name is somewhat misleading.
True, but strings are supposed to be immutable, so we can probably get away with that.
There is also a String(::IOBuffer) method, but it looks like that could be deprecated to readstring.
I've thought about your proposed API change as well, but the interface of string(a, b...) is that it stringifies and concatenates its arguments, and this would make an annoying gotcha exception for callable first arguments. If we remove concatenation from string then it could be made to work.
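The gotcha mentioned here can be seen directly. The following is a sketch against a current Julia 1.x release (an assumption; showall, mentioned earlier in the thread, no longer exists there, so show is used instead).

```julia
# string stringifies each argument and concatenates the results.
concatenated = string(1, "a", :b)

# sprint takes the printing function first, then the values to print.
printed = sprint(print, 1, "a", :b)
shown = sprint(show, "a")        # show keeps the quotes, like repr

# Because string concatenates, a callable first argument is just
# stringified too; it is not treated as "the printing function".
gotcha = string(print, 1)
```

Under the proposed rename, string(print, 1) would have to mean "print 1 to a string", which conflicts with the existing meaning as long as string also concatenates.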
Yes, agreed; consistency and avoiding gotchas is most important.
Noting issues #18326 and #3893 in the "dimension arguments" category.
If I can tack on another item: making sure the behavior of containers of mutables is both documented and consistent.
@JaredCrean2: can you elaborate on what you mean by that?
I certainly hope it doesn't involve making lots of "defensive copies".
For example, if I have an array of mutable types and I call sort on it, does the returned array point to the same objects as the input array, or does it copy the objects and make the returned array point to them?
The same objects. I'm pretty sure all our collection sorting, getindex, filtering, searching, etc. methods follow this rule, no?
I don't think there's any lack of clarity or consistency on that point – it's always the same objects.
In fact, I think the only standard function where that's not the case is deepcopy, where the whole point is that you get all new objects.
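The rule described in this exchange can be checked with a short sketch. The Box type is hypothetical and exists only for illustration; the syntax assumes Julia 0.6 or later (mutable struct).

```julia
# Hypothetical mutable type, used only for this illustration.
mutable struct Box
    value::Int
end

boxes = [Box(3), Box(1), Box(2)]
sorted = sort(boxes, by = b -> b.value)

# sort returns a new array...
same_array = sorted === boxes
# ...but its elements are the very same objects as in the input.
same_elements = sorted[1] === boxes[2]

# Mutating an element through one array is visible through the other.
sorted[1].value = 100
visible_in_input = boxes[2].value == 100

# deepcopy is the exception: every element is a fresh object.
copied = deepcopy(boxes)
fresh_elements = copied[1] !== boxes[1]
```

Here === compares object identity for mutable values, so it distinguishes "same object" from "equal contents".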
Is that documented somewhere?
No – we could but I'm not sure where it would be best to document it. Why would functions make copies unnecessarily? Where did you get the impression that they might?
Hello. I don't believe I have seen any remarks about data serialization.
Sooner or later Julia programs will be written and run publicly, and data will start to stratify, sometimes for years. Data serialization, i.e. the chain from object to bytes driven by type (maybe over JSON or ...), has to be built to be resilient over time. Semantic versioning and web APIs may count too.
Could we expect the serialization for user data to stay close to https://github.com/JuliaLang/julia/blob/v0.5.1/base/serialize.jl ?
Why would functions make copies unnecessarily? Where did you get the impression that they might?
I don't know whether they do or not. As far as I can tell, the behavior is undefined. From @JeffBezanson 's comment, there are people who advocate making defensive copies, which he opposes. So the documentation should address the question of defensive copies somewhere.
You seem to be implying some kind of least-action principle, but depending on the details of the algorithm, what is the "least-action" gets ambiguous. In order to get consistency across the API, I think more specific guidance is required.
@o314: this is an API consistency review issue, I'm not sure how serialization relates.
@JaredCrean2: whether the top-level object is copied or not does certainly need to be documented. What I'm saying is that deeper objects are never copied, except by deepcopy (obviously).
What I'm saying is that deeper objects are never copied, except by deepcopy (obviously).
There was a recent discussion about this in the context of copy for some of the array wrappers, e.g. SubArray and SparseMatrixCSC but also Symmetric, LowerTriangular. It seems to me that under the above mentioned policy, copy would be a noop for such wrapper types. Is the policy you mention the right level of abstraction here? E.g. I think it implies that if Arrays were implemented in Julia (wrapping a buffer), the behavior of copy on Arrays should then change to a noop.
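For reference, this is how copy on one such wrapper behaves in a current Julia 1.x release (an assumption; the behavior at the time of the thread may have differed). It is a sketch of the observable behavior only, not a statement of the policy being debated.

```julia
parent = [10, 20, 30]
v = view(parent, 1:2)        # SubArray wrapping `parent`

# copy on the wrapper materializes an independent Vector.
c = copy(v)
c[1] = 99
parent_untouched = parent[1] == 10

# Writing through the view itself does reach the parent.
v[1] = 7
parent_changed = parent[1] == 7
```

So copy here is not a no-op: the wrapper's underlying storage is treated as part of the top-level object and is duplicated.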
I'm starting this as a place to leave notes about things to make sure to consider when checking for API consistency in Julia 1.0.
[x] Convention prioritization. Listing and prioritizing our what-comes-first conventions in terms of function arguments for do-blocks, IO arguments for functions that print, outputs for in-place functions, etc (https://github.com/JuliaLang/julia/issues/19150).
[ ] Positional vs keyword arguments. Long ago we didn't have keyword arguments. They're still sometimes avoided for performance considerations. We should make this choice based on what makes the best API, not on that kind of historical baggage (keyword performance issues should also be addressed so that this is no longer a consideration).
[ ] Metaprogramming tools. We have a lot of tools like @code_xxx that are paired with underlying functions like code_xxx. These should behave consistently: similar signatures, if there are functions with similar signatures, make sure they have similar macro versions. Ideally, they should all return values, rather than some returning values and others printing results, although that might be hard for things like LLVM code and assembly code.
[ ] IO <=> file name equivalence. We generally allow file names as strings to be passed in place of IO objects and the standard behavior is to open the file in the appropriate mode, pass the resulting IO object to the same function with the same arguments, and then ensure that the IO object is closed afterwards. Verify that all appropriate IO-accepting functions follow this pattern.
[ ] Reducers APIs. Make sure reducers have consistent behaviors – all take a map function before reduction; congruent dimension arguments, etc.
[ ] Dimension arguments. Consistent treatment of "calculate across this [these] dimension[s]" input arguments, what types are allowed etc, consider whether doing these as keyword args might be desired.
[ ] Mutating/non-mutating pairs. Check that non-mutating functions are paired with mutating functions where it makes sense and vice versa.
[ ] Tuple vs. vararg. Check that there is general consistency between whether functions take a tuple as the last argument or a vararg.
[ ] Unions vs. nullables vs. errors. Consistent rules on when functions should throw errors, and when they should return Nullables or Unions (e.g. parse/tryparse, match, etc.).
[ ] Support generators as widely as possible. Make sure any function that could sensibly work with generators does so. We're pretty good about this already, but I'm guessing we've missed a few.
[ ] Output type selection. Be consistent about whether "output type" API's should be in terms of element type or overall container type (ref #11557 and #16740).
[x] Pick a name. There are a few functions/operators with aliases. I think this is fine in cases where one of the names is non-ASCII and the ASCII version is provided so people can still write pure-ASCII code, but there are also cases like <: which is an alias for issubtype where both names are ASCII. We should pick one and deprecate the other. We deprecated is in favor of === and should do similarly here.
[ ] Consistency with DataStructures. It's somewhat beyond the scope of Base Julia, but we should make sure that all of the collections in DataStructures have consistent APIs with those provided by Base. The connection in the other direction is that some of those types may inform how we end up designing the APIs in Base since we want them to extend smoothly and consistently.
[ ] NaNs vs. DomainErrors. See https://github.com/JuliaLang/julia/issues/5234 – have a policy for when to do which and make sure it is followed consistently.
[ ] Collection <=> generator. Sometimes you want a collection, sometimes you want a generator. We should go through all our APIs and make sure there's an option for both where it makes sense. Once upon a time, there was a convention to use an uppercase name for the generator version and a lowercase name for the version that's eager and returns a new collection. But no one ever paid any attention to that, so maybe we need a new convention.
[ ] Higher order functions on associatives. Currently some higher order functions iterate over associative collections with signature (k,v) – e.g. map, filter. Others iterate over pairs, i.e. with signature kv, requiring the body to explicitly destructure the pair into k and v – e.g. all, any. This should be reviewed and made consistent.
[x] Convert vs. construct. Allow conversion where appropriate. E.g. there have been multiple issues/questions about convert(String, 'x'). In general, conversion is appropriate when there is a single canonical transformation. Conversion of strings into numbers in general isn't appropriate because there are many textual ways to represent numbers, so we need to parse instead, with options. There's a single canonical way to represent version numbers as strings, however, so we may convert those. We should apply this logic carefully and universally.
[ ] Review completeness of collections API. We should look at the standard library functions for collections provided by other languages and make sure we have a way of expressing the common operations they have. For example, we don't have a flatten function or a concat function. We probably should.
[ ] Underscore audit.
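As an illustration of the "IO <=> file name equivalence" item in the list above, the pattern can be sketched as follows. The function name count_lines is hypothetical and not part of Base; the sketch assumes a current Julia 1.x release.

```julia
# Hypothetical function, used only to illustrate the pattern.
# The real work is done by the method that accepts an IO object.
function count_lines(io::IO)
    n = 0
    for _ in eachline(io)
        n += 1
    end
    return n
end

# File-name method: open the file, pass the IO object to the method
# above, and close the file afterwards. open(f, path) takes care of
# closing the file even if f throws.
count_lines(path::AbstractString) = open(count_lines, path)

# Usage with an in-memory IO object.
in_memory = count_lines(IOBuffer("a\nb\n"))
```

Keeping the logic in the IO method and making the file-name method a thin wrapper means the two entry points cannot drift apart in behavior.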