JuliaLang / Juleps

Julia Enhancement Proposals
Other
67 stars 24 forks source link

[WIP] Reorganize Search and Find API #22

Closed nalimilan closed 7 years ago

mbauman commented 7 years ago

Very nicely done; this is a great summary. Thanks for doing this!

One thing that would be nice to resolve is if we could remove the need for explicitly typing the predicate function as ::Function without running into ambiguous signatures. I also don't think we necessarily have to constrain ourselves to having the predicate function first. The syntax it enables ( find(A) do elt …) reads to me like it'll only operate on the nonzero elements instead of what it's actually doing: transforming elt to a test to see if its index should be included in the output. In fact, I think the reverse argument ordering might read a bit more naturally: find(A, #= where =# x -> x == 2).

If you're satisfied with this first draft, I'd be inclined to merge it and allow others to propose PRs to specific sections instead of dragging this PR out.

nalimilan commented 7 years ago

One thing that would be nice to resolve is if we could remove the need for explicitly typing the predicate function as ::Function without running into ambiguous signatures.

Right, but I'm not sure it's possible to distinguish, when two arguments are passed, whether they correspond to (pred, A)/(A, pred) or to (A, start), at least if we want to support arbitrary indices (which I think is a good idea in the long term).

Maybe a new function is needed to return a Union of all possible index types supported by a collection, which would be used to restrict the signature. Or conversely a Callable trait could be added to pred, which would never be used for indices.

I also don't think we necessarily have to constrain ourselves to having the predicate function first. The syntax it enables ( find(A) do elt …) reads to me like it'll only operate on the nonzero elements instead of what it's actually doing: transforming elt to a test to see if its index should be included in the output. In fact, I think the reverse argument ordering might read a bit more naturally: find(A, #= where =# x -> x == 2).

At least the current order is consistent with sum and mean. But I agree it feels a bit weird. I'd be inclined to choose whatever solution is simpler to implement, both to avoid ambiguities in signatures, and regarding the deprecation plan.

If you're satisfied with this first draft, I'd be inclined to merge it and allow others to propose PRs to specific sections instead of dragging this PR out.

Actually, I find it more practical to add comments to the PR itself. One it is merged, people need to file issues, which are not automatically grouped together, and which don't automatically get context like comments do. So I'd prefer continuing that way unless you prefer merging it soon.

StefanKarpinski commented 7 years ago

Should we merge this and continue discussion in the issues?

nalimilan commented 7 years ago

As I said, I find commenting on PRs quite practical, but as you prefer.

StefanKarpinski commented 7 years ago

We can always continue to comment on the PR after it's merged.

nalimilan commented 7 years ago

Could people comment on whether they prefer Proposal 1 or Proposal 2? Then I can check what deprecations needs to be done over two releases, and maybe do the (small number) of first steps in time for the 0.6 feature freeze -- so that the full plan can be implemented only via deprecations in 1.0.

nalimilan commented 7 years ago

After a closer examination, it appears the only existing methods which would change their meaning under the two proposals are the search methods. We will have to deprecate them in favor of searchseq/findseq (depending on the chosen proposal), which will prevent using the new meaning of search for some String/Char/Regex combinations for at least one release.

This is not the end of the world, since this would only be temporary (until 1.1), and since the affected methods do not actually exist in a fully generic form currently. If we adopt Proposal 1, only search(::AbstractString, ::AbstractString[, ::Integer]) (deprecated in favor of findseq) will have a different meaning from other (new) search methods. There is not generic way of looking for a sequence of elements inside a collection currently anyway. If we adopt Proposal 2, this is even less of an issue since the search methods returning an iterator do not even exist yet, and may still not be implemented in 1.0. Moreover, the only real ambiguity would be between a predicate and a string, which is easily resolved in favor of the string.

So overall we can probably leave more time for the reflection and only implement this for 1.0.