Closed Qqwy closed 4 years ago
Not sure how dedicated I can be to doing this, but I can certainly write some draft code.
How do you envision alternate comparison functions working? For example, let's use >/2
. Would that count how many numbers occur that are larger than the current one I'm on?
So with an array of [1, 2, 3]
:
1
would have 2 numbers greater than it2
would have 1 number greater than it3
would have no numbers greater than itSince 1
has the most numbers greater than it in the list it is returned.
How do you envision alternate comparison functions working?
Ah! The better term to use would be an 'equality checking function'. The most obvious choice would be to choose &===/2
instead of &==/2
to disambugate between 1
(the integer) and 1.0
(the float). It becomes even more important when other types of inputs are used (which might have wholly different structures but should still be considered 'equal' in certain cases).
As for return value: I think it makes the most sense if Math.Enum.mode
always returns a list, which is empty when the input collection is empty, and contains all values that were the most common (which can obviously be more than one).
So some hypothetical doctest-like examples:
iex> [1, 2, 3, 4, 1] |> Math.Enum.mode()
[1]
iex> [1, 2, 3, 2, 3] |> Math.Enum.mode()
[2, 3]
iex> [] |> Math.Enum.mode()
[]
iex> [1, 1.0, 2, 3] |> Math.Enum.mode() # NOTE: In this case which format will be picked is not guaranteed
[1]
iex> [1, 1.0, 2, 3] |> Math.Enum.mode(&===/1)
[1, 1.0]
Does that make sense? :slightly_smiling_face:
Makes sense!
I've got a quick and dirty implementation that's O(n)
on my fork, but I'm working on your idea of different comparison operators.
Ran into an issue though from the official docs:
[In
Map
s] keys are compared using the exact-equality operator (===/2).
Unless you want to roll your own collectable we might not be able to cleanly implement the 'equality checking function'...
EDIT: My working branch #15
Unless you want to roll your own collectable we might not be able to cleanly implement the 'equality checking function'...
Unless I'm misunderstanding something, the fact that a difference exists between e.g. ==
and ===
means that it would be nice to give users the choice between which equality function to use.
EDIT: Ah, I see the problem you are referring to: The fact that you are using a map inside the Enum.reduce
and (have to) rely on map key equality there. Yes, this indeed is a bit of a problem. :thinking:
I agree, however the Map collectable that comes in the Elixir standard
library by default uses ===/2
.
There's no way for me to specify that it use ==/2
. The only way I can
think that we get the functionality we need is a custom collectable (hard),
or converting decimals to integers (potentially bad).
Feel free to make comments on the draft merge request if that would help.
On Mon, Jan 20, 2020, 11:03 AM Qqwy notifications@github.com wrote:
Unless you want to roll your own collectable we might not be able to cleanly implement the 'equality checking function'...
Unless I'm misunderstanding something, the fact that a difference exists between e.g. == and === means that it would be nice to give users the choice between which equality function to use.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/folz/math/issues/13?email_source=notifications&email_token=ALUI4XZUS4SX7436INVE5Y3Q6XDORA5CNFSM4KI3PHT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEJNDKTI#issuecomment-576337229, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALUI4X5BC2LLODNMYBS33MDQ6XDORANCNFSM4KI3PHTQ .
After thinking some more, let's go ahead with the current PR as-is. We can always expand it later to expand its functionality.
It might very well be the case that non-structural comparisons turn the algorithm back into a linearithmic (O(n * log(n))
) algorithm because replacing the map by e.g. a red-black tree would reduce O(1)
-inserts to O(log(n))
again.
This issue is based on a request + PR that was made previously: #10 . Since that PR had a few conceptual flaws, I'll open an issue to track this feature instead.
We should add a
Math.Enum.mode(collection, equality_function \\ &==/2)
function to the library, which returns the most common element.Important considerations:
O(n)
) by usingEnum.reduce
andEnum.max_by
(or anotherEnum.reduce
ifmax_by
does not suit all our needs).2, I'd like to support:
Enum
module do).If someone wants to pick this up, that's wonderful. If not, I'll work on it myself shortly :slightly_smiling_face: .