Open idlehands opened 8 months ago
I have requested @GeoffreyPS also weigh in.
On its face I don't think the library should be extended in this way -- it seems like an invitation to complexity with less gain. I'm keeping an open mind about this, but I don't think the juice is worth the squeeze here.
What circumstances would a caller have two lists of structs to compare that they could not either:
Enum.sort_by/3
For unordered lists, instead of a unique_identifier
could it be open to a caller to pass in a sorting function? We could provide a mapper function (the default being a sort by id
, :asc
). It would be a familiar API, ease implementation here, and make it a little more flexible for developers who find themselves needing that.
If an assertion error occurs deep in a list of comparisons, how would one expose exactly which map-like broke the test? This may be more difficult to implement with functions on sort_by/3
as suggested above and may become easier if the API only exposes unique_identifier
. This is probably the main place where I think having dedicated library functionality could be a desirable improvement over a hand-rolled solution.
@GeoffreyPS thanks for pushing back on the idea. It's very easy to want to bloat the library with every possible use case. That said, I still think this might have a place in the library. Here are my thoughts:
The most important change that we made when we upgraded assert_values_for from being a code snippet that was copied around to a library was that we changed it to stop breaking at the first mismatched value, preferring to instead aggregate all of the information and then present it in a comprehensive way. This allows the error to convey everything needed to understand the issue at the first failure instead of forcing someone to work all the way through each mismatched field.
Before I wrote this RFC, I actually played around with a locally implemented wrapper. I honestly thought I had included it in this RFC, but I was jumping around and apparently totally failed to do so. Here is what I did:
defp assert_values_for_in_list(opts) do
expected_list = opts |> Keyword.get(:expected_list)
actual_list = Keyword.get(opts, :actual_list)
assert length(expected_list) == length(expected_list)
fields = Keyword.get(opts, :fields)
ordered? = Keyword.get(opts, :ordered, true)
identifier_key = Keyword.get(opts, :identifier_key)
for {expected, index} <- Enum.with_index(expected_list) do
actual =
if ordered? do
Enum.at(actual_list, index)
else
Enum.find(actual_list, fn actual -> Map.get(actual, identifier_key) == Map.get(expected, identifier_key) end)
end
assert_values_for(
expected: expected,
actual: actual,
fields: fields
)
end
end
It is for the ordered version of the lists. I would have left well enough alone, using this as a local assertion helper, copying it into codebases as needed, but my use case highlighted right away that there are two ways that a list of things can fail: the order is wrong OR the data is mostly correct but individual fields aren't correct.
With the above code, the the assertions stop as when there is a mismatch and only return the information about the the two map-likes that didn't match, removing any other context.
With the unordered version, there is different, but just as relevant context lost.
Possible Alternative Approach
While writing this, though, I did realize that there is an intermediate step that could be taken: we could add an optional param that tells assert_values_for/1
to return the ExUnit.AssertionError
instead of raising it. It would default to current behavior of just raising. This would allow a custom wrapper to get all of the information and present it. Admittedly, it will be REALLY verbose. This might allow development of additional features in individual codebases without needing to extend the library. Kind of like the original function, perhaps this will help with experimentation and we may end up with something already well defined to add if we decide to.
On that note, it might be worth considering creating a new error that stores the data more granularly (it's done that way before converting to the message currently) and then has a message function that returns things as currently formatted. But I think this would be a nice evolution, not the starting point.
I'm glad you pushed back, @GeoffreyPS! I think this is a discussion worth having.
I see what you mean now in terms of different fail states depending on whether the input list is sorted/unsorted and how the library could better return meaningful errors without need of repeating the test iteratively to suss out what the underlying error truly is. The simplifying solution I suggested--wrapping in a list comprehension and having the caller manage this--still does fall prey to this issue (so maybe simplify tests? 😉).
I think if this library is intended to make complex assertions possible and more understandable, then reducing context lost here makes more sense to me than it did when I first gave the idea consideration.
I really like the idea of providing an option to return the ExUnit error instead of raising. I think this could provides a lot of flexibility on the library and could potentially make the output implementation a bit easier to maintain. It does change the way I see how this library can be used though. Instead of asking "are these two things functionally the same?" we can also ask "what does the presence of these assertion errors mean?"
@GeoffreyPS I think we can be a little hand-wavy about how we see the library. perhaps we mark this as a alpha/beta feature or don't highly publicize it until we are sure that it really is the direction for things.
@lmarlow I think we've agreed to start with adding the ability to return errors instead of raising them. Do you want any input on this before we head in that direction?
I have been playing around with the idea of extending CheckerCab to use the same assertion system for lists of map-likes (map, struct, ecto schema). Overall, I don't think it's too hard, but coming up with a clear API is the challenge. I have thrown out a couple passes on what it could look like, but am open to new ideas as well as different terms.
option 1 - new function
The main question is whether or not it should be a new function or an extension of the behavior of the existing function. I'm leaning towards a new one, though naming, as usual, is hard.
option 2 - add an optional key to existing function
Alternatively, we could add a new key to the existing function. I'm concerned that this might leave it hard to distinguish the behavior.
I would love feedback and any other suggestions.