devonestes / assertions

Helpful assertions for ExUnit
MIT License

Sketch of changeset assertions #9

Closed: marick closed this issue 4 years ago

marick commented 4 years ago

We discussed this a bit on the Elixir Forum. The tests for the assertions I've found useful are in https://github.com/marick/crit19/blob/master/test/support/assertions/changeset_test.exs The source, with docstrings, is https://github.com/marick/crit19/blob/master/test/support/assertions/changeset.ex

There are some examples throughout the codebase.

Note: I wouldn't be surprised if more experience makes me want to change the API.

devonestes commented 4 years ago

Wow, you've got a lot of great stuff in there!

My initial thoughts for what I wanted to add were essentially assert_changed and refute_changed (along the lines of your assert_changes and assert_unchanged), and then assert_errors and refute_errors (similar to your assert_errors and assert_error_free), so the general functions are pretty much in line. I don't think assert_valid and assert_invalid are really necessary, though, since they're just aliases for assert changeset.valid? and refute changeset.valid?, and I'd prefer to use the base assertions where possible. My take is that if it's in assertions it shouldn't be just syntactic sugar, and it should have really great error messages, so there's lots of room for assert_changed, refute_changed, assert_errors and refute_errors to offer real value.

Where I ran into issues with the API was basically how to handle the inputs to give the best possible error messages, but after seeing your functions I think there's a lot of good stuff in there, and it's way better than what I came up with previously!

How would something like this work for y'all?

    @spec assert_changed(Changeset.t(), [atom() | {atom(), any()}]) :: true | no_return
    @spec refute_changed(Changeset.t(), [atom()]) :: true | no_return
    @spec assert_errors(Changeset.t(), [atom() | {atom(), String.t() | Regex.t()}]) :: true | no_return
    @spec refute_errors(Changeset.t(), [atom()]) :: true | no_return

Then you'd call them like this:

    assert_changed(changeset, [:name, :age, address: "123 Berliner Strasse"])
    refute_changed(changeset, [:num_posts])
    assert_errors(changeset, [:name, :age, address: "Must be in the United States", num_posts: ~r/be positive/])
    refute_errors(changeset, [:num_posts])

I think this is really consistent (which I value a lot in API design) and still flexible. It lets users assert that any change was made, or if they want to assert on the specific change, they can do that. Similarly they can assert that there is any error in a key or they can assert on the specific error either with an exact match (if they pass a string) or by comparing a given Regex with the error string.

If there's a failure, we'd make sure the error messages show the colored diffs (which I find super helpful). The only thing we wouldn't be able to show colored diffs for is a regex, since you can't really do a colored diff on that (which is a shame, but I get it).
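
To make that concrete, here's a rough sketch of how the dispatch inside assert_changed could look. It's only meant to illustrate the two argument shapes, not to be the actual implementation:

    # Sketch only. Assumes `import ExUnit.Assertions` and an %Ecto.Changeset{}.
    def assert_changed(%Ecto.Changeset{changes: changes}, expectations) do
      Enum.each(expectations, fn
        # a bare atom asserts that *some* change was made to that field
        field when is_atom(field) ->
          assert Map.has_key?(changes, field),
                 "Expected a change on #{inspect(field)}, but there was none"

        # a {field, value} pair asserts on the specific value of the change
        {field, expected} ->
          assert Map.get(changes, field) == expected
      end)

      true
    end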

What do you think about it? Would this be enough to replace a good chunk of what you have in your changeset assertions?

marick commented 4 years ago

I'd like the non-throwing cases to return the first argument, in support of chaining. That's actually the reason I have assert_valid, because it lets me have tests that don't actually have to bind a variable:

    test "the construction of cast and derived values" do
      BulkAnimal.creation_changeset(@correct)
      |> assert_valid
      |> assert_changes(species_id: 1,
                        computed_names: ["a", "b", "c"],
                        span: Datespan.customary(@date, @later_date))
    end

I like that better because it's fewer characters that contribute nothing but noise to the test.
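
Mechanically, all that takes is having each assertion hand back its first argument on success. A sketch, not the actual code from the changeset.ex linked above:

    # Assumes `import ExUnit.Assertions`.
    def assert_valid(%Ecto.Changeset{} = changeset) do
      assert changeset.valid?,
             "Expected a valid changeset, got errors: #{inspect(changeset.errors)}"

      # returning the changeset (rather than true) is what enables |> chaining
      changeset
    end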


Re:

assert_changed(changeset, [:name, :age, address: "123 Berliner Strasse"])

I thought of merging field-alone and field-with-value cases, but thought that might be unidiomatic. (I think at least some of the functions already work for both cases - it's just not advertised or tested.)


My dislike for refute is mild, so I can't really object to refute_changed over assert_unchanged, especially since it seems to be the convention, and I believe in adhering to convention. I've got two reasons for my dislike:

  1. I originally thought it was great that assert and refute have the same number of letters, so things line up nicely. But I noticed when I was looking quickly at tests (remember, that's what I'm optimizing for), I would look at code like this:
    assert :ok == Users.delete_password_token(remove.text)
    assert {:error, _} = Users.one_token(remove.text)
    refute retain.text == remove.text
    assert {:ok, _} = Users.one_token(retain.text)

... and scan vertically down the text after the first 7 characters, and only later realize that I hadn't noticed the refute. So I think the affordance* of assert/refute maybe doesn't work.

(*) "Affordance" refers to whether a thing signals how it should be used. You know doors that you pull when you should push? Those have poor affordance: they add a little bit of annoyance to the lives of everyone who uses them.

  2. I have a bias for function calls that match the way a native speaker* would talk. What people say isn't much like "refute_changed"; it's more like "assert_unchanged".

(*) I wonder about this. I've been trying to learn Portuguese for a while, and Portuguese has a different attitude to double-negatives than English does. For example, "Não fiz nada de errado" translates literally as "I didn't do nothing wrong." And since Elixir comes from Brazil...


I agree with the importance of colored diffs.


I think people who haven't used it underestimate the kind of "extended equality" I have in assert_fields. In the assert_field context, that means:

compute_something()
|> assert_field(field1: &odd?/1)

People are so used to thinking of tests as being just about x == y that they don't realize that "equal to an expected value" is just one predicate you can apply to the value under test. I do think making other predicates easily available helps people think about testing in a better way. But, to be honest, I don't know how many of the users of my Clojure testing library take advantage of easy testing of values against predicates.
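
The mechanism is small enough to sketch: when checking a field, a one-argument function is applied as a predicate, and anything else is compared with ==. (A sketch of the idea, not my actual implementation.)

    # Assumes `import ExUnit.Assertions`.
    defp check_field_value(actual, expected) do
      if is_function(expected, 1) do
        # a function is applied to the actual value as a predicate
        assert expected.(actual), "Predicate failed for value #{inspect(actual)}"
      else
        # anything else falls back to plain equality
        assert actual == expected
      end
    end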

marick commented 4 years ago

Here's a blog post from someone who made a testing library to help users of his Clojure Datalog library: http://sritchie.github.io/2012/01/22/cascalog-testing-20/

He uses the "extended equality" feature a fair amount by providing test-support functions like produces:

(facts
  query => (produces [[3 10] [1 5] [5 11]])  ;; true
  query => (produces [[1 5] [3 10] [5 11]])) ;; true
devonestes commented 4 years ago

So about the "extended equality", currently assertions has that, just with a different syntax. For example, you can do this today:

      first = %Nested{key: :value, list: [1, 2, 3], map: %{a: :a}}
      second = %Nested{key: :value, list: [1, 3, 2], map: %{"a" => :a}}

      assert_structs_equal(first, second, fn left, right ->
        assert left.key == right.key
        assert_lists_equal(left.list, right.list)
        assert_maps_equal(left.map, right.map, &(&1.a == &2["a"]))
      end)

Of course you could (and probably should) then extract that entire anonymous function into something with a descriptive name, but in general the ability to compose with functions is there, and I find that composing those functions is a more explicit, powerful and consistent way of going about this. In your proposed API, how would you handle composition of those functions in your "extended equality"? This style of function composition served me really well when I was working with a rather large GraphQL API, since I could write functions that asserted equality for a given node in the graph and then compose those together rather easily in all of the other assertions that I wrote, so I found this to be a very consistent and yet still flexible API to work with.

The main reason I don't prefer the API you proposed for this is that I find it confusing. That syntax has one meaning if the value is not a function (compare the value under that key for equality using ==) and another if the value is a function (apply the function to the value stored under that key). I value consistency highly in API design, so I don't really like overloading a key-value data structure like this since it's not consistent.

In general it seems like the things you optimize for (conciseness) are often at odds with the things I optimize for (consistency), so I don't think some of the things you bring up will be a fit for this particular library. For example, I definitely won't include the piping since that's not how the assertions in ExUnit work, and I want to be consistent with those since they're the basic building blocks that everyone has to use. Also, there are many assertions that simply won't work with piping (like assert_receive and refute_receive), so since it can't be applied consistently, I don't want folks to have to remember which assertions can be piped and which can't.
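
For example, assert_receive matches a pattern against the current process's mailbox. There's no inbound value at all, so there's nothing for |> to feed into it:

    send(self(), {:ok, 1})
    # the first argument is a pattern, not a value, so piping can't apply here
    assert_receive {:ok, _}, 100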

marick commented 4 years ago

OK.

I don't optimize for conciseness. I optimize for readability, where I think the most important thing for readability is elimination of words/tokens that have nothing to do with the purpose of the test. To my mind, text like &(&1.a == &2["a"]) is inadequate because (1) most of the text is noise, and (2) what that function is for is only clear after your eye backtracks to link &1 and &2 to something earlier.

devonestes commented 4 years ago

Don't get me wrong - I really appreciate your input and help here! 😄 I think it's just a philosophical difference in our approach to API design.

It's just that - for me - "readability" isn't a thing I like to optimize for in libraries since a fair amount of it is based on how familiar any given person/user is with a certain syntax/structure. What any one person finds "readable" could be the exact opposite of what another person finds "readable," you know? For application code where you have a small team and can define what "readable" is for them that's all well and good, but for libraries where there are potentially tens of thousands of different users with very different backgrounds and preferences it becomes impossible to optimize for. That's a large part of the reason people who have never worked in Erlang (or Lisp) say those languages are totally unreadable, but after working in them for a while you learn how to scan them since you've become familiar with the syntax. Same with native Japanese/Chinese/Russian speakers - their alphabet is easily readable to them (and arguably easier to read!), but very much unreadable to a non-native speaker/reader who hasn't had years or decades of practice and exposure to that syntax.

Also, that's why I really like using function composition as an API design choice since it gives you the opportunity to name that function so you can hide any of the more difficult things to read or comprehend with a descriptive name, and allows you to reuse these functions easily across the codebase. Like, in real-world code I most likely wouldn't write &(&1.a == &2["a"]), but would instead have something like:

      first = %User{name: "Jim", posts: [%Post{body: "Content"}]}
      second = %User{name: "Jim", posts: [%Post{body: "Content"}]}
      assert_users_equal(first, second)
      assert_lists_equal(first.posts, second.posts, &assert_posts_equal/2)

      def assert_users_equal(left, right) do
        assert_structs_equal(left, right, [:name, :age])
      end

      def assert_posts_equal(left, right) do
        assert_structs_equal(left, right, [:body, :user_id])
      end

Consistency, on the other hand, is a definable and measurable thing, which is why I really strive for it in library API design.

marick commented 4 years ago

I don't think readability is as subjective as you think - there's a reason there are only a few indentation styles for a given language - and I don't think consistency is as objective as you think (unless you define consistency in terms of what's easy to measure, rather than in how humans work), BUT your approach is certainly a fine one to explore and work with.

Fleck (1935) has some important things to say about both readability and consistency, if you read between the lines. https://www.goodreads.com/book/show/202695.Genesis_and_Development_of_a_Scientific_Fact

devonestes commented 4 years ago

If you're aware of any good research on the effect of syntax/formatting styles on program comprehension by developers, I'd love to see it! The studies that I've found have shown that the biggest effect on program comprehension is the familiarity of the reader with the given style (basically saying that you can most easily read what you're used to reading), but those were small studies of college students. It's a thing I would really love to see more research on, so whatever you know about I'd love to see!

marick commented 4 years ago

Sorry for the delay. Every study I'm aware of is of the wrong population (students) or too short to matter.

However, let me note something about "the familiarity of the reader with the given style". That's not a problem to be overcome. It's an inevitability to work with. No matter how elegant your library is, a particular team of users will want to use it in a particular way that's... just... wrong (by your standards). And they will find their wrong usage more readable. Because that's what they see every day.

Saying "you shouldn't be able to chain assertions because the base library doesn't allow that" takes choices away from users. You're saying that a team that uses |> excessively(?) in their source shouldn't be able to use it in their tests. I say: if you don't want to chain, don't. But don't stop me from doing what I want.

(Chaining assertions is a small issue, but I wanted to describe my attitude toward library design, which I think differs from yours. Which is OK!)

marick commented 4 years ago

So, for example, the question of "should assert_fields treat a function specially?" is, to me, answered by: if you don't like it, don't use it.

(Note that this assumes the error message from using it when you didn't intend to will be comprehensible enough in the relatively small number of cases where you actually intended to check that the value of a struct was &some_function/1.)

devonestes commented 4 years ago

I say: if you don't want to chain, don't. But don't stop me from doing what I want.

Yeah, I'm actually all for this! However, I'm all for it at the team/application level, since the pool of developers working on a given application is going to be relatively small, y'all can really fine-tune what you like as a team and work with that. What you like as a team may be different from what another team likes, and that's totally fine since those other teams won't need to work on your code.

However, for libraries - and especially for libraries that folks will be integrating into their applications fairly heavily - I try and go with sort of the "lowest common denominator" for the language as a whole to try and make sure that it fits with most folks. Surely there will be some percentage of folks that don't like it/don't think it's readable, but by adhering to the same style and general API design of Elixir and its built-in libraries, I think it'll have the best chance of the most folks finding it easy to use and easy to reason about.

Like, I'm all for one team deciding they don't want to use the formatter because they don't like some of the rules in there, but I do think libraries should use the formatter since that's the most widely adopted style in the community. Likewise, if a team decides they want to have different rules for Credo than the default ones then that's great (I just wrote 6 new custom checks for the team I'm currently on this week!) - but I think libraries should just play by the defaults as they're the most common in the community. As those sorts of standards and conventions emerge, I like to try and stick to them in libraries that I write.

marick commented 4 years ago

You're probably right.

I've long said that - at the project level - it's much better to go with the existing style rules than to try to change them or, worse, to ignore them.

As you point out, the same usually applies to community standards.

It does remain an open question how community standards can be corrected when they turn out to be mistaken. Because even the smartest people make mistakes, and later knowledge can suggest better approaches.

A sort-of example might be Lisp style. Early Lisp style was very imperative (lots of assignments) and very much favored explicit loops (http://cl-cookbook.sourceforge.net/loop.html). When I learned Lisp, now-common idioms like mapping a function over a collection, or reducing a collection to a value, were... weird. It didn't seem to be the way people thought in the '80s (at least for production code).

Today, Clojure style is different from the Common Lisp style of the '80s. Which raises the question of how can we both accommodate the current dominant style and allow alternative approaches an opportunity to replace the dominant style (if they prove their worth in practice).

But that's not the business of a test assertion library.