icerpc / slicec

The Slice compiler library
Apache License 2.0
13 stars 5 forks source link

Should we Compare Errors by String Instead of by Struct? #295

Open InsertCreativityHere opened 1 year ago

InsertCreativityHere commented 1 year ago

Currently, all our tests look like this:

// Arrange: set everything up and write some Slice.
let slice = "some slice definitions";

// Act: try to parse the slice and get the errors.
let diagnostic_reporter = parse_for_diagnostics(slice);

// Assert: create a list with all the errors we expect to get, then check that the parsing errors equal this list.
let expected = [Error::new(ErrorKind::SomeKindOfError)) ...];
assert_errors!(diagnostic_reporter, expected);

Internally, we store errors in an Error struct, which holds an ErrorKind enumerator describing the error (and usually holds additional information about it).

The issue is that we use struct equality. We directly compare 2 instances of Error to make sure they're the same. This only tests that our internal representation for errors is correct.

All that really matters is what the user will see in their console/logs: the actual error messages. So that's the behavior we should be checking in my opinion. So we should be comparing the structs against the expected error messages.

Real World Example

If you're reviewing the tests, this looks very reasonable. A user messed up the namespace attribute on a module.

ErrorKind::InvalidAttribute(cs_attributes::NAMESPACE.to_owned(), "module".to_owned()),

But in reality, this will emit the following error: attribute 'namespace' cannot be used on 'module' which is totally bogus.

Using struct comparison makes it impossible to tell if the messages are correct or not. This is a non-issue if we compared them to the expected messages instead.

All of our error messages could be broken right now, and we don't have a single test to catch them.

InsertCreativityHere commented 1 year ago

@externl thinks that both things are worth testing. We should keep these tests as is, and add new tests that only check that the error messages are correct.

That's also fine with me. As long as we test the messages somewhere that's good enough for me.

Buttt I think that these checks are simple enough that putting them together in a single test seems easier : v)

ReeceHumphreys commented 1 year ago

The issue is that we use struct equality. We directly compare 2 instances of Error to make sure they're the same. This only tests that our internal representation for errors is correct.

This is not correct. We currently compare the generated error messages against eachother. This makes sure that included Errors and Warnings that take data contain the correct data in their messages

https://github.com/zeroc-ice/icerpc/blob/a4f9f6f1dec33c431872a18c7896e2a3ce7a0294/tests/helpers/macros.rs#L29-L31

For example:

    // Arrange
    let slice1 = "
        encoding = 1;
        module Test;

        class C
        {
        }
    ";
    let slice2 = "
        encoding = 2;
        module Test;

        interface I
        {
            op(c: C);
        }
    ";

    // Act
    let result = parse_from_strings(&[slice1, slice2], None).err().unwrap();

    // Assert
    let expected = Error::new_with_notes(
        ErrorKind::UnsupportedType("C".to_owned(), Encoding::Slice2),
        None,
        vec![Note::new("file encoding was set to Slice2 here:", None)],
    );

The behavior we are testing here is that if you use a class in an encoding 2 file, that we add an ErrorKind::UnsupportedType error to the diagnostic reporter which will contain the data necessary to show the user where their mistake is. The language used in the error message is more of an "implementation detail" in my opinion.

Edit:

Additionally, it should be noted that in our C# tests when we are testing that exceptions are thrown we are not testing the messages or contents of each exception, just that the appropriate exception is thrown. While not exactly the same situation I think the general idea of our tests for Errors and Warnings is to do the same.

InsertCreativityHere commented 1 year ago

A simple question to illustrate my point. What's worse: 1) We correctly emit an "UnsupportedType" error but we incorrectly print filee enc0ding muSt_be Slice37! 2) We incorrectly emit an "FooBar" error but correctly print type 'varint32' is only supported by Slice2

Obviously: 1 is an unacceptable bug, and 2 is perfectly fine. We emitted the correct error message after all. Hence: The most important behavior to test is "the error message is correct". The error type itself is kind of unimportant (in-so-far as generating the correct message).

Additionally: 1) Testing "the error message is correct" practically guarantees the type is correct. If you get file encoding can't be 3 what else could it come from, except UnsupportedEncodingError? 2) Testing "the type is correct" provides no guarantee whether the message (or it's data) is correct. I see an UnsupportedEncodingError, but who knows, maybe it holds "encoding: 2" when 2 is valid?

So testing the error message gets us free validation without any extra tests to maintain! Testing just the error types is fundamentally an incomplete test.

InsertCreativityHere commented 1 year ago

This is not correct. We currently compare the generated error messages against each other.

IMO, this is actually worse than checking via struct equality.

This makes sure that included Errors and Warnings that take data contain the correct data in their messages.

As I understand it, this is incorrect. We aren't checking anything useful. This approach does not: 1) Verify the error message is correct (we never check the message itself, just that it's equal to something else) 2) Verify the error type is correct (we never check the error type itself, just that it's to_string matches something else) 3) Verify the data is correct (we never check the data itself, same as above)

Maybe 2 pieces of data print the same string but are actually different: won't be caught. Maybe a copy/paste error leaves 2 error types with the same message: won't be caught. Maybe the message prints "ḃ̴̝å̶̠ď̵̯ ̶̭͆s̷͈̚ľ̸͇i̴͔̕c̴̖͌ë̷̪́" instead of "helpful syntax error": won't be caught.

Additionally, it should be noted that in our C# tests when we are testing that exceptions are thrown we...

I think it's a good analogy, but I think the focus is "what will the user interact with".

In our C# code, the users will interact with the exception type. they'll do catch (ExceptionType e) {...}, the message might be useful, but the type is what they'll catch.

For the compiler, the users will interact with the error messages they see printed on their console. They won't ever see the error type. So that's the most important thing to test.

InsertCreativityHere commented 1 year ago

Just to clarify, when I talk about testing comparing 'errors by string'. I don't mean: assert_eq!(&error.to_string(), &$expected_errors[i].to_string());

I'm talking about directly comparing it to a string: assert_eq!(&error.to_string(), "the expected error message")

In hindsight I realize both of these are technically comparing errors by string...

ReeceHumphreys commented 1 year ago

One way we could accomplish something like this would be to modify parse_for_diagnostic to return Result<(), Vec<<Diagnostic>>.

Then we could assert whatever we want on the diagnostics. Here is an example:


  // Arrange
  let slice = "
      encoding = 1;
      module Test;
      class C
      {
          a: int32,
          a: string,
      }
  ";

  // Act
  let result = parse_for_diagnostics(slice);

  // Assert
  assert!(result.is_err());

  let diagnostic = &result.unwrap_err()[0];
  assert_eq!(diagnostic.error_kind(), &ErrorKind::Redefinition("a".to_string()));
  assert_eq!(diagnostic.to_string(), "redefinition of `a`");
  assert_eq!(diagnostic.notes(), vec![Note::new(
      "`a` was previously defined here",
      None
  )]);

I still dont think this is gaining much over using the assert_errors! macro.

InsertCreativityHere commented 1 year ago

We'd use a macro to remove the boilerplate; Maybe you pass it two things: 1) An error struct that it does a direct comparison against 2) The string you expect to get from to_stringing it.

But, I still think that checking the strings is sufficient. Since redefinition of 'a' basically guarantees the error_kind == Redefinition. And just checking strings keeps the logic simpler than it is even now.

ReeceHumphreys commented 1 year ago

The current state of this issue is that we should leave the current tests as they stand and if we want to test the "stringification" of error messages then they should be separate tests.

The implement_diagnostic_functions macro requires that each ErrorKind or WarningKind has a message that will be returned by the message method. This message simply gets "bubbled up" all the way to the top level Diagnostic which implements fmt::Display trivially.


impl fmt::Display for Diagnostic {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}", self.message())
    }
}

As such, it may not even be worth testing this stringification.

InsertCreativityHere commented 1 year ago

TL;DR Some of our error messages were broken, but the tests showed all green. Hence, the tests are inadequate and we should add more to check the things that slipped through.

As such, it may not even be worth testing this stringification.

This issue is a reactionary one, I only opened it because the stringification was broken.

There's been a couple times now where we accidentally switched things around. For instance, Slice like this:

struct Foo {}
class Bar : Foo {}

would emit an error like:

class Foo cannot inherit from struct Bar     // `Foo` and `Bar` are flipped.

Or sometimes we were just passing in the wrong strings/elements entirely.

There could be others, I only found those with luck.

InsertCreativityHere commented 1 year ago

This message simply gets "bubbled up" all the way to the top level Diagnostic which implements fmt::Display trivially.

What's the point of this trivial re-implementation of fmt::Display? Can we just delete this and use the function directly?