Artemis removes stack traces from failure output #3198

Status: Open. Opened by LDAP 3 years ago

LDAP commented 3 years ago

Describe the bug

Artemis removes stack traces from failure output.

To Reproduce

  1. Let a test fail with the full stack trace as the failure message (see the sketch below):
    fail(stacktrace);
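
For concreteness, here is a minimal sketch of such a failing test, assuming JUnit 5; the helper studentCodeThatThrows() is a made-up stand-in for a call into the submission:

```java
import static org.junit.jupiter.api.Assertions.fail;

import java.io.PrintWriter;
import java.io.StringWriter;

import org.junit.jupiter.api.Test;

class StackTraceFeedbackTest {

    @Test
    void reportsFullStackTrace() {
        try {
            studentCodeThatThrows();
        }
        catch (Exception e) {
            // Render the full stack trace into a string and use it as the failure message.
            StringWriter stacktrace = new StringWriter();
            e.printStackTrace(new PrintWriter(stacktrace));
            fail(stacktrace.toString());
        }
    }

    // Placeholder for the call into the student's submission.
    private void studentCodeThatThrows() {
        throw new IllegalStateException("error in the submission");
    }
}
```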

Expected behavior

Artemis should not magically remove parts of the failure message.

Screenshots

Actual: [screenshot]

Manually replaced the tab characters in the stack trace with spaces (preventing the pattern from matching): [screenshot]

Environment

https://github.com/kit-sdq/Artemis-SAML2-Test-Docker

@dfuchss

b-fein commented 3 years ago

The lines with the stack trace are intentionally removed here: https://github.com/ls1intum/Artemis/blob/5f09c792d47a74a7215bd05bd3c909ceecfcc429/src/main/java/de/tum/in/www1/artemis/repository/FeedbackRepository.java#L188. This reduces the failure message to its most relevant part, namely the actual message.

If you replace the tabs with spaces, the lines no longer match \tat and are therefore not removed.
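
For illustration, the effect is roughly the following (a sketch only, with made-up names; the actual logic in FeedbackRepository may be more involved):

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class StackTraceFilterSketch {

    /**
     * Drops every line that looks like a stack-trace frame, i.e. a line that
     * starts with a tab followed by "at " (the \tat pattern mentioned above).
     */
    public static String removeStackTraceLines(String failureMessage) {
        return Arrays.stream(failureMessage.split("\n"))
                .filter(line -> !line.startsWith("\tat "))
                .collect(Collectors.joining("\n"));
    }
}
```

This also explains the workaround in the screenshot above: once the leading tabs are replaced with spaces, the frame lines no longer match and survive the filter.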

LDAP commented 3 years ago

Thanks @b-fein.

I am of the firm opinion that the instructor should decide how much information the students get. If I explicitly fail the test with

fail(somemsg)

I would expect exactly that message to be shown to the students. Instead, Artemis "magically" alters the string. To somebody not familiar with the internals, this looks a lot like a bug (as it did to me).

My proposed solution would be:

What's your opinion on this? @b-fein @krusche @dfuchss

krusche commented 3 years ago

Keep in mind that Artemis is mainly used by programming beginners who might have very little experience. There are basically two reasons why we decided to remove stack traces. (By the way, this is not really magical: we just remove everything after the first line. At least that was the initial implementation, which might have become a bit more complex since then.)

1) Provide beginner-friendly error messages. In my experience, stack traces and programming language error messages are hard to understand most of the time. The designers of a programming exercise should take some time to come up with explanations for error messages that students at their specific level can understand. On the other hand, good feedback should be short and should not reveal the solution. If students receive too much or too detailed feedback, they might not read it.

2) Stack traces might reveal test internals and make it easier for students to identify the solution. Assume you have a test case with a method name that reveals information about what the test does; instructors typically do not want to expose that name in the feedback.

I agree that it might be desirable for the instructor to specify how much detail is shown in the feedback, but the question is at which level. Do you want to specify this for each test case (which could be quite time-consuming), per exercise, or even per course? Also keep in mind that each configuration option makes Artemis more difficult to use.

LDAP commented 3 years ago

> (By the way, this is not really magical: we just remove everything after the first line. At least that was the initial implementation, which might have become a bit more complex since then.)

I meant "magical" in the sense that, at first glance, there was no obvious filtering (I could fail with multi-line messages), yet stack traces were cut off. At first, I assumed a bug in the test.

> 1. Provide beginner-friendly error messages. In my experience, stack traces and programming language error messages are hard to understand most of the time. The designers of a programming exercise should take some time to come up with explanations for error messages that students at their specific level can understand. On the other hand, good feedback should be short and should not reveal the solution. If students receive too much or too detailed feedback, they might not read it.

That is true. On the other hand, the instructor knows the level of their course and can create the tests accordingly. We like to provide exceptions because they give a hint on where a mistake in the program could be.

> 2. Stack traces might reveal test internals and make it easier for students to identify the solution. Assume you have a test case with a method name that reveals information about what the test does; instructors typically do not want to expose that name in the feedback.

In our tests, exceptions thrown in submissions are caught and the part of the stack trace containing test classes is cut off (see screenshot). The modified exception is then rethrown.
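
Roughly like the following sketch (illustrative only; the package prefix edu.kit.test is a placeholder, not our actual test code):

```java
import java.util.Arrays;

public class SubmissionExceptionTrimmer {

    /**
     * Removes all frames that belong to the test classes from the exception's
     * stack trace and rethrows the modified exception.
     */
    public static void rethrowTrimmed(Exception fromSubmission) throws Exception {
        StackTraceElement[] trimmed = Arrays.stream(fromSubmission.getStackTrace())
                // "edu.kit.test" stands in for the package of the test classes.
                .filter(frame -> !frame.getClassName().startsWith("edu.kit.test"))
                .toArray(StackTraceElement[]::new);
        fromSubmission.setStackTrace(trimmed);
        throw fromSubmission;
    }
}
```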

> I agree that it might be desirable for the instructor to specify how much detail is shown in the feedback, but the question is at which level. Do you want to specify this for each test case (which could be quite time-consuming), per exercise, or even per course? Also keep in mind that each configuration option makes Artemis more difficult to use.

I think the best solution is to disable filtering in Artemis completely because

dfuchss commented 3 years ago

Hello all,

First of all, both views have advantages and disadvantages, as mentioned in the posts above.

I would recommend that, no matter which solution is chosen for this issue, changes to the original message be made visible. For example, when deleting the stack trace, a line such as "[stack trace omitted]" could be appended to the new message. Then it would be clear that Artemis removed the stack trace.
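
As a sketch of that suggestion (illustrative only, extending the filtering sketch from above; not existing Artemis code):

```java
import java.util.Arrays;

public class VisibleTrimmingSketch {

    /** Removes stack-trace lines but makes the removal visible to the student. */
    public static String removeStackTraceLines(String failureMessage) {
        String[] lines = failureMessage.split("\n");
        String[] kept = Arrays.stream(lines)
                .filter(line -> !line.startsWith("\tat "))
                .toArray(String[]::new);
        String result = String.join("\n", kept);
        // Only append the marker if something was actually removed.
        return kept.length < lines.length ? result + "\n[stack trace omitted]" : result;
    }
}
```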