Unexpected behaviour when checking student's wrong solution

Describe the bug We tested our reference solution by introducing different mistakes that a hypothetical student could make in their implementation. The student is expected to write a function square(a) that takes a list a as input, squares each element individually, and returns a list of the squared elements. Our reference solution is:

import pybryt

def square(a):
    res = []
    pybryt.Value(res,
                 name='empty_list',
                 success_message='SUCCESS 1: Great! You start with an empty list.',
                 failure_message='ERROR 1: Hmmm... Did you define an empty list before the loop?')
    for i in a:
        i_squared = i**2
        pybryt.Value(i_squared,
                     name='i_squared',
                     success_message='SUCCESS 2: Amazing! You are computing the squares of individual elements.',
                     failure_message='ERROR 2: Please check if you compute the squares of individual elements?')

        res.append(i_squared)
        pybryt.Value(res,
                     name='appending',
                     success_message='SUCCESS 3: Wow! You are appending the squared elements.',
                     failure_message='ERROR 3: Oops... Please check if you are appending the individual elements?')

    return res

pybryt.Value(square([-555, 13, 57, 0, 1, 2, -44]),
             name='final',
             success_message='SUCCESS 4: Your final solution is correct.',
             failure_message='ERROR 4: The final solution is wrong.')

The student's wrong solution is:

def square(a):
    res = []
    for i in a:
        if i < 0:
            i_squared = -i  # mistake introduced (-i instead of i**2 for negative elements)
        else:
            i_squared = i**2

        res.append(i_squared)

    return res

When we check the student's implementation, PyBryt reports that the solution is wrong, as we expected. However, the feedback messages are puzzling. More precisely:

with pybryt.check(reference(2)):
    square([-555, 13, 57, 0, 1, 2, -44])

REFERENCE: reference-2
SATISFIED: False
MESSAGES:
  - SUCCESS 1: Great! You start with an empty list.
  - SUCCESS 2: Amazing! You are computing the squares of individual elements.
  - ERROR 3: Oops... Please check if you are appending the individual elements?
  - ERROR 4: The final solution is wrong.

Although we moved the test list away from "the origin" by including -555 to increase the signal-to-noise ratio, the student still gets the SUCCESS 2 message. Is this a bug, or are we missing something? We do not expect 555**2 to be in the student's implementation footprint.
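To make our expectation concrete, here is a minimal plain-Python sketch of the comparison we have in mind. It does not use PyBryt and makes no claim about how PyBryt matches values internally. The helper names reference_squares and student_values are hypothetical, introduced only for this illustration.

```python
def reference_squares(a):
    """Values the reference annotates as 'i_squared' (one per element)."""
    return [i**2 for i in a]


def student_values(a):
    """Values the student's buggy loop assigns to i_squared."""
    values = []
    for i in a:
        # Same mistake as in the student's solution: -i for negative elements.
        values.append(-i if i < 0 else i**2)
    return values


data = [-555, 13, 57, 0, 1, 2, -44]
expected = reference_squares(data)
observed = student_values(data)

# Reference squares that never occur among the student's per-element values.
missing = [v for v in expected if v not in observed]
print(missing)  # prints [308025, 1936]
```

Under this simple view, two of the seven reference values (the squares of -555 and -44) are absent from the student's per-element results, which is why we expected ERROR 2 rather than SUCCESS 2.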

To Reproduce The issue can be reproduced with the 02-code-outside-functions example in the https://github.com/marijanbeg/pybryt-examples repository.

Expected behavior We expect the student to receive the ERROR 2 message instead of the SUCCESS 2 message.

microsoft / pybryt

Unexpected behaviour when checking student's wrong solution #64