OdonataResearchLLC / lisp-unit

A Test Framework for Common Lisp in the style of JUnit, designed and implemented with simplicity of use in mind.
108 stars 20 forks source link

Testing destructive functions (e.g., cl:sort) destroys test data #23

Closed BernhardHumm closed 10 years ago

BernhardHumm commented 10 years ago

Consider the following test for the Common Lisp function sort:

(define-test sort-test (assert-equal '(1 2 3 4) (cl:sort '(4 2 3 1) '<)))

Remember: cl:sort is destructive. The following happens to lisp-unit test code:

lisp-unit(11): (test-code 'sort-test) ((assert-equal '(1 2 3 4) (sort '(4 2 3 1) '<))) lisp-unit(12): (run-tests :all) Unit Test Summary | 1 assertions total | 1 passed | 0 failed | 0 execution errors | 0 missing tests

<test-results-db Total(1) Passed(1) Failed(0) Errors(0)>

lisp-unit(13): (test-code 'sort-test) ((assert-equal '(1 2 3 4) (sort '(4) '<))) lisp-unit(14): (run-tests :all) Unit Test Summary | 1 assertions total | 0 passed | 1 failed | 0 execution errors | 0 missing tests

<test-results-db Total(1) Passed(0) Failed(1) Errors(0)>

You see: the invocation of (run-tests) modifies the test data '(4 2 3 1) to '(4) exactly in the way cl:sort destructs '(4 2 3 1). Hence, the second invocation of run-tests fails.

P.S. This issue is similar to issue #19

ThomasHermann commented 10 years ago

The test data is destroyed because you are performing a destructive operation on a literal object. Section 3.7.1 of the CLHS, "The consequences are undefined if literal objects are destructively modified." The lisp-unit code is not modifying the literal test data, your implementation is. When I execute your example on my implementation, I get the follow.

LISP-UNIT 2 > (define-test sort-test
                (assert-equal '(1 2 3 4) (cl:sort '(4 2 3 1) '<)))
SORT-TEST

LISP-UNIT 3 > (test-code 'sort-test)
((ASSERT-EQUAL (QUOTE (1 2 3 4)) (SORT (QUOTE (4 2 3 1)) (QUOTE <))))

LISP-UNIT 4 > (run-tests '(sort-test))
Unit Test Summary
 | 1 assertions total
 | 1 passed
 | 0 failed
 | 0 execution errors
 | 0 missing tests

#<TEST-RESULTS-DB Total(1) Passed(1) Failed(0) Errors(0)>

LISP-UNIT 5 > (test-code 'sort-test)
((ASSERT-EQUAL (QUOTE (1 2 3 4)) (SORT (QUOTE (1 2 3 4)) (QUOTE <))))

LISP-UNIT 6 > (run-tests '(sort-test))
Unit Test Summary
 | 1 assertions total
 | 1 passed
 | 0 failed
 | 0 execution errors
 | 0 missing tests

#<TEST-RESULTS-DB Total(1) Passed(1) Failed(0) Errors(0)>

LISP-UNIT 7 > 

Note that even in my implementation, the test becomes trivial after the first assertion. The proper way to implement this test follows.

LISP-UNIT 7 > (define-test sort-test
                (assert-equal '(1 2 3 4) (cl:sort (list 4 2 3 1) #'<)))
SORT-TEST

LISP-UNIT 8 > (test-code 'sort-test)
((ASSERT-EQUAL (QUOTE (1 2 3 4)) (SORT (LIST 4 2 3 1) (FUNCTION <))))

LISP-UNIT 9 > (run-tests '(sort-test))
Unit Test Summary
 | 1 assertions total
 | 1 passed
 | 0 failed
 | 0 execution errors
 | 0 missing tests

#<TEST-RESULTS-DB Total(1) Passed(1) Failed(0) Errors(0)>

LISP-UNIT 10 > (test-code 'sort-test)
((ASSERT-EQUAL (QUOTE (1 2 3 4)) (SORT (LIST 4 2 3 1) (FUNCTION <))))

LISP-UNIT 11 > 
BernhardHumm commented 10 years ago

Dear Thomas,

thank you very much for your quick reply and the workaround.

However, I still think that the behaviour observerd is not the expected behaviour when studying the lisp unit documentation ("how to use").

Evaluating (cl:sort '(4 2 3 1) '<) in a REPL always returns the correct result (1 2 3 4) A test case evaluates a form automatically and compares it with an expected result. It should behave identically to testing manually but it does not. From my point of view this is a leaking abstraction http://www.joelonsoftware.com/articles/LeakyAbstractions.html that should be avoided. It results from the way test cases are stored internally (in a table) which should be hidden from the user of the testing framework. A different internal implementation (e.g., storing test cases as macros) would result in a different behaviour. So, I would propose to either change the behaviour or explicitly document it.

In any case, thanks again for your quick reply and the workaround.

ThomasHermann commented 10 years ago

This behavior is the expected behavior when studying the lisp-unit documentation in conjunction with the Common Lisp Hyperspec. It is not a leaking abstraction. It has nothing to do with how the test is stored. In fact, the test is not stored in a table. Implementing the test using the LIST function is not a work around, it is the proper way to implement the test.

In lisp, code is data, the body of define-test is parsed to separate tags and documentation from the code. Then, the code, a list of forms, is stored as a list of forms in a unit-test object. It is not manipulated by lisp-unit in any way. The hash-table is only used for bookkeeping and could be replaced with some other organizational structure, such as an alist, without modifying the unit-test object. I have previously experimented with organizing the tests directly in the package namespace, but ran into some unexpected namespace issues with that and had to revert to using a hash-table.

Go back and look at the example I posted using a literal list in my common lisp implementation. It passes your metric that the test behaves identically whether run manually or as part of a unit test. That's because the test case evaluates a form automatically and compares it with an expected result. The only difference to distinguish the behavior you observed and the behavior I observed is that we used different implementations.

And so we get back to CLHS Section 3.7.1, "The consequences are undefined if literal objects are destructively modified." In common lisp, undefined consequences should also be understood to result in implementation specific behavior. To understand what happens when you destructively modify a literal object with SORT, follow the last line of the description of SORT, "In the case of a list, the list is destructively reordered in the same manner as for nreverse." Referring to the NREVERSE description in the CLHS:

For nreverse, sequence might be destroyed and re-used to produce the result. The result might or might not be identical to sequence. Specifically, when sequence is a list, nreverse is permitted to setf any part, car or cdr, of any cons that is part of the list structure of sequence.

Also note the side effects for NREVERSE, "nreverse might either create a new sequence, modify the argument sequence, or both."

What does this mean for a literal list object? A literal object has a very specific meaning in lisp code. It is "referenced directly in a program rather than being computed by the program." This, taken in context with how the list argument to SORT can be modified, explains the difference in behavior observed on your implementation and my implementation.

Using the LIST function to generate a new list every time the form is evaluated rather than using a literal object that gets destructively modified in an undefined way is the proper way to use SORT. You cannot expect defined behavior when destructively modifying a literal object. This is a consequence of the Common Lisp specification and has nothing to do with lisp-unit.

ThomasHermann commented 10 years ago

Here's a quick follow-up to my last comment to demonstrate why the behavior you observed sorting a literal list is a consequence of the implementation and has nothing to do with lisp-unit.

;;; Sort this

(defun literal-sort ()
  "What happens when a literal list is SORTed in a function?"
  (let ((literal-list '(4 2 3 1)))
    (print literal-list)
    (print (sort literal-list #'<))
    (print literal-list)))

SCRATCH 2 > (literal-sort)

(4 2 3 1) 
(1 2 3 4) 
(1 2 3 4) 
(1 2 3 4)

SCRATCH 3 > (literal-sort)

(1 2 3 4) 
(1 2 3 4) 
(1 2 3 4) 
(1 2 3 4)