ManimCommunity / manim

A community-maintained Python framework for creating mathematical animations.
https://www.manim.community
MIT License

Testing process #97

Closed · huguesdevimeux closed this 4 years ago

huguesdevimeux commented 4 years ago

I am thinking about a testing process, and I have some questions:

  1. The way to actually test the videos. On #34, we discussed something like this:
import numpy as np

class Test(Scene):
    def construct(self):
        circle = Circle()
        self.play(ShowCreation(circle))

a = Test()
b = a.get_frame()         # last rendered frame, as a numpy array
d = np.load('test.npy')   # previously stored reference frame
print((b == d).all())     # True if the two frames are identical

This is far from optimal since only the last frame is tested. My idea is to test a frame every n seconds, where n would be set in a test config file (something like 0.2 s, or less). This would require modifying self.play a bit.

  2. How to store and compare these frames: We could store every previously rendered frame (an np array) in a bunch of .npy files and compare each one with the corresponding tested frame. But my idea is to do something with hashes: the previously rendered frames are hashed and stored, and when a frame is tested it gets hashed and compared with the previously rendered one (see the sketch after this list). Is this a good approach?

  3. The organization of the tests: I think we can test each manim module separately, e.g. testing all the creation animations in a test_creation.py, and then do the same for every module.

  4. The format of the test files: My idea was to do something like this, in test_module.py:

    
    class Test_AnAnimation(Scene):
        def construct(self):
            mobject = Mobject()
            hashes = self.play(AnAnimation(mobject))
            compare_with_prev_rendered(hashes)

    class Test_AnotherThing(Scene):
        def construct(self):
            ...

    class test_module():
        Test_AnAnimation(TEST_CONFIG)
        Test_AnotherThing(TEST_CONFIG)
        ...
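
For points 1 and 2, a rough sketch of the sampling-and-hashing idea could look like the following; frame_hash, compare_with_prev_rendered, get_frame and the control file name are just placeholders for illustration, not existing manim API:

import hashlib
import json

def frame_hash(frame):
    # Hash the raw pixel data of one rendered frame (a numpy uint8 array).
    return hashlib.sha256(frame.tobytes()).hexdigest()

def compare_with_prev_rendered(hashes, control_file="control_data.json"):
    # Compare the hashes collected during a run (e.g. one every n seconds)
    # against the previously stored control data.
    with open(control_file) as f:
        expected = json.load(f)
    return hashes == expected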



Thoughts?
leotrs commented 4 years ago
  1. Sounds good to me. At some point though we also have to think about the time it will take to run the tests. Currently, with two very very very minimal tests, it takes about 40s to run on Linux, and 8min to run on macOS. One advantage of the last-frame tests is speed. (Though I agree they are not optimal.) We could have some last-frame tests and some every-n-frame tests.

  2. Hashes sound good here.

  3. I'm fine with any kind of organization we choose, as long as it's clear where each test should go. Also, it seems like the ones you are thinking of are unit tests (every module, every function, every class, etc). We also need to test videos, command line arguments, end-to-end tests, etc. How to organize those?

  4. This is a good format. One nitpick though: pytest tries to test any class that starts with Test, so in your example, it will think that Test_AnAnimation is actually something it needs to test, when the actual test is test_module() (did you mean for it to be a function?). This is no biggie, but it will create a pytest warning.
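
As a side note, one pytest-standard way around that collection issue (illustrative only, reusing the class name from the example above, with the import path assumed) is to mark the Scene subclass as not-a-test:

from manim import Scene  # import path assumed for the sake of the example

class Test_AnAnimation(Scene):
    # Standard pytest attribute: opt this class out of test collection so the
    # Test* prefix does not trigger a collection warning.
    __test__ = False

    def construct(self):
        ...

Alternatively, the Scene subclasses could simply avoid the Test* prefix and leave it for the real test functions and classes.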

huguesdevimeux commented 4 years ago
  1. Oh yeah, 8 minutes :/ I will start by doing it with the last frame, and then we will see about the timing.

  2. What do you mean by unit tests? For the command line (and end-to-end?), we can use subprocess and deal with the output (a rough sketch is below this list). But concerning the videos, I don't think it's a good idea to test them, for two reasons: it would take time that we may not have (8 minutes ...), and since all of the other functionalities of manim would already have passed their tests, the video should be fine.

  3. Yup, I know, I will change that.
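
Loose sketch of the subprocess idea from point 2; the command-line flags and file names below are placeholders, not a claim about manim's actual CLI:

import subprocess

def test_cli_renders_scene():
    # Drive the CLI as a user would and check that the render finishes cleanly.
    result = subprocess.run(
        ["python", "-m", "manim", "example_scenes.py", "SquareToCircle", "-l"],
        capture_output=True,
        text=True,
    )
    assert result.returncode == 0, result.stderr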

leotrs commented 4 years ago

By unit tests I mean to design one test for every single little bit of functionality: every method, member, variable, etc, in isolation. By end-to-end I meant to design tests that would mirror the full usage of manim, from start (writing a Scene script) to finish (rendered video). So these two types of tests are somewhat complementary.
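
For instance, a unit test in that sense might check one mobject method in isolation; the snippet below is only a sketch, with the import path and API assumed rather than verified:

import numpy as np
from manim import Circle, RIGHT

def test_shift_moves_center():
    # One tiny piece of functionality, checked in isolation.
    circle = Circle().shift(2 * RIGHT)
    assert np.allclose(circle.get_center(), [2, 0, 0])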

I think we should find a way to test videos however possible, since that's the whole point of manim. Though I agree with you that we can offload a lot of that work into testing other functionalities, right up to the point before we render a video.

I don't think that a long Travis run should be a deterrent to writing tests. The workflow for writing a PR should be as follows:

  1. dev branches off master and works on PR
  2. dev tests locally using only those tests that are relevant to the PR
  3. when done, dev runs full test suite locally
  4. if 3. succeeds, dev makes PR

Only step 4 will take 8 min, and only because Travis is running tests on different platforms.
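
As a concrete illustration of steps 2 and 3, a dev could do something like the following locally; the test file name is only an example of the per-module layout discussed earlier:

import pytest

pytest.main(["tests/test_creation.py"])  # step 2: only the tests relevant to the PR
pytest.main([])                          # step 3: the full local test suite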