ManimCommunity / manim

A community-maintained Python framework for creating mathematical animations.
https://www.manim.community
MIT License
24.35k stars 1.7k forks

Rendering logic must be factored into its own class #524

Closed huguesdevimeux closed 3 years ago

huguesdevimeux commented 4 years ago

So we have been discussing splitting SceneFileWriter into separate classes, such as a Renderer class and an abstract SceneFileWriter class.

But this is still, well, abstract. Doing this would clean up the code of SceneFileWriter, which is very messy, and would make it much easier to implement new features that change the way videos are rendered, such as livestreaming (or even ModernGL, who knows ^^). Some basic methods, such as open_movie_pipe, would be overridden in subclasses.

First of all, before any concrete consideration, we have to discuss what we want each class to do. Currently, the scene updates the frame by calling Camera, which captures the mobjects. Afterwards, the scene calls (again..) SceneFileWriter and asks it to write the frame. (There is no interaction between Camera and SceneFileWriter, which is .. weird.)

What we could do is this: Scene processes the mobjects (i.e. sorting them, etc.) and only calls Renderer to update the frame. Renderer then decides what needs to be done depending on some parameters (e.g. skip_animation, caching) and ONLY calls Camera to capture the mobjects. Camera returns the frame, and Renderer then calls the SceneFileWriter instance, which writes to the actual output stream. So Scene has no direct interaction with SceneFileWriter, only with Renderer. We would also create a new directory to keep this code separate from manim/scene.
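
The call chain proposed here can be sketched roughly as follows. This is only an illustration: the class and method names below are hypothetical stand-ins and do not mirror manim's actual API, and a "frame" is reduced to a tuple of mobject names.

```python
# Hypothetical sketch: none of these classes mirror manim's actual API.

class Camera:
    def capture(self, mobjects):
        # Stand-in for rasterisation: a "frame" is just a tuple of names.
        return tuple(mobjects)


class SceneFileWriter:
    def __init__(self):
        self.frames = []

    def write_frame(self, frame):
        # Stand-in for piping the frame to the output stream.
        self.frames.append(frame)


class Renderer:
    def __init__(self, camera, file_writer, skip_animations=False):
        self.camera = camera
        self.file_writer = file_writer
        self.skip_animations = skip_animations

    def update_frame(self, mobjects):
        # The renderer decides whether anything needs to be drawn at all.
        if self.skip_animations:
            return None
        frame = self.camera.capture(mobjects)
        self.file_writer.write_frame(frame)
        return frame


class Scene:
    def __init__(self, renderer):
        self.renderer = renderer
        self.mobjects = []

    def add(self, *mobjects):
        self.mobjects.extend(mobjects)

    def update_frame(self):
        # The scene only prepares its mobjects (here: sorts them) and
        # hands them to the renderer; it never touches the file writer.
        return self.renderer.update_frame(sorted(self.mobjects))


writer = SceneFileWriter()
scene = Scene(Renderer(Camera(), writer))
scene.add("square", "circle")
scene.update_frame()
print(writer.frames)  # [('circle', 'square')]
```

Scene holds a reference to Renderer only, so swapping the writer or the camera would not require touching Scene in this arrangement.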

leotrs commented 4 years ago

Thanks for opening this. I'm overdue for working on this.

Here's my proposal. I think that each class should preferably do one thing only.

So, the dataflow would look like this:

  1. User creates a Scene, adds some mobjects and animations.
  2. Scene has a Camera object that the user can access as self.camera.zoom(). This camera object will convert that call into a linear transformation (this is just a 3x3 matrix).
  3. User calls manim and manim renders the Scene. This process iterates the following for each frame:
     a. take all the mobjects and the camera's linear transformation and send them to the Renderer.
     b. the Renderer will use all of that information and make an array of pixels.
     c. the array of pixels is sent to the FileWriter, which transforms it and puts it somewhere.
     d. the mobjects and animations are updated. Go back to a.
  4. The end.
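
Step 2 of the list above, a camera that turns calls like zoom() into a 3x3 matrix, might look something like this. The sketch assumes the matrix acts on 2D points in homogeneous coordinates, which the comment does not specify; the Camera class, its apply() method, and the matmul3 helper are all hypothetical.

```python
# Hypothetical sketch; this Camera is not manim's Camera class.

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


class Camera:
    def __init__(self):
        # Start with the identity: the scene is shown unchanged.
        self.matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

    def zoom(self, factor):
        scale = [[factor, 0.0, 0.0], [0.0, factor, 0.0], [0.0, 0.0, 1.0]]
        self.matrix = matmul3(scale, self.matrix)

    def move(self, dx, dy):
        # Moving the camera by (dx, dy) shifts the scene by (-dx, -dy).
        shift = [[1.0, 0.0, -dx], [0.0, 1.0, -dy], [0.0, 0.0, 1.0]]
        self.matrix = matmul3(shift, self.matrix)

    def apply(self, x, y):
        """Map a scene point to camera coordinates."""
        m = self.matrix
        return (m[0][0] * x + m[0][1] * y + m[0][2],
                m[1][0] * x + m[1][1] * y + m[1][2])


camera = Camera()
camera.zoom(2.0)       # applied first
camera.move(1.0, 0.0)  # applied second, in already-zoomed coordinates
print(camera.apply(1.0, 1.0))  # (1.0, 2.0)
```

Because every camera operation just multiplies into one matrix, a renderer would only need that matrix, not the history of calls that produced it.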

Note: this refactor of Camera would eliminate the need for things such as MovingCamera (because all Cameras will have a move() method), and also the need for ZoomableScene, which makes no sense to me and violates the principle that each class should preferably do only one thing.

Note: in the case of a GL backend, I think that Camera, Renderer and FileWriter will all need to access the same underlying GL instance, as it will surely handle things like camera zoom/translation/rotation a lot better than we could by manually keeping a matrix, AND it would also handle the file writing itself. I think this is all possible by storing this GL instance in the Scene object and exposing it to all other classes.

This is what @eulertour and I discussed some weeks ago.

Final note: if left to my own devices, I would honestly make Scene just keep track of mobjects and animations, and make yet another class VideoMaker that handles the loop in step number 3 above. The user would never have to touch VideoMaker, it would only be used when calling manim from the command line. In this case, Scene would truly only have one job, and the only class communicating with all other classes would be VideoMaker. But anyway.
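
A minimal sketch of that VideoMaker idea, under the assumption that animations can be modelled as plain callables that mutate the mobjects. Every name here (VideoMaker, ListRenderer, MemoryWriter, shift_right) is hypothetical and only illustrates who talks to whom.

```python
# Hypothetical sketch: VideoMaker and its collaborators are illustrative only.

class Scene:
    """Only keeps track of mobjects and animations."""

    def __init__(self):
        self.mobjects = []
        self.animations = []


class ListRenderer:
    def render(self, mobjects, camera):
        # Stand-in for rasterisation: record each mobject's x position.
        return [m["x"] for m in mobjects]


class MemoryWriter:
    def __init__(self):
        self.frames = []

    def write(self, pixels):
        self.frames.append(pixels)


class VideoMaker:
    """Runs the per-frame loop; the only class that talks to all the others."""

    def __init__(self, scene, camera, renderer, file_writer):
        self.scene = scene
        self.camera = camera
        self.renderer = renderer
        self.file_writer = file_writer

    def make(self, num_frames):
        for _ in range(num_frames):
            pixels = self.renderer.render(self.scene.mobjects, self.camera)
            self.file_writer.write(pixels)
            # Advance every animation by one step before the next frame.
            for animation in self.scene.animations:
                animation(self.scene.mobjects)


def shift_right(mobjects):
    for m in mobjects:
        m["x"] += 1


scene = Scene()
scene.mobjects.append({"x": 0})
scene.animations.append(shift_right)
writer = MemoryWriter()
VideoMaker(scene, camera=None, renderer=ListRenderer(), file_writer=writer).make(3)
print(writer.frames)  # [[0], [1], [2]]
```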

eulertour commented 4 years ago

I'm currently working on this refactor. The idea is like this: [diagram: "Untitled document"]

Then we can swap the CairoRenderer for a ThreeJsRenderer, ModernGLRenderer, LivestreamRenderer as desired.
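
One way to picture that swap is a shared abstract interface that every renderer implements. The sketch below is an assumption about the shape of such an interface, not the refactor's actual code: the class names come from the comment above, but the render() method and its string "frames" are placeholders.

```python
from abc import ABC, abstractmethod

# Hypothetical sketch: the render() signature is assumed, not manim's API.


class Renderer(ABC):
    @abstractmethod
    def render(self, mobjects):
        """Return one frame for the given mobjects."""


class CairoRenderer(Renderer):
    def render(self, mobjects):
        # Placeholder for drawing with Cairo.
        return f"cairo frame with {len(mobjects)} mobjects"


class ModernGLRenderer(Renderer):
    def render(self, mobjects):
        # Placeholder for drawing with ModernGL.
        return f"moderngl frame with {len(mobjects)} mobjects"


class Scene:
    def __init__(self, renderer):
        self.renderer = renderer
        self.mobjects = []

    def render(self):
        # Scene depends only on the abstract interface.
        return self.renderer.render(self.mobjects)


for backend in (CairoRenderer(), ModernGLRenderer()):
    scene = Scene(backend)
    scene.mobjects.append("circle")
    print(scene.render())
```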

Some things I wanted to say about @leotrs's response:

When using OpenGL, the GL context would be the renderer, so there wouldn't be a substantial difference in the above diagram.

leotrs commented 4 years ago

  • At least the default camera should just have a position, direction, and possibly a zoom factor, since not all renderers can make use of a matrix. It should be up to the renderer to create and use a matrix if it wants one.

That's what I meant :)

  • FFmpeg is the only way we know of to create video; OpenGL can't do it on its own, so I don't think FFmpegFileWriter or OpenGLFileWriter make sense as class names.

I misunderstood you in the past, I thought OpenGL could do that on its own.

When using OpenGL, the GL context would be the renderer, so there wouldn't be a substantial difference in the above diagram.

Agreed!

huguesdevimeux commented 4 years ago

@eulertour This seems awesome. Just one thought about LiveStreamRenderer: it does not make sense to me, since a livestream feature would use Cairo in the normal way; only the way videos are output changes (partial movie files go to a stream). Therefore, do you plan on adding a LiveStreamingWriter (and maybe other subclasses) as well?

Personally, I think it could be useful if we had more than one feature that touches SceneFileWriter, but if we only have livestreaming .. it could be overkill. Still good to have, nevertheless.

eulertour commented 4 years ago

The renderer controls the way the video is outputted as well.

But if streaming with Cairo uses the same rendering logic as normal use, there's no need to make a class for that.
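
One possible reading of this, sketched under the assumption that the output target can be passed to the renderer as a parameter: the same renderer class serves both file output and streaming, and only the sink differs. The `output` argument and the string frames are hypothetical and not part of manim.

```python
# Hypothetical sketch: the `output` parameter is an assumption, not manim's API.

class CairoRenderer:
    def __init__(self, output):
        # `output` is any callable that accepts one frame.
        self.output = output

    def render(self, mobjects):
        frame = "|".join(mobjects)  # stand-in for real rasterisation
        self.output(frame)
        return frame


file_frames = []    # stands in for a movie file
stream_frames = []  # stands in for a live stream

to_file = CairoRenderer(output=file_frames.append)
to_stream = CairoRenderer(output=stream_frames.append)

to_file.render(["circle", "square"])
to_stream.render(["circle", "square"])

# Same rendering logic, so both sinks receive identical frames.
print(file_frames == stream_frames)  # True
```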

leotrs commented 3 years ago

We already have a renderer class. As I see no clear To Do item here other than that, I'm closing this. Please reopen if I missed anything.