Rendering logic must be factored into its own class

huguesdevimeux commented 4 years ago

So we have been discussing about refactoring SceneFileWriter into some subclasses, such as a Renderer class and an abstract SceneFIleWriter class.

But this is still well, abstract. Doing this would clean up the code of sceneFileWriter, which is very messy, and will make a lot easier the implementation of new feratures that change the way that the videos are rendered, such as LiveStreaming (or even ModernGL, who knows ^^). Some basic stuff would be subclassed, such as open_movie_pipe, etc.

First of all, we have to discuss before any concrete consideration, on what do we want each class to do. Now, the process is that scene update the frame, which call Camera that capture the mobjects on camera. Afterward, scene call (again..) SceneFileWriter and ask it to write the frame. (No interaction is made between Camera and SceneFileWriter which is .. weird).

What we could do is : Scene process the mobjects (i.e sorting them, etc blahblah) and only calls Renderer to update the frame. Then, Renderer will see what needs to be done depending on some parameters (eg, skip_animation, caching) and then will ONLY call for camera to capture mobjects. Camera will then return the frame, and then Renderer will call the instance of SceneFileWriter that will write to the actual output stream. So, scene does not have any direct interaction with SceneFileWriter, only with Renderer. We would also create a new directory to sort this stuff apart from manim/scene.

leotrs commented 4 years ago

Thanks for opening this. I'm overdue for working on this.

Here's my proposal. I think that each class should preferably do one thing only.

FileWriter should receive an array of pixels and convert it into an image or video. Subclasses of it would specify how this is done. For example FFMPEGFileWriter would use ffmpeg to make the conversion. (Most of the current SceneFileWriter would become FFMPEGFileWriter.) We could have other subclasses such as WebGLFileWriter or StreamFileWriter. But the point is that all these classes take an array of pixels as input and send this array to some other place, perhaps by reformatting it in the process.
Camera should keep track of, well, camera settings. Things like position, rotation, movement, zoom, etc. Note that all of these can be represented as a linear transformation. So the Camera class in this proposition would be fairly bare but very math-heavy, implementing an API that includes camera.zoom(), camera.move(), camera.rotate(), and so on. (This is where my proposal disagrees with yours.)
Scene should keep track Mobjects and Animations, as it does now, AND it should provide a way of rendering the whole thing. (Yes, in this case I am proposing that a single class does two different things, not one like all the others, but this is mainly for user convenience and because otherwise we would have to rewrite the whole library from scratch.) In order to do the second task ("render the whole thing"), Scene will have to send the mobjects and the animations, AND the Camera settings, to the renderer. (I think it's fine that the Renderer/Camera do not speak to SceneFileWriter at all. They are doing different things after all.)
Renderer should take the information inside of a scene (namely, a collection of Mobjects and Animations and camera settings) and turn it into an array of pixels. Currently, this is done in Camera, but that would change. Subclasses would be CairoRenderer or WebGLRenderer etc.

So, the dataflow would look like this:

User creates a Scene, adds some mobjects and animations.
Scene has a Camera object that the user can access as self.camera.zoom(). This camera object will convert that call into a linear transformation (this is just a 3x3 matrix).
User calls manim and manim renders the Scene. This process iterates the following for each frame: a. take all the mobjects and the camera's linear transformation and send them to the Renderer. b. the Renderer will use all of that information and make an array of pixels. c. the array of pixels is sent to the FileWriter, which transforms it and puts it somewhere. d. the mobjects and animations are updated. Go back to a.
The end.

Note: this refactor of Camera would eliminate the need for things such as MovingCamera (because all Cameras will have a move() method), and also the need for ZoomableScene, which makes no sense to me and violates the principle that each class should preferably do only one thing.

Note: in the case of a GL backend, I think that Camera, Renderer and FileWriter will all need to access the same underlying GL instance, as it will surely handle things like camera zoom/translation/rotation a lot better than we could by manually keeping a matrix, AND it would also handle the file writing itself. I think this is all possible by storing this GL instance in the Scene object and exposing it to all other classes.

This is what @eulertour and I discussed some weeks ago.

Final note: if left to my own devices, I would honestly make Scene just keep track of mobjects and animations, and make yet another class VideoMaker that handles the loop in step number 3 above. The user would never have to touch VideoMaker, it would only be used when calling manim from the command line. In this case, Scene would truly only have one job, and the only class communicating with all other classes would be VideoMaker. But anyway.

eulertour commented 4 years ago

I'm currently working on this refactor. The idea is like this Untitled document

Then we can swap the CairoRenderer for a ThreeJsRenderer, ModernGLRenderer, LivestreamRenderer as desired.

Some things I wanted to say about @leotrs response:

At least the default camera should just have a position, direction, and possibly a zoom factor, since not all renderers can make use of a matrix. It should be up to the renderer to create and use a matrix if it wants one.
FFmpeg is the only way we know of to create video; OpenGL can't do it on its own, so I don't think FFmpegFileWriter or OpenGLFileWriter make sense as class names.
About this

in the case of a GL backend, I think that Camera, Renderer and FileWriter will all need to access the same underlying GL instance

When using OpenGL, the GL context would be the renderer, so there wouldn't be a substantial difference in the above diagram.

leotrs commented 4 years ago

At least the default camera should just have a position, direction, and possibly a zoom factor, since not all renderers can make use of a matrix. It should be up to the renderer to create and use a matrix if it wants one.

That's what I meant :)

FFmpeg is the only way we know of to create video; OpenGL can't do it on its own, so I don't think FFmpegFileWriter or OpenGLFileWriter make sense as class names.

I misunderstood you in the past, I thought OpenGL could do that on its own.

When using OpenGL, the GL context would be the renderer, so there wouldn't be a substantial difference in the above diagram.

Agreed!

huguesdevimeux commented 4 years ago

@eulertour This seems awesome, Just some thought about LiveStreamRenderer. This does not make any sense as the livestream feature that could be implemented uses Cairo in a normal way, it's just the way videos are outputted that changes (partial movies files go to a stream). Therefore, do you plan on adding LiveStreamingWriter (and other subclasses, maybe) as well?

Personnaly, I think it could be useful if we had more than one feature that touches SceneFileWriter, but if we have only LiveStream .. It could be overkill. But still good to have nevertheless.

eulertour commented 4 years ago

The renderer controls the way the video is outputted as well.

But if streaming with Cairo uses the same rendering logic as normal use there's no need to make a class for that.

leotrs commented 3 years ago

We already have a renderer class. As I see no clear To Do item here other than that, I'm closing this. Please reopen if I missed anything.

ManimCommunity / manim

Rendering logic must be factored into its own class #524