princeton-vl / infinigen

Infinite Photorealistic Worlds using Procedural Generation
https://infinigen.org
BSD 3-Clause "New" or "Revised" License

Crash when trying to create 50000 frame video #239

Open rossimattia opened 1 month ago

rossimattia commented 1 month ago

Describe the bug

I am trying to render a video with 50000 frames but I get an error in the coarse step.

Steps to Reproduce

python -m infinigen.datagen.manage_jobs --output_folder outputs/my_videos --num_scenes 1 --pipeline_configs local_16GB.gin monocular_video.gin cuda_terrain.gin --pipeline_overrides iterate_scene_tasks.frame_range=[1, 50000] --cleanup big_files --warmup_sec 60000 --configs high_quality_terrain.gin forest.gin video.gin

What version of the code were you using?

Main (commit 830891c3ac988f6be02ded26388b1572a5463de3)

What are your FULL output logs?

error.txt output.txt

Platform

araistrick commented 1 month ago

Hello - creating a 50k-frame video almost certainly will not work with the default settings. The defaults assume a camera moving at a walking-to-running pace, and that the scene will be represented by one mesh that is valid from all camera views.

The camera speed is relevant because our trajectory generator simply will not find a valid path for 50k frames, i.e. roughly 35 minutes (50000 frames / 24 fps ≈ 2083 s) of running through the scene. If you want a trajectory this long you will have to code it up yourself. The creatures (birds, fish schools, snakes) all also use this code and wouldn't be able to make a trajectory of that length either.

The "scene will be represented by one mesh" assumption is harder to circumvent. For terrain, you can make sure fine_terrain is in the view_dependent_tasks in stereo.gin, then adjust view_block_size. For placeholders you can try doing the same with populate, but I can't guarantee it will work.
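For illustration, block-size overrides of the kind described above might look as follows. The parameter names come from iterate_scene_tasks, but the values here are hypothetical and would need tuning:

```gin
# Regenerate view-dependent outputs (e.g. fine_terrain) every 48 frames
# instead of once for the whole video -- values are illustrative only.
iterate_scene_tasks.view_block_size = 48
iterate_scene_tasks.cam_block_size = 48
```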

You can make a 50k-frame video if it stays within one region of a scene, but it would also take a long time to render. If you were going to do this, I'd recommend making an 8-second video and then re-rendering it at a high fps or similar.

rossimattia commented 1 month ago

Thank you for your explanation. I am trying to create a multi-view stereo dataset (see issue #237): e.g., 300 frames captured with a varying baseline between them. Since the frame rate is fixed at 24 fps, I was trying to render enough frames to get a trajectory long enough to sub-sample into frames with a varying baseline. Of course, rendering 50k views is overkill for that. Any suggestions for multi-view stereo dataset creation?
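As a side note, the sub-sampling itself is simple arithmetic. Here is a minimal sketch (a hypothetical helper, not part of Infinigen) that picks frame indices so that consecutive selected frames are separated by the desired camera baselines, assuming a roughly constant camera speed:

```python
def frames_for_baselines(baselines_m, speed_mps, fps=24, start_frame=1):
    """Return frame indices whose spacing approximates the given baselines.

    baselines_m: desired camera-to-camera distances (metres) between
                 consecutive selected views.
    speed_mps:   assumed (roughly constant) camera speed in metres/second.
    fps:         render frame rate (Infinigen defaults to 24).
    """
    frames = [start_frame]
    for b in baselines_m:
        # distance covered per frame = speed / fps; round to whole frames
        step = max(1, round(b * fps / speed_mps))
        frames.append(frames[-1] + step)
    return frames

# e.g. baselines of 0.1 m, 0.5 m and 1.0 m at 1 m/s and 24 fps
print(frames_for_baselines([0.1, 0.5, 1.0], speed_mps=1.0))  # [1, 3, 15, 39]
```

In practice the camera speed in Infinigen is sampled from a range rather than constant, so the true baselines would drift from these estimates; the actual camera positions could be read back from the exported poses to verify.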

I noticed that in issue #232 you pointed to branch rc_1.3.2. I am trying it out now, but in the best scenario I would have 1 fps, which is still much more than what I need. Is it perhaps possible to control the camera speed?

Also, could you please explain what these parameters are: iterate_scene_tasks.view_block_size and iterate_scene_tasks.cam_block_size? I found some comments in the code, but I still could not figure them out:

view_block_size=1, # how many frames should share each `view_dependent_task`
cam_block_size=None, # how many frames should share each `camera_dependent_task`

For instance, how can multiple frames share view-dependent tasks, given that frames capture different points of view?

Thank you.

araistrick commented 1 week ago

Hello,

On the latest code you can edit base.gin to change execute_tasks.fps = 24 to whatever suits you, and also add a line of the form AnimPolicyRandomWalkLookaround.speed = ('uniform', 0.5, 1) to set whatever camera speed (in m/s) produces an appropriate amount of movement for your task.
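Concretely, the base.gin edits described above might look like this (the values are examples only, to be tuned to the dataset):

```gin
# base.gin -- example values, not defaults
execute_tasks.fps = 2  # lower fps -> larger baseline between consecutive frames

# camera speed range in m/s; sampled uniformly
AnimPolicyRandomWalkLookaround.speed = ('uniform', 0.5, 1)
```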

Blender doesn't support fractional fps, but you can just make the camera move through the scene faster to get a wider baseline.