Closed · qiuqc1 closed this 1 week ago
Hi @qiuqc1, I've tested the exact same command recently and it worked well. I think this issue mainly occurs because there isn't enough space in RAM (CPU memory) to cache the results. All rendered results (images, masks, etc.) are cached on the CPU, and if the cache reaches the memory limit, the program gets killed. I encountered this issue before when trying to reconstruct very long sequences (e.g., 7 cameras × 300 frames).
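To see why long sequences exhaust RAM, here is a back-of-envelope estimate. It assumes nuScenes-sized 1600×900 uint8 RGB frames and the 7 cameras × 300 frames example above; the actual per-key footprint in drivestudio may differ (masks are smaller, float buffers are larger):

```python
# Rough RAM estimate for one cached render key (assumptions:
# 1600x900 uint8 RGB images, 7 cameras x 300 frames).
cameras, frames = 7, 300
height, width, channels = 900, 1600, 3  # 3 bytes per pixel (uint8 RGB)

bytes_per_image = height * width * channels
bytes_per_key = cameras * frames * bytes_per_image  # one render key, all frames
gib = bytes_per_key / 2**30
print(f"~{gib:.1f} GiB per cached key")
```

With several keys cached at once (rgbs, Dynamic_rgbs, RigidNodes_rgbs, ...), the total multiplies accordingly and can easily exceed tens of GiB.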
My recommendation for this situation: select fewer keys for rendering, and choose only the ones you actually want to render. This way, fewer images are cached, reducing memory usage.
Try commenting out "Dynamic_rgbs", "RigidNodes_rgbs", etc., keeping only "rgbs"; that should work: https://github.com/ziyc/drivestudio/blob/a1fe5d29162e5c4da81bb281c8223b457964a693/tools/train.py#L146-L161
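A sketch of what the edited key list might look like. Only "rgbs", "Dynamic_rgbs", and "RigidNodes_rgbs" are named in this thread; the exact variable name and the remaining entries in the linked `tools/train.py` lines may differ:

```python
# Hypothetical render-key selection (illustrative; see the linked
# lines in tools/train.py for the real list in drivestudio).
render_keys = [
    "rgbs",
    # "Dynamic_rgbs",     # commented out to reduce CPU-side caching
    # "RigidNodes_rgbs",  # commented out to reduce CPU-side caching
]
print(render_keys)
```

Every key left in the list is rendered and cached for all frames, so trimming it directly cuts peak RAM.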
Thank you for your quick reply. I just tried eval again and the output was successfully generated.
I load scene 0 and then train for 2 hours. When rendering finally finishes, only a gt video is generated; no reconstructed video is produced.
The error message is "Invalid buffer size, packet size 65536 < expected frame_size 3231453".
The command I ran is: python tools/train.py --config_file configs/omnire.yaml dataset=nuscenes/3cams data.scene_idx=0 data.start_timestep=0 data.end_timestep=-1