Closed · qiuqc1 closed this 1 week ago
Hi @qiuqc1, I've tested the exact same command recently and it worked well. I think this issue mainly occurs because there isn't enough space in RAM (CPU memory) to cache the results. All rendered results (images, masks, etc.) are cached on the CPU, and if the cache reaches the memory limit, the program gets killed. I encountered this issue before when trying to reconstruct very long sequences (e.g., 7 cameras × 300 frames).
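To see why long sequences exhaust RAM, here is a back-of-envelope estimate. It assumes nuScenes-sized 1600×900 uint8 RGB frames and the 7 cameras × 300 frames example above; the actual per-key footprint in drivestudio may differ (masks are smaller, float buffers are larger):

```python
# Rough RAM estimate for one cached render key (assumptions:
# 1600x900 uint8 RGB images, 7 cameras x 300 frames).
cameras, frames = 7, 300
height, width, channels = 900, 1600, 3  # 3 bytes per pixel (uint8 RGB)

bytes_per_image = height * width * channels
bytes_per_key = cameras * frames * bytes_per_image  # one render key, all frames
gib = bytes_per_key / 2**30
print(f"~{gib:.1f} GiB per cached key")
```

With several keys cached at once (rgbs, Dynamic_rgbs, RigidNodes_rgbs, ...), the total multiplies accordingly and can easily exceed tens of GiB.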
My recommendation for this situation: select fewer keys for rendering, and choose only the ones you actually want to render. This way, fewer images are cached, reducing memory usage.
Try commenting out "Dynamic_rgbs", "RigidNodes_rgbs", etc., keeping only "rgbs"; that should work: https://github.com/ziyc/drivestudio/blob/a1fe5d29162e5c4da81bb281c8223b457964a693/tools/train.py#L146-L161
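A sketch of what the edited key list might look like. Only "rgbs", "Dynamic_rgbs", and "RigidNodes_rgbs" are named in this thread; the exact variable name and the remaining entries in the linked `tools/train.py` lines may differ:

```python
# Hypothetical render-key selection (illustrative; see the linked
# lines in tools/train.py for the real list in drivestudio).
render_keys = [
    "rgbs",
    # "Dynamic_rgbs",     # commented out to reduce CPU-side caching
    # "RigidNodes_rgbs",  # commented out to reduce CPU-side caching
]
print(render_keys)
```

Every key left in the list is rendered and cached for all frames, so trimming it directly cuts peak RAM.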
Thank you for your quick reply. I just tried eval again and the output was successfully generated.
I load scene 0 and then train for 2 hours. When rendering finally finishes, only a gt video is generated; no reconstructed video is produced.
The error message is "Invalid buffer size, packet size 65536 < expected frame_size 3231453".
The command I ran is: python tools/train.py --config_file configs/omnire.yaml dataset=nuscenes/3cams data.scene_idx=0 data.start_timestep=0 data.end_timestep=-1