Open ducha-aiki opened 7 months ago
Oh... No easy fix that I can see. We usually run our experiments on an A100 with 80GB, so we never particularly optimized memory usage, sorry ^^' With 80GB, we could optimize scenes with 200+ images.
Maybe @yocabon would have a better idea?
One solution, kindly suggested by my colleague Romain Bregier, is to have the global alignment running on CPU. Will be slower but will not crash...
Thank you, will try it. And for GPU, is there a way to use multiple GPUs? I have a server with 8 V100s = 8 x 16GB, not A100s unfortunately.
Hi, I updated the demo script to expose the "scene_graph" parameter. By default, we make all possible pairs, but it explodes when you add many images. Use the "sliding window" or "one reference" method to make fewer pairs, then it should fit in memory.
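To get a feel for why all possible pairs "explodes", here is a rough sketch that just counts pairs for the three strategies. The helper names and the window size of 3 are made up for illustration and are not the repository's implementation; the real counts may differ, for example if each pair is also processed in both directions.

```python
from itertools import combinations


def complete_pairs(n):
    """All unordered pairs (i, j) with i < j: n * (n - 1) / 2 of them."""
    return list(combinations(range(n), 2))


def sliding_window_pairs(n, window=3):
    """Pair each image with the next `window` images in sequence order (hypothetical, non-cyclic)."""
    return [(i, j) for i in range(n) for j in range(i + 1, min(i + window + 1, n))]


def one_reference_pairs(n, ref=0):
    """Pair a single reference image with every other image."""
    return [(ref, j) for j in range(n) if j != ref]


if __name__ == "__main__":
    for n in (29, 36, 200):
        print(
            n,
            len(complete_pairs(n)),
            len(sliding_window_pairs(n)),
            len(one_reference_pairs(n)),
        )
    # For n = 29 this prints: 29 406 81 28
```

The complete graph grows quadratically with the number of images, while the other two grow linearly, which is why they should be much easier to fit in memory.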
No, we didn't implement multi-GPU for the inference.
Oh, that's super useful, thank you!
How do we set the global alignment to run on CPU?
I think maybe this: scene = global_aligner(output, device="cpu", mode=mode)
That seems to work ^
@nickponline Just tried 36 images on CPU, and now I get a CPU OOM error on a machine with 120 GB of RAM. Is there a way to reduce the number of points besides using 224x224 resolution?
RuntimeError: [enforce fail at alloc_cpu.cpp:83] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 2005401600 bytes. Error code 12 (Cannot allocate memory)
@ducha-aiki Do you have a scene covisibility graph? If so, this would greatly reduce the memory usage. On an A100 with 80GB, we are able to optimize scenes with 200+ images when we use 10 nearest neighbors (10NN) per image.
We had this implemented here https://github.com/naver/dust3r/blob/b6eb95705c2948750283638d5fbb5c12a3a8bf21/dust3r/image_pairs.py#L11 but it didn't make it into the final version...
I don't have a co-visibility graph, but I can probably run DINOv2, or SALAD to get an estimation. Thank you for the suggestion
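For reference, a minimal sketch of how such an estimated graph could be turned into k-nearest-neighbor image pairs. It assumes you already have one global descriptor per image (from DINOv2, SALAD, or anything else) as a plain list of floats; `knn_pairs` and `cosine_similarity` are hypothetical helpers, not part of dust3r.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two descriptor vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def knn_pairs(descriptors, k=10):
    """Return sorted unique pairs (i, j), i < j, linking each image to its k most similar images."""
    n = len(descriptors)
    pairs = set()
    for i in range(n):
        scored = [
            (cosine_similarity(descriptors[i], descriptors[j]), j)
            for j in range(n)
            if j != i
        ]
        scored.sort(reverse=True)  # most similar first
        for _, j in scored[:k]:
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


if __name__ == "__main__":
    # Toy 2-D descriptors: images 0 and 1 look alike, and so do images 2 and 3.
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(knn_pairs(toy, k=1))  # [(0, 1), (2, 3)]
```

With n images this yields at most n * k pairs instead of n * (n - 1) / 2, so the number of pairs grows linearly with the scene size.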
Good, I replaced my 3090 laptop with an H100 and it works!
Hi,
The performance is really amazing on the few image pairs I have tried. However, when I moved to a bigger scene (29 images), it crashes with CUDA OOM on a 16GB V100. Any recommendations on how I can run it?