Closed Tauseefahmed1451 closed 6 months ago
The inputs are supposed to be binary masks. There are some examples here: https://github.com/jasonyzhang/RayDiffusion/tree/main/examples
That said, the binary masks are only used to extract bounding boxes. If it is easier to get a bounding box by hand (rather than running an object detector to produce masks), you can provide bounding boxes directly in a format similar to this.
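For anyone wondering what "extracting a bounding box from a mask" amounts to, here is a minimal sketch using NumPy. It is not the repository's actual code: the function name `mask_to_bbox`, the `(x_min, y_min, x_max, y_max)` ordering, and the inclusive pixel coordinates are all assumptions made for illustration, so check the examples linked above for the format the demo really expects.

```python
import numpy as np


def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) around the non-zero pixels.

    Hypothetical helper: the output ordering and inclusive coordinates
    are assumptions, not the repository's confirmed format.
    """
    mask = np.asarray(mask)
    if mask.ndim == 3:
        # Collapse colour channels so an RGB mask is treated as one plane.
        mask = mask.max(axis=-1)
    ys, xs = np.nonzero(mask)  # row and column indices of foreground pixels
    if ys.size == 0:
        raise ValueError("mask has no foreground pixels")
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Toy mask: black background (0) with a white rectangle (255).
demo = np.zeros((6, 8), dtype=np.uint8)
demo[2:5, 3:7] = 255
bbox = mask_to_bbox(demo)
print(bbox)
```

In this sketch any non-zero pixel counts as foreground, so a mask with compression noise in the background would likely need thresholding first.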
Great work, and thank you for sharing this. It would be great if you could also either describe what the masks should be or share an example mask. I assume the most common kind (black-and-white cutout masks) is what is required, but it is still ambiguous.