Spatial misalignment between RGB and Depth frames

amrgomaaelhady / SynthoGestures

This repository has the SynthoGestures Framework.

MIT License


Spatial misalignment between RGB and Depth frames #4

Open goutamyg opened 1 week ago

goutamyg commented 1 week ago

Hi,

I exported the RGB and Depth camera captures as jpg files using UE's Sequencer at 30fps. When I overlay the frames from both streams, I can see a strong spatial misalignment between them, as shown below.

[Image: overlay of RGB and Depth frames rendered at 30fps]

I have checked the location and orientation of the active RGB and Depth cameras under SettingsActor, and they match (otherwise, misalignments would also be visible in the seat area, the mannequin body, etc.).

Interestingly, as I increase the rendering frame rate, the extent of the misalignment decreases. I am aware that the gesture recognition model should be robust to such misalignments. However, I am curious whether this issue is caused by some limitation of the UE Sequencer renderer or by the code that generates the RGB and depth data. Can you please share your thoughts on this?

amrgomaaelhady commented 6 days ago

I have not used the UE Sequencer before, so I am not sure about its limitations. I know that you need 30 fps to match most existing datasets, such as the Nvidia one, but rendering at a higher fps and then downsampling externally might already be a good solution, as you mentioned. However, there is one more thing you could check. Since I was generating the gestures sequentially and introduced different variations to act as noise/different user behavior, are you sure that the generated gestures are exactly the same for both cameras, with no variation range added that might cause the misalignment (even though the start and end positions of the gesture are the same)? Maybe check that all the hand, arm, and finger variations we previously discussed are set to the same exact value without any variability, and see whether the behavior is still the same. If you modified the code to capture the gesture from two camera sources in parallel, then this doesn't apply to you.
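The idea of rendering at a higher fps and downsampling externally could look roughly like the sketch below. It is not part of SynthoGestures; the function names, the `*.jpg` pattern, and the assumption that exported frames have zero-padded file names (so that sorting by name gives temporal order) are all hypothetical. The same indices would need to be applied to both the RGB and the Depth folder so the two streams stay paired.

```python
from __future__ import annotations

from pathlib import Path


def downsample_indices(n_frames: int, src_fps: int, dst_fps: int) -> list[int]:
    """Return the indices of the source frames to keep.

    Assumes src_fps is an integer multiple of dst_fps (e.g. 240 -> 30),
    so every (src_fps // dst_fps)-th frame is kept.
    """
    if src_fps <= 0 or dst_fps <= 0:
        raise ValueError("frame rates must be positive")
    if src_fps % dst_fps != 0:
        raise ValueError("src_fps must be an integer multiple of dst_fps")
    step = src_fps // dst_fps
    return list(range(0, n_frames, step))


def select_frames(frame_dir: str, src_fps: int, dst_fps: int) -> list[Path]:
    """List the frame files to keep from one exported stream.

    Assumption: files are zero-padded jpgs, so name order is frame order.
    """
    frames = sorted(Path(frame_dir).glob("*.jpg"))
    keep = downsample_indices(len(frames), src_fps, dst_fps)
    return [frames[i] for i in keep]
```

For example, `downsample_indices(24, 240, 30)` keeps frames 0, 8, and 16.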

goutamyg commented 5 days ago

I thought the hand, arm, and finger variation parameter values for each gesture in World Outliner (e.g., the one below)

are applied to both RGB and Depth cameras in parallel. Also, at 240/480fps, the alignment is much better with the same gesture settings.
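One possible reading of the frame-rate dependence, which is only a hypothesis and not confirmed anywhere in this thread, is that the two streams are captured a fixed number of frames apart. Under that assumption the apparent shift of a moving hand would scale with 1/fps. The sketch below illustrates the arithmetic; the hand speed of 600 px/s and the one-frame offset are made-up values.

```python
def apparent_shift_px(speed_px_per_s: float, fps: float, frame_offset: float = 1.0) -> float:
    """Displacement in pixels of a moving object between two captures
    taken `frame_offset` frames apart (hypothetical model)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return speed_px_per_s * frame_offset / fps


# Hypothetical hand speed; not measured from the rendered data.
HAND_SPEED = 600.0

for fps in (30, 240, 480):
    print(fps, apparent_shift_px(HAND_SPEED, fps))
# prints:
# 30 20.0
# 240 2.5
# 480 1.25
```

If the observed misalignment does not shrink roughly in this proportion, a fixed frame offset alone would not explain it.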

amrgomaaelhady commented 5 days ago

Yes, they are applied the same way for both cameras, but I am not sure whether they are loaded once and applied in parallel threads to all cameras, or reinitialized every time a new "Camera Animation" is played. This relates more to how UE handles these threads. I think the best way to test this would be to disable all variation ranges and render at a low fps to see whether the problem persists.

goutamyg commented 5 days ago

Sure, I will do that. Thank you.

amrgomaaelhady commented 5 days ago

Great! Thanks a lot! Let me know how it worked for you, if that is okay; I would be interested in knowing that.