facebookresearch / r3m

Pre-training Reusable Representations for Robotic Manipulation Using Diverse Human Video Data
https://sites.google.com/view/robot-r3m/
MIT License
292 stars 45 forks

About clips data processing #34

Open rbler1234 opened 11 months ago

rbler1234 commented 11 months ago

I really appreciate your wonderful work and nice idea! I have run into a problem while trying to extract Ego4D (clip, text) pairs. Each narration in narration.json only has "timestamp_sec" and "timestamp_frame", with no "start_time" or "end_time", so how did you decide the interval of each clip? My current approach is to sort the narrations by "timestamp_sec" and take the interval of the i-th clip to be [i-th timestamp_sec, (i+1)-th timestamp_sec]. Is this correct? Does anyone know? Thanks.
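
To make the question concrete, here is a minimal sketch of the scheme described above. It is only the approach proposed in this issue, not necessarily how R3M built its clips. It assumes the narrations for one video have already been loaded into a list of dicts; `narration_intervals`, `video_end_sec`, and the `"narration_text"` key are names chosen for illustration, and only `"timestamp_sec"` is taken from the fields mentioned above.

```python
from typing import Dict, List, Optional, Tuple


def narration_intervals(
    narrations: List[Dict],
    video_end_sec: Optional[float] = None,
) -> List[Tuple[float, float, str]]:
    """Turn point-in-time narrations into (start, end, text) intervals.

    Assumption: the i-th clip spans from its own timestamp to the
    timestamp of the next narration in the same video.
    """
    # Sort by timestamp so that consecutive entries are neighbours in time.
    ordered = sorted(narrations, key=lambda n: n["timestamp_sec"])

    intervals = []
    for i, cur in enumerate(ordered):
        start = cur["timestamp_sec"]
        if i + 1 < len(ordered):
            end = ordered[i + 1]["timestamp_sec"]
        elif video_end_sec is not None:
            # Hypothetical fallback: close the last clip at the video end.
            end = video_end_sec
        else:
            # No next narration and no known video end: skip the last one.
            continue
        # Drop empty intervals caused by duplicate timestamps.
        if end > start:
            intervals.append((start, end, cur.get("narration_text", "")))
    return intervals


example = [
    {"timestamp_sec": 5.0, "narration_text": "b"},
    {"timestamp_sec": 1.0, "narration_text": "a"},
    {"timestamp_sec": 9.0, "narration_text": "c"},
]
print(narration_intervals(example))
```

One open point with this scheme is the last narration of each video, which has no successor. The sketch either drops it or closes it at a caller-supplied video end time.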