Open Algomancer opened 3 days ago
Thank you for the feature request @Algomancer. That is definitely in scope and it's something we want to provide (this is a duplicate of https://github.com/pytorch/torchcodec/issues/246).
This would just be a matter of returning the clip_start_indices within the FrameBatch.
For consistency with the pts_seconds and duration_seconds fields, I think we would be returning the indices of all the frames, not just the clip starts. I think that would still address your use-case, as you'd be able to get the clip start index simply with something like clips.indices[:, 0]
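To make the suggested layout concrete, here is a minimal sketch of what a per-frame indices field could look like and how the clip starts fall out of it. It uses NumPy as a stand-in for the tensor that FrameBatch would hold, and the indices field itself is only the proposal discussed above, not something torchcodec currently exposes. The start indices and sampler parameters are made-up values for illustration.

```python
import numpy as np

# Hypothetical sampler parameters, chosen only for illustration.
num_frames_per_clip = 4
num_indices_between_frames = 2

# Stand-in for the clip start indices the sampler generates internally.
clip_start_indices = np.array([10, 57, 120])

# One row per clip, one column per frame:
# shape (num_clips, num_frames_per_clip).
offsets = np.arange(num_frames_per_clip) * num_indices_between_frames
indices = clip_start_indices[:, None] + offsets[None, :]

# The first column recovers the clip starts, mirroring clips.indices[:, 0].
recovered_starts = indices[:, 0]
```

The same slicing syntax applies to a torch tensor, so nothing extra would be needed on the user side to get the clip starts from the full index grid.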
🚀 The feature
When sampling using an index-based (or potentially time-based) sampler, add an option to additionally return the sampled indices. For example, in random sampling, it would return the random indices that were generated. This would just be a matter of returning the clip_start_indices within the FrameBatch.
https://github.com/pytorch/torchcodec/blob/main/src/torchcodec/samplers/_index_based.py#L153C9-L153C27
Motivation, pitch
For some of my tasks, I need to know the relative position of a particular frame within the larger context of the video, for example, for positional embedding and RoPE offsetting.
I am about to implement this in my local fork and would happily upstream it if it is desired.
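As a rough illustration of the use case, the sketch below turns absolute frame indices into rotary angles, so that each sampled frame is encoded at its true position in the video instead of its position within the clip. The rope_angles helper is hypothetical, the frequency schedule (base 10000) is the commonly used one and is an assumption here, and the index values are made up.

```python
import numpy as np

def rope_angles(frame_indices, dim, base=10000.0):
    """Return rotary angles of shape (..., dim // 2) for absolute frame positions.

    Hypothetical helper: assumes the common inverse-frequency schedule.
    """
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    positions = np.asarray(frame_indices, dtype=np.float64)
    return positions[..., None] * inv_freq

# Made-up per-frame indices for two clips of four frames each.
indices = np.array([[10, 12, 14, 16], [57, 59, 61, 63]])

angles = rope_angles(indices, dim=8)
cos, sin = np.cos(angles), np.sin(angles)  # would be applied to query/key pairs
```

Without the sampled indices, every clip would start at position 0 and the model could not tell where in the video a clip came from.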