Open auzxb opened 1 year ago
Hi @auzxb,
Thank you for reaching out. Can you tell us more about the models and dataset you are using? Do you need additional metadata to be processed together with the audio and video? What is your current approach? Do you use any particular library or libraries in your workflow?
Is this a new feature, an improvement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request
Must have (e.g. DALI adoption is impossible due to lack in functionality).
Please provide a clear description of problem this feature solves
As a researcher in audio-visual cross-modal learning, I would like DALI to support loading audio and video frames at the same time.
Feature Description
As a researcher in audio-visual cross-modal learning, I would like DALI to support loading audio and video frames at the same time.
Describe your ideal solution
def nvidia.dali.fn.decoders.video():
    pass
    return audio, images, label
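One way to read this request is that each decoded video frame should come with the audio samples that play during it. The following is a minimal sketch of that pairing in plain Python. It is not DALI code and does not reflect any existing DALI operator. The function names `audio_span_for_frame` and `pair_audio_with_frames`, and the constant frame rate and sample rate, are assumptions made only for illustration.

```python
from typing import List, Tuple


def audio_span_for_frame(frame_idx: int, fps: float, sample_rate: int) -> Tuple[int, int]:
    """Return the [start, end) audio sample range covering one video frame.

    Hypothetical helper: assumes a constant frame rate and that audio and
    video start at the same timestamp.
    """
    start = round(frame_idx * sample_rate / fps)
    end = round((frame_idx + 1) * sample_rate / fps)
    return start, end


def pair_audio_with_frames(
    audio: List[float], num_frames: int, fps: float, sample_rate: int
) -> List[Tuple[int, List[float]]]:
    """Pair each frame index with the audio samples that play during it."""
    pairs = []
    for frame_idx in range(num_frames):
        start, end = audio_span_for_frame(frame_idx, fps, sample_rate)
        # Slicing clamps at the end of the track, so a short audio
        # stream yields a shorter (possibly empty) final chunk.
        pairs.append((frame_idx, audio[start:end]))
    return pairs


# Example: 25 fps video with 16 kHz audio gives 640 samples per frame.
example_pairs = pair_audio_with_frames([0.0] * 1600, num_frames=2, fps=25.0, sample_rate=16000)
```

Computing each span from the frame index, rather than accumulating a rounded per-frame length, keeps consecutive spans contiguous even when the sample rate is not a multiple of the frame rate.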
Describe any alternatives you have considered
No response
Additional context
No response
Check for duplicates