Open dingdongwang opened 10 months ago
They are in https://github.com/YuanGongND/ltu/blob/0fa0923f9c9d04346486a28477ba69b7d957130c/src/ltu/hf-dev/transformers-main/src/transformers/data/data_collator.py#L615-L616 (similar path for LTU-AS).
They cannot be in finetune.py/finetune_low_resource.py because the audio has to be loaded on the fly; otherwise there will be an OOM (we cannot keep all the audio in memory).
-Yuan
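To illustrate the on-the-fly idea described above, here is a minimal sketch of a collator that receives only file paths and reads the audio when a batch is assembled. This is not the actual LTU code: `LazyAudioCollator`, the `audio_path` key, and `fake_load_audio` are hypothetical names chosen for the example, and the real collator linked above may use different fields and feature extraction.

```python
from typing import Callable, Dict, List, Sequence


class LazyAudioCollator:
    """Hypothetical collator: audio is loaded only for the current batch."""

    def __init__(self, load_audio: Callable[[str], List[float]]):
        # load_audio is any function mapping a file path to audio features.
        self.load_audio = load_audio

    def __call__(self, features: Sequence[Dict]) -> Dict[str, list]:
        # Text is assumed to be tokenized already, so it is passed through.
        input_ids = [f["input_ids"] for f in features]
        # Audio is read here, per batch, so the full dataset never has to
        # sit in memory at once.
        audio = [self.load_audio(f["audio_path"]) for f in features]
        return {"input_ids": input_ids, "audio": audio}


def fake_load_audio(path: str) -> List[float]:
    # Stand-in for a real loader (assumption): returns 4 dummy values.
    return [float(len(path))] * 4


# The dataset stores only token ids and paths, not waveforms.
dataset = [
    {"input_ids": [1, 2, 3], "audio_path": "a.wav"},
    {"input_ids": [4, 5], "audio_path": "bb.wav"},
]

collate = LazyAudioCollator(fake_load_audio)
batch = collate(dataset[:2])
```

With this layout the training script only needs to keep paths in each example, which is why no audio loading step would appear in the fine-tuning script itself.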
The step that tokenizes the audio (from 'input_ids') seems to be missing from both finetune.py and finetune_low_resource.py in the LTU repo. Where is the code that performs the audio tokenization? I saw the 'load_audio()' function in inference_batch.py.