[Closed] heyLQ closed this issue 8 months ago
Hey there,
Sorry for the delayed reply. I have uploaded the hypotheses dataset we used for the paper to Hugging Face here: https://huggingface.co/datasets/PeacefulData/HyPoradise-v1-GigaSpeech. The "generate audio features" notebook simply adds the audio features (from the Whisper encoder) to this json/csv file and saves the result as a .pt checkpoint. You would need the path to each audio file to generate the audio features; you can map it back using the ID tags in the Hugging Face dataset.
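Roughly, the saving step works like this. The sketch below is illustrative, not the notebook's exact code: the field names (`id`, `input`, `output`, `audio_features`) and the random tensors standing in for real Whisper encoder output are assumptions, and the encoder call shown in the comment is the usual openai-whisper API.

```python
import os
import tempfile
import torch

# In the real notebook the features come from Whisper's encoder, e.g.:
#   model = whisper.load_model("base")
#   audio = whisper.pad_or_trim(whisper.load_audio(audio_path))
#   mel = whisper.log_mel_spectrogram(audio).to(model.device)
#   feats = model.encoder(mel.unsqueeze(0))
# Here random tensors stand in for that output so the sketch is self-contained.

def build_feature_checkpoint(records, out_path):
    """Attach (placeholder) audio features to each hypothesis record
    and save everything as a single .pt checkpoint."""
    data = []
    for rec in records:
        feats = torch.randn(1, 1500, 512)  # stand-in for encoder output
        data.append({**rec, "audio_features": feats})
    torch.save(data, out_path)

# One hypothesis record, keyed by an utterance ID so it can be mapped
# back to the audio file in the Hugging Face dataset.
records = [{"id": "utt_0001",
            "input": ["n-best hypothesis 1", "n-best hypothesis 2"],
            "output": "ground truth transcript"}]

out_path = os.path.join(tempfile.mkdtemp(), "features.pt")
build_feature_checkpoint(records, out_path)
reloaded = torch.load(out_path)
print(reloaded[0]["id"], tuple(reloaded[0]["audio_features"].shape))
```

The key point is the ID field: the checkpoint stores features alongside each record, so the ID is what ties a record back to its source audio.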
You can also generate your own json file for a custom dataset by following the notebook at https://github.com/Srijith-rkr/Whispering-LLaMA/blob/main/data_preparation/To%20generate%20n-best%20hypothesis.ipynb.
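For a custom dataset, the json you produce would contain one entry per utterance, with the n-best hypotheses as input and the reference transcript as output. A minimal sketch of writing such a file follows; the field names and example strings are illustrative (modeled on the HyPoradise-style hypotheses files), so check the notebook for the exact schema:

```python
import json
import os
import tempfile

# Illustrative record for one utterance of a custom dataset:
# "input" holds the n-best ASR hypotheses, "output" the reference
# transcript, and "id" lets you map back to the audio file later.
entry = {
    "id": "custom_utt_0001",
    "input": [
        "the quick brown fox jumps over the lazy dog",
        "the quick brown fox jumped over the lazy dog",
    ],
    "output": "the quick brown fox jumps over the lazy dog",
}

json_path = os.path.join(tempfile.mkdtemp(), "custom_train.json")
with open(json_path, "w") as f:
    json.dump([entry], f, indent=2)

# Reload to confirm the file round-trips cleanly.
with open(json_path) as f:
    loaded = json.load(f)
print(len(loaded), loaded[0]["id"])
```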
Could you please provide "/ibex/user/radhaks/LLMs/LLaMA_7B/LLAMA_EMNLP_DeepSpeed/dataset/inferences/gigaspeech_TRAIN.json", which is referenced on the second line of "To generate audio features"? Thanks!