srvk / how2-dataset

This repository contains code and metadata for the How2 dataset
https://srvk.github.io/how2-dataset/

Regarding How2 dataset #26

Closed prasadmagdum04 closed 1 year ago

prasadmagdum04 commented 1 year ago

I downloaded the English transcripts and English abstractive summaries from the given link, but I am not able to use them for transcript summarization. Specifically, I cannot separate the transcripts from each other: they are all stored in one huge file with no apparent separation between them. Please help me.

ramonsanabria commented 1 year ago

Transcripts for summarization are aligned at the video level. If you really want sentence-level alignment, you can run a forced aligner.

Thanks
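
For readers with the same problem, here is a minimal sketch of pairing transcripts with summaries at the video level. It assumes, and the thread does not confirm this, that each file holds one video per line with the video ID as the first whitespace-delimited token and the text after it. The function names `parse_keyed_lines` and `pair_by_video` and the IDs in the demo are hypothetical; check a few lines of the downloaded files before relying on this layout.

```python
from typing import Dict, Iterable, Tuple


def parse_keyed_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Map the leading ID token of each line to the rest of that line."""
    records: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # ASSUMPTION: the first whitespace-delimited token is the video ID.
        parts = line.split(maxsplit=1)
        records[parts[0]] = parts[1] if len(parts) > 1 else ""
    return records


def pair_by_video(
    transcripts: Dict[str, str], summaries: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return {video_id: (transcript, summary)} for IDs found in both inputs."""
    return {
        vid: (text, summaries[vid])
        for vid, text in transcripts.items()
        if vid in summaries
    }


if __name__ == "__main__":
    # Hypothetical in-memory lines standing in for the two downloaded files.
    transcript_lines = ["vid001 first we mix the flour", "vid002 hold the racket"]
    summary_lines = ["vid001 how to bake bread"]
    pairs = pair_by_video(
        parse_keyed_lines(transcript_lines), parse_keyed_lines(summary_lines)
    )
    for vid, (transcript, summary) in pairs.items():
        print(vid, "|", transcript, "|", summary)
```

To use it on real files, pass an open file object to `parse_keyed_lines`, since iterating over a file yields its lines. Videos that appear in only one of the two files are dropped from the result.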