k2-fsa / libriheavy

Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
Apache License 2.0

Data #1

Closed by aaaaamilk 1 year ago

aaaaamilk commented 1 year ago

Author, can you please provide specific instructions on how to use this dataset? I would greatly appreciate it.

pkufool commented 1 year ago

Please wait another two days. I'm writing the documentation.

aaaaamilk commented 1 year ago

OK, thanks.

pkufool commented 1 year ago

FYI:
The alignment pipeline: https://github.com/k2-fsa/text_search/tree/master/examples/libriheavy
Metadata: http://huggingface.co/datasets/pkufool/libriheavy
Paper: https://arxiv.org/abs/2309.08105