uhh-lt / kaldi-tuda-de

Scripts for training general-purpose large vocabulary German acoustic models for ASR with Kaldi.
Apache License 2.0

data hosted at http://dialogplus.lt.informatik.tu-darmstadt.de/ not accessible #11

Closed jukaradayi closed 7 years ago

jukaradayi commented 7 years ago

Hi,

I want to train an acoustic model on German speech, but I'm unable to download any data from http://dialogplus.lt.informatik.tu-darmstadt.de/downloads/speechdata/ .

I managed to get the wavs, but I can't find the source sentence archive for LM building. Do you have a mirror?

Thanks!

bmilde commented 7 years ago

Our group moved from Darmstadt to Hamburg, which is probably why the link isn't accessible anymore. Thanks for reporting this, I will look into it ASAP.

namxam commented 7 years ago

@bmilde Any update on this?

bmilde commented 7 years ago

I've found the maryfied LM data, but unfortunately not the source texts yet. You can download the data here:

https://ltnas1.informatik.uni-hamburg.de:8081/owncloud/index.php/s/jptWYT6dMa3gL1N/download

bmilde commented 7 years ago

Data has now moved to:

http://speech.tools/kaldi_tuda_de/German_sentences_8mil_filtered_maryfied.txt.gz

http://speech.tools/kaldi_tuda_de/german-speechdata-package-v2.tar.gz

Sorry for the inconvenience.