IBM / Train-Custom-Speech-Model

Create a custom Watson Speech to Text model using specialized domain data
https://developer.ibm.com/patterns/customize-and-continuously-train-your-own-watson-speech-service/
Apache License 2.0

EZdi dataset #94

Closed: gguerrab closed 3 years ago

gguerrab commented 3 years ago

I'm trying to create a custom STT model for medical dictation. I was trying to get the dataset from EZdi, but it is no longer available: https://www.ezdi.com/open-datasets/

Can anyone provide a copy of it or recommend another source?

rhagarty commented 3 years ago

@pvaneck - are you aware of this issue with the data? You can request a download, but it doesn't appear to do anything.

pvaneck commented 3 years ago

Hmm, I was not aware. It looks like there are some issues with the JS after form submission. I can try to reach out to some ezDI folks about this.

gguerrab commented 3 years ago

I would really appreciate it. I actually talked to the CEO. He was really nice and connected me to someone in the company who was supposed to help, but I haven’t heard back. Maybe you have better connections and luck.

pvaneck commented 3 years ago

I sent out an email to them last week, and from checking just now, the downloads seem to be working again. So whatever issue was causing the downloads not to work seems to be resolved.

afeltham commented 3 years ago

Can I check that I have the correct link for this? The page shown above doesn't seem to have any mention of downloads, but the most recent comment suggests that it is all working. Perhaps I've missed something. Thanks.

fi-da commented 3 years ago

From the Wayback Machine:
https://d376himta3otg7.cloudfront.net/opendataset/Documents.zip
https://d376himta3otg7.cloudfront.net/opendataset/Audio.zip

afeltham commented 3 years ago

Thank you very much @fi-da :)

prsnt558908 commented 3 years ago

@fi-da, the links above are not working. It would be really great if you could share them again.

Firelytical commented 3 years ago

The files are missing. Could someone please re-upload them to Google Drive or somewhere else? I can't seem to find them anywhere!

Firelytical commented 3 years ago

By the way, if I have my own audio file, can I use a random text file to represent the transcription? I'm just trying to work around the problem. Has anyone tested this?

pvaneck commented 3 years ago

Sorry everyone. Just updated the link.