Data collection for http://www.openslr.org/1/

ynop / audiomate

Python library for handling audio datasets.

https://audiomate.readthedocs.io/

MIT License


Data collection for http://www.openslr.org/1/ #84

Open emzee831 opened 4 years ago

emzee831 commented 4 years ago

Hello, looking for some direction for my first dataset contribution. I've copied the repo to my local computer and have installed the environment.

emzee831 commented 4 years ago

I'm also trying to run the test files and I'm getting this error:

Traceback (most recent call last):
  File "test_fluent_speech.py", line 8, in <module>
    from . import reader_test as rt
ImportError: cannot import name 'reader_test' from '__main__' (test_fluent_speech.py)

Sorry for all the questions, I'm new to this.

ynop commented 4 years ago

No problem.

To run the tests, use pytest. Running pytest in the repository's main folder executes all tests; you can also pass specific test files to run only those.

Alternatively, you can run them from an IDE, for example PyCharm: https://github.com/ynop/audiomate#running-the-test-suite
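
As background on the error above: a relative import such as `from . import reader_test` only resolves when the file is loaded as part of a package, which is not the case when the test file is executed directly as a script. The following is a minimal sketch of that difference, not taken from audiomate. The package name `demo_tests` and its files are hypothetical, and `python -m` stands in for a test runner so the example needs only the standard library. The exact wording of the ImportError differs between Python versions, so the sketch only checks that one is raised.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Run the same test module as a plain script and as a package module."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_tests"  # hypothetical package, for illustration only
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "reader_test.py").write_text("VALUE = 42\n")
        (pkg / "test_demo.py").write_text(
            "from . import reader_test as rt\n"
            "print(rt.VALUE)\n"
        )

        # Executed as a script: no parent package is known, so the
        # relative import raises ImportError.
        as_script = subprocess.run(
            [sys.executable, "test_demo.py"],
            cwd=pkg, capture_output=True, text=True,
        )

        # Executed as a module inside the package: the relative import resolves.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_tests.test_demo"],
            cwd=tmp, capture_output=True, text=True,
        )
        return as_script, as_module


if __name__ == "__main__":
    script_result, module_result = run_both_ways()
    print("script failed with ImportError:", "ImportError" in script_result.stderr)
    print("module exit code:", module_result.returncode)
```

Running the tests through pytest from the main folder, as suggested above, avoids the direct-script case in the same way.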

ynop commented 4 years ago

For the general workflow, I normally do the following:

Create a mock dataset. It should have the same structure as the original, but with only a few samples/utterances and with empty audio files.
Create a test, using some of the other tests as a reference. There is a base test, so you only have to specify which utterances, files, etc. are expected.
Implement the reader based on the mock dataset.
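
The three steps above can be sketched roughly as follows. This is a generic illustration using only the standard library, not audiomate's reader API or its base test. The directory layout (`audio/` plus `transcripts.txt`), the utterance ids, and the function names `create_mock_dataset` and `read_mock_dataset` are all assumptions made for the example; a real mock should mirror the layout of the actual corpus.

```python
import tempfile
import wave
from pathlib import Path


def create_mock_dataset(root):
    """Step 1: write a tiny stand-in corpus with empty (zero-frame) WAV files."""
    root = Path(root)
    audio_dir = root / "audio"  # hypothetical layout
    audio_dir.mkdir(parents=True, exist_ok=True)
    transcripts = {"utt-1": "yes no", "utt-2": "no yes", "utt-3": "yes yes"}
    for utt_id in transcripts:
        with wave.open(str(audio_dir / f"{utt_id}.wav"), "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)
            wav.setframerate(16000)
            wav.writeframes(b"")  # valid header, no samples
    lines = [f"{utt_id}\t{text}" for utt_id, text in transcripts.items()]
    (root / "transcripts.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return transcripts


def read_mock_dataset(root):
    """Step 3: a minimal reader that maps utterance ids to audio path and text."""
    root = Path(root)
    utterances = {}
    index = (root / "transcripts.txt").read_text(encoding="utf-8")
    for line in index.splitlines():
        utt_id, text = line.split("\t", 1)
        wav_path = root / "audio" / f"{utt_id}.wav"
        if not wav_path.is_file():
            raise FileNotFoundError(wav_path)
        utterances[utt_id] = {"path": wav_path, "text": text}
    return utterances


def test_reader_loads_all_mock_utterances():
    """Step 2: state what is expected; the reader is written to satisfy it."""
    with tempfile.TemporaryDirectory() as tmp:
        expected = create_mock_dataset(tmp)
        loaded = read_mock_dataset(tmp)
        assert set(loaded) == set(expected)
        assert loaded["utt-1"]["text"] == "yes no"
```

Because the audio files contain no samples, the mock stays small enough to commit alongside the tests, and pytest will pick up the `test_` function automatically.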