Jakobovski / free-spoken-digit-dataset

A free audio dataset of spoken digits. An audio version of MNIST.
626 stars 248 forks

2 line dataset access via hub #36

verbose-void closed 3 years ago

verbose-void commented 3 years ago

I put your dataset in hub as an official dataset so users can access it much faster than before. The code was kinda hard for new users to navigate, so this should make it a bit easier 😎

2 lines of code to access:

import hub
ds = hub.load("hub://activeloop/spoken_mnist")

# check out the first spectrogram and who spoke it!
import matplotlib.pyplot as plt
plt.imshow(ds.spectrograms[0].numpy())
plt.title(f"{ds.speakers[0].data()} spoke {ds.labels[0].numpy()}")
plt.show()

Available tensors can be shown by printing the dataset:

print(ds)
# prints: Dataset(path='hub://activeloop/spoken_mnist', tensors=['spectrograms', 'labels', 'audio', 'speakers'])
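To go past the first sample, here is a minimal sketch that builds the same title string for any index and tallies how often each digit occurs. It assumes only what the snippet above shows: the tensors are indexable, `labels[i].numpy()` yields the digit, and `speakers[i].data()` yields the speaker. The helper names `sample_title` and `label_counts` are hypothetical and not part of hub, and the `_FakeSample` class with its placeholder speaker names is a stand-in so the sketch runs without network access.

```python
from collections import Counter
from types import SimpleNamespace

import numpy as np


def sample_title(ds, index):
    """Build the plot title from the snippet above for the sample at `index`."""
    speaker = ds.speakers[index].data()
    label = ds.labels[index].numpy()
    return f"{speaker} spoke {label}"


def label_counts(ds, n):
    """Count how often each digit occurs among the first `n` samples."""
    counts = Counter()
    for i in range(n):
        # Assumption: each label is a scalar or a one-element array.
        digit = int(np.asarray(ds.labels[i].numpy()).ravel()[0])
        counts[digit] += 1
    return counts


class _FakeSample:
    """Stand-in for one tensor sample; mimics only .numpy() and .data()."""

    def __init__(self, value):
        self._value = value

    def numpy(self):
        return np.array([self._value])

    def data(self):
        return self._value


# Placeholder data, not taken from the real dataset.
fake_ds = SimpleNamespace(
    speakers=[_FakeSample("speaker_a"), _FakeSample("speaker_b"), _FakeSample("speaker_a")],
    labels=[_FakeSample(0), _FakeSample(1), _FakeSample(0)],
)

print(sample_title(fake_ds, 0))
print(label_counts(fake_ds, 3))
```

With the real dataset, the `ds` returned by `hub.load(...)` would be passed in place of `fake_ds`. Each index access may fetch data remotely, so a small `n` is a sensible starting point.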

verbose-void commented 3 years ago

If you would like me to make any revisions or remove this dataset, please let me know and I will take care of it!

Jakobovski commented 3 years ago

Thanks for the PR! These are some nice changes. I left a comment; please fix it and then I will merge. Thanks.

Jakobovski commented 3 years ago

Thanks