mlcommons/peoples-speech
The People’s Speech Dataset
https://mlcommons.org/en/peoples-speech/
Apache License 2.0
98 stars
12 forks
issues
Rafael/fa
#77
remg1997
opened
1 year ago
1
Wikimedia commons
#76
jeussej
opened
2 years ago
1
How to apply the dataset access
#75
czjbupt
opened
2 years ago
0
Galv/setuptools update
#74
galv
closed
2 years ago
1
Update data_access.md
#73
greg1232
opened
2 years ago
1
CVE-2007-4559 Patch
#72
TrellixVulnTeam
opened
2 years ago
1
Added step 0 setting up SSH on github
#71
Mike-Xie
opened
2 years ago
1
Download is taking over 150 days in US with 1 Gigabit internet
#70
kdonbekci
opened
2 years ago
0
Update CLA bot; add autoblack to exceptions
#69
morphine00
closed
2 years ago
1
Include changes for new tutorial
#68
remg1997
closed
2 years ago
1
Update CLA bot
#67
morphine00
closed
1 year ago
1
Update CLA bot
#66
morphine00
closed
1 year ago
1
Point to latest nemo in dockerfile.
#65
galv
opened
2 years ago
1
Galv/ia download unsupervised
#64
galv
opened
2 years ago
1
Galv/setuptools
#63
galv
opened
2 years ago
1
how to download people speech?(30000hours)
#62
zhu-gu-an
opened
2 years ago
0
Baseline for ASR?
#61
BuaaAlban
opened
2 years ago
0
Nemo training container
#60
JFCeron
closed
2 years ago
1
Opusdec convert to wav creates badheaders
#59
StuartIanNaylor
opened
2 years ago
0
Multilingual Data
#58
will-rice
closed
2 years ago
0
how to use peoples speech dataset?
#57
housebaby
opened
2 years ago
5
data occupy
#56
mozoltov183
opened
2 years ago
0
Test set
#55
JFCeron
opened
3 years ago
1
Test set
#54
JFCeron
closed
3 years ago
1
Daniel/update forced alignment pipeline
#53
galv
closed
2 years ago
1
Nemo training
#52
JFCeron
closed
3 years ago
2
Nemo training
#51
JFCeron
closed
3 years ago
1
Files for data deduplication in text for the paper
#50
Ciroye
opened
3 years ago
1
Example code for Juan.
#49
galv
closed
3 years ago
1
DSAlign cuts words on apostrophe boundaries.
#48
galv
opened
3 years ago
0
Truncating keys to length 100 sometimes causes duplication
#47
galv
opened
3 years ago
1
Things to do before neurips
#46
galv
opened
3 years ago
1
Create two datasets to distribute
#45
galv
opened
3 years ago
0
Data deduplication
#44
Ciroye
opened
3 years ago
1
Apache checker and black formatter into the CI pipeline
#43
Ciroye
closed
3 years ago
1
Juanciro/black-apache
#42
Ciroye
closed
3 years ago
1
Create test_apache.py
#41
Ciroye
closed
3 years ago
1
yamnet WIP
#40
galv
opened
3 years ago
1
Now that we are using black, we don't want to listen to pylint's formatting complaints.
#39
galv
closed
3 years ago
1
Speed up DSAlign
#38
galv
opened
3 years ago
1
Download missing audio files
#37
galv
closed
3 years ago
1
Fix "RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility"
#36
galv
opened
3 years ago
0
WIP: Tar File Format DataSource for Spark
#35
galv
closed
3 years ago
2
Create audio-based language-id system
#34
galv
opened
3 years ago
0
Daniel/spark 3.1 upgrade
#33
galv
closed
3 years ago
1
Make smaller subsets of the data
#32
galv
opened
3 years ago
0
Speed up DSAlign
#31
galv
opened
3 years ago
0
Implement text normalization
#30
galv
opened
3 years ago
1
Output dataset in WebDataset format
#29
galv
opened
3 years ago
1
Run AudioSet pre-trained YAMNet model on the entire dataset
#28
galv
opened
3 years ago
0