openspeech-team / openspeech

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
https://openspeech-team.github.io/openspeech/
MIT License
670 stars, 112 forks

KsponSpeech data: number of train and validation samples #192

Closed: HwangJae-won closed this issue 7 months ago

HwangJae-won commented 1 year ago

❓ Questions & Help

I'm trying to train on the KsponSpeech data. Is there a reason the numbers of training and validation samples are set as follows?

Details

I'm curious why the data is divided this way, because the validation set seems too small relative to the training set.

KSPONSPEECH_TRAIN_NUM = 620000
KSPONSPEECH_VALID_NUM = 2545
KSPONSPEECH_TEST_NUM = 6000
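For reference, the proportions being asked about can be worked out directly from the three constants. This is only a minimal sketch: `split_fractions` is a hypothetical helper written for illustration and is not part of openspeech.

```python
from typing import Dict

# Split sizes exactly as quoted in the question above.
KSPONSPEECH_TRAIN_NUM = 620000
KSPONSPEECH_VALID_NUM = 2545
KSPONSPEECH_TEST_NUM = 6000


def split_fractions(train: int, valid: int, test: int) -> Dict[str, float]:
    """Return each split's share of the total number of utterances.

    Hypothetical helper for illustration only; not an openspeech function.
    """
    total = train + valid + test
    return {
        "train": train / total,
        "valid": valid / total,
        "test": test / total,
    }


fractions = split_fractions(
    KSPONSPEECH_TRAIN_NUM, KSPONSPEECH_VALID_NUM, KSPONSPEECH_TEST_NUM
)

# Print each split's share as a percentage of the whole corpus.
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.2%}")
```

With these numbers the validation split comes out well under 1% of the total, which is the imbalance the question refers to.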

upskyy commented 1 year ago
It is written in the KsponSpeech paper as follows.

[image: excerpt from the KsponSpeech paper]

HwangJae-won commented 1 year ago

Thank you. I didn't read the KsponSpeech paper properly 😅 Thanks for the quick response. I'll keep that in mind!