kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.
http://kaldi-asr.org

WSJ setup is a mess #1846

Open danpovey opened 7 years ago

danpovey commented 7 years ago

[I hit send too soon on this; I'm updating the comment.]

I think the time might have come to create an 's5b' version of the WSJ setup. WSJ is the oldest setup and the local scripts are not up to the standard of clarity that we usually expect. Some specific issues:

Part of my motivation is that we'll be doing some RNNLM stuff with this setup (since we have example scripts for older setups) and the scripts need to be cleaner. I don't know if anyone has the time and inclination to work on this?

galv commented 7 years ago

I'm interested, but I have to admit that I am hard-pressed for time right now. I'll see if I can fool with it this weekend.

Some comments:

If someone else wants to do this, definitely don't feel turned away just because I am provisionally interested in it.

danpovey commented 7 years ago

Re why kaldi_lm is used: I think it may be because we were pruning, and it does a better job of pruning than the other toolkits. Its perplexity is also very slightly better than the others even for unpruned language models, and its license is better than SRILM's. But it is far from ideal in terms of documentation. I don't think there is one LM toolkit that we favor across the board, as they all have drawbacks. I'm not so concerned about that aspect, as it's very separable from the other issues and doesn't really interact with anything. What bothers me more is the structure of the scripts.

galv commented 7 years ago

In case anyone's interested in trying this out, I haven't done any work on this yet.

jtrmal commented 7 years ago

@danpovey, maybe you could mention what would be the subject of cleanup -- only local/ and conf/? Because changing/restructuring steps/ and utils/ would be a major change as it could affect other recipes. y.

danpovey commented 7 years ago

Definitely steps/ and utils/ would not be changed; these would be linked to the current location.
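
A minimal sketch of what that linking could look like, assuming the new directory is called s5b and sits next to the existing s5 directory (the name comes from the proposal at the top of this thread; the helper name `link_shared_dirs` and the throwaway directory layout are purely illustrative, not part of any Kaldi script):

```python
import os
import tempfile


def link_shared_dirs(recipe_root, old="s5", new="s5b", shared=("steps", "utils")):
    """Create `new` beside `old` and symlink the shared script dirs into it.

    Hypothetical helper: directory names are assumptions for illustration.
    """
    new_dir = os.path.join(recipe_root, new)
    os.makedirs(new_dir, exist_ok=True)
    for name in shared:
        link = os.path.join(new_dir, name)
        if not os.path.lexists(link):
            # Relative target, so the link keeps working if the checkout moves.
            os.symlink(os.path.join("..", old, name), link)
    return new_dir


# Demo in a temporary directory standing in for the recipe root.
root = tempfile.mkdtemp()
for name in ("steps", "utils"):
    os.makedirs(os.path.join(root, "s5", name))

s5b = link_shared_dirs(root)
print(os.readlink(os.path.join(s5b, "steps")))
```

With links like these, only local/ and conf/ would hold new files in the cleaned-up setup, so other recipes that depend on steps/ and utils/ would be unaffected.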

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 4 years ago

This issue has been automatically closed by a bot strictly because of inactivity. This does not mean that we think that this issue is not important! If you believe it has been closed hastily, add a comment to the issue and mention @kkm000, and I'll gladly reopen it.