Closed pzelasko closed 3 years ago
FYI, the data looks correctly constructed to me, also after checking random cuts (I added a new storage type in Lhotse, so I am also making sure the data pipeline works OK):
(In case you're wondering about the dark regions: these cuts are context-less, so they are concatenated just like in LibriSpeech. If I can get these working, I'll start training with the contextual cuts.)
(EDIT: actually, it's likely I have a bug where a small number of cuts are losing their supervisions, but it's unlikely that it triggers the k2 error.)
OK, never mind -- it turned out that these missing red boxes (the supervisions) were causing the problem. I'm not exactly sure what went wrong, but after I fixed the data pipeline, the training seems to run fine for ~200 steps now.
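A quick way to catch cuts that lost their supervisions before they reach training is sketched below. It is only an illustration: `StandInCut` is a hypothetical stand-in so the snippet runs without Lhotse, and the check assumes each cut exposes an `id` and a `supervisions` list.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class StandInCut:
    # Hypothetical stand-in for a cut; only the two attributes used below.
    id: str
    supervisions: list = field(default_factory=list)


def find_cuts_without_supervisions(cuts: Iterable) -> List[str]:
    """Return the IDs of cuts whose `supervisions` list is empty."""
    return [cut.id for cut in cuts if len(cut.supervisions) == 0]


cuts = [
    StandInCut("cut-0", supervisions=["sup-a"]),
    StandInCut("cut-1"),  # this one lost its supervisions
    StandInCut("cut-2", supervisions=["sup-b", "sup-c"]),
]
missing = find_cuts_without_supervisions(cuts)
print(missing)
```

Running something like this over the manifest once, before training, would probably have flagged the broken cuts directly.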
Likely there were NaNs or Infs in the nnet output.
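One way to test that hypothesis is to count non-finite values in the network output before it goes into the intersection. The sketch below is plain Python on a flat list of scores so it runs anywhere; the helper name `count_non_finite` and the sample values are made up for illustration. On a PyTorch tensor, `torch.isfinite` performs the same elementwise test.

```python
import math
from typing import Iterable


def count_non_finite(values: Iterable[float]) -> int:
    """Count entries that are NaN, +Inf or -Inf."""
    return sum(1 for v in values if not math.isfinite(v))


# Made-up scores for illustration; in practice these would be the flattened
# nnet output for one batch.
frame_scores = [-0.3, -1.2, float("nan"), -0.7, float("-inf")]
bad = count_non_finite(frame_scores)
if bad:
    print(f"found {bad} non-finite values in the nnet output")
```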
That'd be weird because restoring missing supervisions should not have affected the AM output; I was expecting that the supervision_segments arg to DenseFsaVec may have contained something unexpected, e.g. skipped an example_idx, or had a zero-duration segment, etc. -- but I didn't pursue this since everything worked.
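For reference, the kind of sanity check on `supervision_segments` described above could look like the sketch below. It assumes the rows are `(sequence_idx, start_frame, duration)` and works on plain lists (e.g. from `tensor.tolist()`). The function name, the requirement that every sequence has at least one segment, and the strict upper bound on `start_frame + duration` are my assumptions, not something k2 is known to enforce in this form.

```python
from typing import List, Sequence


def check_supervision_segments(
    segments: Sequence[Sequence[int]], num_sequences: int, num_frames: int
) -> List[str]:
    """Return human-readable problems found in (seq_idx, start, duration) rows."""
    problems = []
    seen = set()
    for row, (seq_idx, start, duration) in enumerate(segments):
        if 0 <= seq_idx < num_sequences:
            seen.add(seq_idx)
        else:
            problems.append(f"row {row}: sequence index {seq_idx} out of range")
        if duration <= 0:
            problems.append(f"row {row}: non-positive duration {duration}")
        if not 0 <= start < num_frames:
            problems.append(f"row {row}: start frame {start} out of range")
        elif duration > 0 and start + duration > num_frames:
            problems.append(f"row {row}: segment ends after frame {num_frames}")
    # Assumption: every sequence in the batch should be covered by a segment.
    for idx in range(num_sequences):
        if idx not in seen:
            problems.append(f"sequence {idx} has no supervision segment")
    return problems


# Made-up batch: sequence 1 is skipped and one segment has zero duration.
problems = check_supervision_segments(
    [(0, 0, 50), (2, 10, 0)], num_sequences=3, num_frames=60
)
for p in problems:
    print(p)
```

Calling this right before constructing the `DenseFsaVec` would make the failure show up with a readable message instead of during intersection.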
Hm. Probably it should have failed earlier than that, i.e. we should have detected the problem, whatever it was, earlier.
I am building the GigaSpeech recipe -- the data part seems mostly ready, but at the very start of training it fails during intersection. I'm looking at some things, but I'm not sure what the proper way to debug that is. Any suggestions?
The recipe is copied from LibriSpeech and practically identical. This is what I'm seeing:
@danpovey @csukuangfj