k2-fsa / icefall

https://k2-fsa.github.io/icefall/
Apache License 2.0

pretrained.py shows worse CER than decode.py? #142

Open tz301 opened 2 years ago

tz301 commented 2 years ago

I have trained my own model and tested it on my datasets. First, I decode with many parameter combinations, as in fine-tuning (using decode.py), and save the best parameters. I use --max-duration=20. I then save the decoding results obtained with the best parameters on all datasets (not just one).

Then I use these best parameters to decode the wave files one by one (using pretrained.py) on the same datasets.

All of my datasets show a slightly worse CER with pretrained.py. CER comparison:

decode.py:     3.190 12.802 17.995 9.569 14.478 10.299 16.242 7.329 20.695
pretrained.py: 3.203 13.029 18.177 9.662 14.610 10.447 16.463 7.333 20.911
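To put a number on "a little worse", here is a small sketch that computes the absolute and relative CER gap per dataset from the figures above. The dataset names are not given in the thread, so the sets are labeled by position only (a labeling assumption), and the values are assumed to be percentages.

```python
# CER values copied from the comparison above, in the order listed.
# Assumption: values are percentages; datasets are identified by position only.
decode_cer = [3.190, 12.802, 17.995, 9.569, 14.478, 10.299, 16.242, 7.329, 20.695]
pretrained_cer = [3.203, 13.029, 18.177, 9.662, 14.610, 10.447, 16.463, 7.333, 20.911]

# Absolute gap (pretrained.py minus decode.py) for each dataset.
abs_diff = [p - d for d, p in zip(decode_cer, pretrained_cer)]

# Relative gap, as a percentage of the decode.py CER.
rel_diff = [100.0 * (p - d) / d for d, p in zip(decode_cer, pretrained_cer)]

for i, (d, p, a, r) in enumerate(
    zip(decode_cer, pretrained_cer, abs_diff, rel_diff), start=1
):
    print(f"set {i}: decode={d:.3f} pretrained={p:.3f} abs={a:+.3f} rel={r:+.2f}%")
```

Every gap has the same sign (pretrained.py is worse on all nine sets), which points to a systematic difference between the two decoding paths rather than random noise.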

Is this normal? I see the feature extraction is not the same, will this be the reason?

csukuangfj commented 2 years ago

> I use --max-duration=20

What if you change it to --max-duration=1 so that there is only one utterance in a batch when using decode.py?

There is a convolutional layer in the conformer model and the padding in a batch may affect the result.
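As a toy illustration of that padding effect (not icefall code), the sketch below runs the same 1-D convolution over an utterance decoded alone and over the same utterance followed by padding frames, as it would appear in a batch. The kernel, the frame values, and `PAD_VALUE` are all hypothetical; in particular, the padding value is an assumption standing in for whatever a real feature pipeline uses.

```python
def conv1d_same(x, kernel):
    """1-D cross-correlation with zero 'same' padding at both ends."""
    pad = len(kernel) // 2
    xp = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * v for k, v in zip(kernel, xp[i:i + len(kernel)]))
        for i in range(len(x))
    ]


# Hypothetical kernel and a 10-frame "utterance" (one feature dimension).
kernel = [0.1, 0.2, 0.4, 0.2, 0.1]
utt = [0.5, -1.0, 2.0, 0.3, -0.7, 1.2, 0.0, -0.4, 0.9, 1.5]

# Assumption: padding frames hold a non-zero value (e.g. a log-energy floor).
PAD_VALUE = -23.0
batched = utt + [PAD_VALUE] * 6  # same utterance padded to a longer batch length

alone = conv1d_same(utt, kernel)
in_batch = conv1d_same(batched, kernel)[:len(utt)]  # keep only the valid frames

# Frames whose output changed because the kernel window reached into padding.
changed = [i for i, (a, b) in enumerate(zip(alone, in_batch)) if abs(a - b) > 1e-9]
print("frames affected by padding:", changed)
```

Only the frames within half a kernel width of the utterance end are affected in this single-layer toy. In a deep model with stacked layers the affected region can grow, which is one plausible way batch padding could shift results slightly.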

csukuangfj commented 2 years ago

> I see the feature extraction is not the same, will this be the reason?

Both feature extractors use the same parameters and should produce the same features.

tz301 commented 2 years ago

> I use --max-duration=20
>
> What if you change it to --max-duration=1 so that there is only one utterance in a batch when using decode.py?
>
> There is a convolutional layer in the conformer model and the padding in a batch may affect the result.

I used --max-duration=1 for decoding, but hit the MemoryError below. I also printed the data of the batch that triggered the error.

[Image: err]

danpovey commented 2 years ago

Can you try doing export K2_SYNC_KERNELS=1 and rerunning? Error might be earlier.

tz301 commented 2 years ago

> Can you try doing export K2_SYNC_KERNELS=1 and rerunning? Error might be earlier.

I ran export K2_SYNC_KERNELS=1 and reran it; see the output below.

[Image: err1]

danpovey commented 2 years ago

Hm, I think we're not quite drilling down into the error yet. It looks like the error may have occurred in _k2.index, which goes into C++ code. See if you can find it by running with gdb; you may need to do 'catch throw', e.g.:

    gdb --args python3 something.py --opt1 foo ... etc.
    (gdb) catch throw
    (gdb) r

...you may have to "continue":

    (gdb) c

if there are previous exceptions that are ignored by the program. Once you get to where the exception is raised, see if you can print out any relevant-looking local variables.