thu-spmi / CAT

A CRF-based ASR Toolkit
Apache License 2.0

Issue on speed of denominator calculation #24

Closed by t13m 4 years ago

t13m commented 4 years ago

Hi, first of all, thank you for sharing this code. I'm curious about the speed of the denominator calculation during training; can you share some information on this? Did you try this method on any Mandarin dataset, such as HKUST or AISHELL? I tried it on an in-house Mandarin dataset from my lab (the modeling unit is the syllable, with more than 1000 units), and the denominator calculation seems too slow. The generated denominator FST has about 2600 states and 24e5 arcs (it's a bigram syllable LM). It takes about 2.6 s to process one example with 200 frames on an RTX 2080 Ti GPU, and 7.8 s for 600 frames. Should I expect this? I think the reason might be the large number of modeling units, which leads to a large, dense denominator graph. Do you have a better way to handle this situation?

Thanks!
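To make the cost concrete, here is a minimal NumPy sketch of a log-domain forward pass over a denominator graph stored as parallel arc arrays. It is not the toolkit's implementation: the function name, the arc-array layout, and the choice to treat every state as final are all assumptions made for illustration. The point is only that each frame touches every arc once, so the work grows as frames × arcs.

```python
import numpy as np

def denominator_forward(log_probs, src, dst, label, weight, num_states, start=0):
    """Log of the total score of all length-T paths through the graph.

    log_probs: (T, U) array of per-frame log-probabilities over U units.
    src, dst, label, weight: parallel arrays, one entry per arc (hypothetical layout).
    Every state is treated as final (an assumption for this sketch).
    """
    alpha = np.full(num_states, -np.inf)
    alpha[start] = 0.0
    for t in range(log_probs.shape[0]):
        # One score per arc: this line is the O(num_arcs) work done per frame.
        arc_score = alpha[src] + weight + log_probs[t, label]
        new_alpha = np.full(num_states, -np.inf)
        # Unbuffered log-sum-exp accumulation of arc scores into destination states.
        np.logaddexp.at(new_alpha, dst, arc_score)
        alpha = new_alpha
    return np.logaddexp.reduce(alpha)

# Toy graph: one state with two self-loops, one per unit, each with weight 0.
src = np.array([0, 0])
dst = np.array([0, 0])
label = np.array([0, 1])
weight = np.zeros(2)
log_probs = np.log(np.array([[0.3, 0.7], [0.6, 0.4], [0.5, 0.5]]))
total = denominator_forward(log_probs, src, dst, label, weight, num_states=1)
```

With the figures above, 24e5 arcs and 200 frames give 4.8e8 arc evaluations per example, and tripling the frame count to 600 triples the work, which matches the reported 2.6 s versus 7.8 s.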

aky15 commented 4 years ago

Thank you for your interest in our work. You are right that the cause is the large number of modeling units. In our experiments, we use phones or characters as modeling units, and the denominator graph is about 1 MB in size, which enables fast forward and backward calculation. One possible solution is to prune the denominator graph, or to use sampling (instead of the full summation) to estimate the denominator of the loss.
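As a rough illustration of the pruning idea, the sketch below keeps only the k highest-weight outgoing arcs of each state. This is one possible pruning rule chosen for illustration, not necessarily what the toolkit or an LM pruning tool would do; the function name and arc-array layout are hypothetical, and a real setup would also need to decide how to handle backoff and renormalization after arcs are dropped.

```python
import numpy as np

def prune_arcs_top_k(src, dst, label, weight, k):
    """Keep at most k highest-weight outgoing arcs per source state.

    src, dst, label, weight: parallel arrays, one entry per arc (hypothetical layout).
    Weights are log-domain scores, so a larger value means a more likely arc.
    """
    # Sort arcs by source state, and by descending weight within each state.
    order = np.lexsort((-weight, src))
    src_sorted = src[order]
    # Start offset and size of each source-state group in the sorted order.
    group_start = np.r_[0, np.flatnonzero(np.diff(src_sorted)) + 1]
    counts = np.diff(np.r_[group_start, len(src_sorted)])
    # Rank of each arc within its group (0 = highest weight).
    rank = np.arange(len(src_sorted)) - np.repeat(group_start, counts)
    keep = np.sort(order[rank < k])  # restore the original arc order
    return src[keep], dst[keep], label[keep], weight[keep]

# Toy graph with two states and five arcs.
src = np.array([0, 0, 0, 1, 1])
dst = np.array([0, 1, 1, 0, 1])
label = np.array([0, 1, 2, 0, 1])
weight = np.log(np.array([0.2, 0.5, 0.3, 0.1, 0.9]))
p_src, p_dst, p_label, p_weight = prune_arcs_top_k(src, dst, label, weight, k=1)
```

Because the forward and backward passes visit every arc once per frame, dropping arcs reduces the per-frame cost in direct proportion, at the price of a denominator that no longer sums over all paths.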

aky15 commented 4 years ago

On the AISHELL dataset, the denominator graph built from a 4-gram LM has about 14000 states and about 280000 arcs. It takes about 1 s to process one example with 200 frames on my (single) Tesla P100 GPU, and 3 s for 600 frames.
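For a rough side-by-side of the two reports in this thread, the timings can be normalized to arc evaluations per second (arcs × frames / seconds). This is only back-of-the-envelope arithmetic on the approximate figures quoted above; the two measurements come from different GPUs and different graphs, so the helper below (a hypothetical name) gives an order-of-magnitude comparison, not a benchmark.

```python
def arc_frames_per_second(num_arcs, num_frames, seconds):
    """Arc evaluations per second, assuming every arc is visited once per frame."""
    return num_arcs * num_frames / seconds

# Syllable bigram graph (about 24e5 arcs) on an RTX 2080 Ti, as reported above.
syllable_200 = arc_frames_per_second(24e5, 200, 2.6)
syllable_600 = arc_frames_per_second(24e5, 600, 7.8)

# AISHELL 4-gram graph (about 280000 arcs) on a Tesla P100, as reported above.
aishell_200 = arc_frames_per_second(280000, 200, 1.0)
aishell_600 = arc_frames_per_second(280000, 600, 3.0)

print(f"syllable bigram: {syllable_200:.3g} and {syllable_600:.3g} arc-frames/s")
print(f"AISHELL 4-gram:  {aishell_200:.3g} and {aishell_600:.3g} arc-frames/s")
```

In both reports the rate is the same at 200 and 600 frames, so the time per example scales linearly with the number of frames for a fixed graph.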

t13m commented 4 years ago

Is your 4-gram LM built on syllables with tone (or without tone), or on initials and finals? I mean, how many modeling units are there in the LM, if you don't mind me asking? :)

aky15 commented 4 years ago

For AISHELL, we use the open-source lexicon at http://www.openslr.org/resources/33/resource_aishell.tgz. There are about 200 modeling units.

t13m commented 4 years ago

Thank you, this helps a lot.