Z-yq / TensorflowASR

A project dedicated to bringing CPU/on-device models close to GPU-model performance, with a real-time factor (RTF) below 0.1 on CPU
Apache License 2.0
461 stars 111 forks

transducer data process #1

Closed alsm168 closed 4 years ago

alsm168 commented 4 years ago

Hello, I want to train an RNN-Transducer. Where should prepand_blank() of TextFeaturizer be used? Thank you.

Z-yq commented 4 years ago

Hi, you can change the code in utils/text_featurizers.py to meet your needs.

Line 112, for the transducer:

feats=[self.start]+[self.token_to_index[token] for token in tokens]+[self.stop]

self.start defaults to the token 's'.

self.stop defaults to the token '/s'.

You can set 'blank_at_zeros = False' in am_data.yml, or modify their values.

By the way, wrap_rnnt_loss is more stable.
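
To make the line above concrete, here is a minimal sketch of how the start and stop indices wrap the token indices. ToyTextFeaturizer, its constructor arguments, and the vocabulary order are hypothetical stand-ins for illustration; they are not the repository's TextFeaturizer and may not match its index layout.

```python
class ToyTextFeaturizer:
    """Hypothetical stand-in for illustration; not the repository's class."""

    def __init__(self, tokens, start_token="s", stop_token="/s"):
        # Assumed vocabulary layout: start token, stop token, then regular tokens.
        vocab = [start_token, stop_token] + list(tokens)
        self.token_to_index = {tok: i for i, tok in enumerate(vocab)}
        self.start = self.token_to_index[start_token]
        self.stop = self.token_to_index[stop_token]

    def extract(self, tokens):
        # Same expression as the line quoted from utils/text_featurizers.py.
        feats = [self.start] + [self.token_to_index[token] for token in tokens] + [self.stop]
        return feats


featurizer = ToyTextFeaturizer(["a", "b", "c"])
labels = featurizer.extract(["a", "b", "c"])
print(labels)
```

With this assumed layout the label sequence starts with the start index and ends with the stop index, and no blank index appears anywhere in it.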

alsm168 commented 4 years ago

Thanks for your reply. To use rnnt-loss, should blank be placed as 's blank a b c /s' or 'blank s a b c /s'?

Z-yq commented 4 years ago

Sorry, I didn't reply in time.

Right, we should not insert blank into the label; that step is done in the loss function.
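
As a rough illustration of why the label needs no blank, here is a sketch of the standard RNN-T forward recursion in plain Python. It is not the code of wrap_rnnt_loss or of this repository; the function names, the nested-list input shape, and blank index 0 are assumptions made for the example. The loss sums over every alignment that interleaves blanks with the labels, so the blank symbol only ever appears inside the loss.

```python
import math


def logsumexp2(a, b):
    # Numerically stable log(exp(a) + exp(b)) that tolerates -inf inputs.
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def rnnt_neg_log_likelihood(log_probs, labels, blank=0):
    """log_probs[t][u][k]: log-probability of symbol k at frame t after u labels.

    labels holds token indices only; it contains no blank.
    """
    T = len(log_probs)
    U = len(labels)
    alpha = [[-math.inf] * (U + 1) for _ in range(T)]
    alpha[0][0] = 0.0
    for t in range(T):
        for u in range(U + 1):
            if t == 0 and u == 0:
                continue
            stay = -math.inf
            emit = -math.inf
            if t > 0:
                # Blank is emitted here, by the loss, to move to the next frame.
                stay = alpha[t - 1][u] + log_probs[t - 1][u][blank]
            if u > 0:
                # A real label is emitted; the frame index does not change.
                emit = alpha[t][u - 1] + log_probs[t][u - 1][labels[u - 1]]
            alpha[t][u] = logsumexp2(stay, emit)
    # Every alignment ends with one final blank from the last lattice node.
    return -(alpha[T - 1][U] + log_probs[T - 1][U][blank])


# Toy case: 2 frames, 1 label, 3 output symbols with uniform probabilities.
T, U, V = 2, 1, 3
uniform = math.log(1.0 / V)
log_probs = [[[uniform] * V for _ in range(U + 1)] for _ in range(T)]
loss = rnnt_neg_log_likelihood(log_probs, labels=[1])
print(loss)
```

In this toy case there are two valid alignments, each with probability (1/3)**3, and the recursion accounts for both even though the label passed in is just [1].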

alsm168 commented 4 years ago

Thank you, I got it. 😂
