Lhx94As / PHO-LID

PHO-LID: A Unified Model to Incorporate Acoustic-Phonetic and Phonotactic Information for Language Identification
MIT License

seq #2

Open whh07141 opened 7 months ago

whh07141 commented 7 months ago

Hi, I want to know how to set T'_i. I have already extracted the speech representations.

Lhx94As commented 7 months ago

Hi,

You can do this as a pre-processing step, either before or during training, using a torch reshape: reshape the features from (T, feat_dim) into (floor(T/20), 20, feat_dim) and discard the leftover frames. For example, with T = 101 and feat_dim = 20, (101, 20) => (5, 20, 20), and the last frame is dropped.
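A minimal sketch of this reshaping, assuming the features for one utterance are a 2-D tensor of shape (T, feat_dim). The function name `segment_frames` and the parameter `seg_len` are illustrative only and are not taken from the PHO-LID code:

```python
import torch


def segment_frames(features: torch.Tensor, seg_len: int = 20) -> torch.Tensor:
    """Reshape (T, feat_dim) into (T // seg_len, seg_len, feat_dim).

    Frames that do not fill a whole segment are discarded.
    `segment_frames` and `seg_len` are hypothetical names for illustration.
    """
    num_frames, feat_dim = features.shape
    num_segments = num_frames // seg_len
    # Keep only the frames that fill complete segments.
    trimmed = features[: num_segments * seg_len]
    return trimmed.reshape(num_segments, seg_len, feat_dim)


# Example matching the shapes above: 101 frames, 20-dim features.
features = torch.randn(101, 20)
segments = segment_frames(features)
print(segments.shape)  # torch.Size([5, 20, 20])
```

Trimming before the reshape is needed because `reshape` requires the number of elements to match exactly; with 101 frames, the final frame is dropped so that 100 frames split evenly into 5 segments of 20.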

Hope this helps.