Closed by szcf-weiya 4 years ago
I think the reason for putting all of a speaker's phonemes into either the training set or the test set is to keep the test set independent of the training set. Otherwise, the final accuracy tends to be inflated, because the model has already seen information about the speaker through that speaker's other phonemes in the training set before it is evaluated on the test set.

https://github.com/szcf-weiya/ESL-CN/blob/3a6336b79389e1eb8b3893f3fa51a607b7997d82/code/Ex.12.5/main.jl#L131-L156

The accuracy is as follows:
julia> accs
3×4 Array{Float64,2}:
0.811805 0.835757 0.851155 0.842601
0.867408 0.875962 0.88195 0.890505
0.867408 0.886228 0.885372 0.893926
Here is a comparison of the contingency tables for two settings, the ones with the lowest and the highest accuracy above:
J = 5, K = 1
5×5 Named Array{Int64,2}
Dim1 ╲ Dim2 │ aa ao dcl iy sh
────────────┼────────────────────────
aa │ 112 64 0 0 0
ao │ 68 186 5 4 0
dcl │ 0 4 185 6 0
iy │ 11 39 4 244 13
sh │ 0 0 0 2 222
J = 15, K = 7
5×5 Named Array{Int64,2}
Dim1 ╲ Dim2 │ aa ao dcl iy sh
────────────┼────────────────────────
aa │ 130 46 0 0 0
ao │ 64 198 1 0 0
dcl │ 0 0 192 3 0
iy │ 1 0 8 301 1
sh │ 0 0 0 0 224
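As a sanity check, the two tables above can be tied back to the `accs` matrix: accuracy is the sum of the diagonal (correct classifications) divided by the total count. The following is a minimal sketch with the counts copied by hand from the tables; the names `tab_5_1`, `tab_15_7`, and `accuracy` are made up for illustration and do not come from `main.jl`.

```julia
using LinearAlgebra  # provides tr (the matrix trace)

# Counts copied from the J = 5, K = 1 table (order: aa, ao, dcl, iy, sh)
tab_5_1 = [112  64   0   0   0;
            68 186   5   4   0;
             0   4 185   6   0;
            11  39   4 244  13;
             0   0   0   2 222]

# Counts copied from the J = 15, K = 7 table
tab_15_7 = [130  46   0   0   0;
             64 198   1   0   0;
              0   0 192   3   0;
              1   0   8 301   1;
              0   0   0   0 224]

# Hypothetical helper: fraction of observations on the diagonal
accuracy(tab) = tr(tab) / sum(tab)

println(round(accuracy(tab_5_1), digits = 6))   # 0.811805
println(round(accuracy(tab_15_7), digits = 6))  # 0.893926
```

These values match the smallest and largest entries of `accs`. Most of the improvement comes from fewer `iy` observations being confused with `aa` and `ao`, while the `aa`/`ao` confusion itself remains the main source of error in both tables.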
I also plotted the (smoothed) prototypes, which can be treated as features extracted from the original data.
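For reference, the speaker-wise split discussed at the top of this comment can be sketched roughly as below. This is only an illustration of the idea, not the code in `main.jl`: the function `speaker_split`, its `test_frac` keyword, and the toy speaker labels are all assumptions made for the example.

```julia
using Random

# Hypothetical helper: split observation indices so that all observations
# of a given speaker land either in the training set or in the test set.
function speaker_split(speakers::AbstractVector; test_frac = 0.25,
                       rng = Random.default_rng())
    ids = shuffle(rng, unique(speakers))          # random order of speakers
    ntest = round(Int, test_frac * length(ids))   # number of test speakers
    test_ids = Set(ids[1:ntest])
    test_idx = findall(s -> s in test_ids, speakers)
    train_idx = findall(s -> !(s in test_ids), speakers)
    return train_idx, test_idx
end

# Toy example: one speaker label per observation (made-up labels)
speakers = ["s1", "s1", "s2", "s3", "s3", "s3", "s4", "s4"]
train_idx, test_idx = speaker_split(speakers; test_frac = 0.5,
                                    rng = MersenneTwister(1))

# No speaker appears on both sides of the split
println(isempty(intersect(speakers[train_idx], speakers[test_idx])))  # true
```

Splitting on speaker IDs rather than on individual observations is what keeps the test accuracy from being inflated by speaker-specific information seen during training.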