Open janniss91 opened 1 year ago
Per-frame phone prediction with kNN led to ~30% accuracy. For full phones (all frames of a phone combined by majority vote), kNN predictions led to ~73% accuracy.
These values could be attained in spite of the small number of speakers and different dialects.
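To illustrate how per-frame predictions are combined into a per-phone prediction, here is a minimal sketch and not the code used for the numbers above. The frame length, hop size, Hann window, log-magnitude spectrum, `k=5`, and the synthetic tones standing in for phone segments are all assumptions made for the example. The function names are hypothetical.

```python
from collections import Counter

import numpy as np


def frame_spectra(signal, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping windows and return log-magnitude spectra.

    Assumes len(signal) >= frame_len; frame_len and hop are illustrative values.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))


def knn_predict(train_X, train_y, X, k=5):
    """Plain Euclidean kNN: one predicted label per row of X."""
    dists = np.linalg.norm(X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return [Counter(train_y[idx]).most_common(1)[0][0] for idx in nearest]


def classify_phone(train_X, train_y, signal, k=5):
    """Classify every frame of one phone segment, then take a majority vote."""
    frame_labels = knn_predict(train_X, train_y, frame_spectra(signal), k)
    return Counter(frame_labels).most_common(1)[0][0]


def tone(freq, rng, n=2048, sr=16000):
    """Synthetic stand-in for a phone segment: a noisy sinusoid (not TIMIT data)."""
    t = np.arange(n) / sr
    return np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(n)


rng = np.random.default_rng(0)

# Build a tiny training set from two made-up "phones" labelled "lo" and "hi".
lo_frames = frame_spectra(tone(500, rng))
hi_frames = frame_spectra(tone(3000, rng))
train_X = np.vstack([lo_frames, hi_frames])
train_y = np.array(["lo"] * len(lo_frames) + ["hi"] * len(hi_frames))

# Classify unseen segments: per-frame kNN followed by a majority vote.
pred_lo = classify_phone(train_X, train_y, tone(500, rng))
pred_hi = classify_phone(train_X, train_y, tone(3000, rng))
print(pred_lo, pred_hi)
```

The vote can outvote a minority of misclassified frames within a phone, which is one plausible reason per-phone accuracy ends up well above per-frame accuracy.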
Use the sample from the TIMIT dataset to perform phone classification.
- Read the TIMIT data and make it possible to query for all occurrences of the same phones
- Find a simple way of classifying phones (kNN?)
- for that the phones will be split into short analysis windows and then transformed to the frequency domain
- once this is done, kNN could be applied and a majority vote among all windows could be used to determine the phone type
- potential problems: