Open · haijing1995 opened this issue 1 year ago
Hey, I'm sorry which results are you referring to?
The audio-only results for the LSTM and TCN in /docs/report.md.
Hey, thanks for noting these results; they were part of the paper during the development process.
The "HighOrder" features are just the standard statistics (mean, median, second-order and third-order moments, max, min) extracted from a mel spectrogram.
By the way, I don't think these results are all that "good"; we obtained some notably better results with self-supervised learning, such as in this paper.
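For concreteness, here is a minimal sketch of how such "HighOrder" statistics could be computed, assuming librosa for the mel spectrogram and interpreting "second order" / "third order" as the second and third central moments; the sample rate and number of mel bands are illustrative, not the repo's actual settings:

```python
import numpy as np
import librosa

def high_order_features(wav_path, sr=16000, n_mels=40):
    # Load the audio and compute a log-mel spectrogram of shape (n_mels, n_frames).
    y, sr = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel)

    # Per-band statistics over the time axis.
    mean = log_mel.mean(axis=1)
    median = np.median(log_mel, axis=1)
    second = ((log_mel - mean[:, None]) ** 2).mean(axis=1)  # second central moment (variance)
    third = ((log_mel - mean[:, None]) ** 3).mean(axis=1)   # third central moment
    maximum = log_mel.max(axis=1)
    minimum = log_mel.min(axis=1)

    # One fixed-length vector per recording: 6 statistics per mel band.
    return np.concatenate([mean, median, second, third, maximum, minimum])
```

This collapses the variable-length time axis into a fixed-size vector, which is why these features can be fed to a simple classifier regardless of recording length.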
Thanks for your reply. I have a few more questions:
Each participant's answer has a different duration, so the extracted features (e.g. the mel spectrogram) also differ in length.
We used a batch size of 1 for training, which did not add any padding.
Different participants also had different numbers of responses. In order to train in batches, how do you unify these two dimensions (not by learning the x-feature from the paper you mentioned)?
We really did train with a batch size of 1 for most papers, since the difference between samples is, as you mention, substantial. However, as a note from us: the dataset is very small by common scientific standards, which leads to very large variance between experiments, so do not expect to run our experiments a single time and obtain the same result. On this dataset the random seed has a far larger impact than most "optimization" methods.
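For what it's worth, a minimal sketch of what batch-size-1 training on variable-length sequences can look like; the PyTorch LSTM, toy dataset, hidden size, and learning rate below are illustrative assumptions, not the repo's actual configuration:

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, n_mels=40, hidden=128, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_mels, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                      # x: (1, n_frames, n_mels), n_frames varies
        _, (h, _) = self.lstm(x)
        return self.head(h[-1])                # classify from the final hidden state

class ToySet(torch.utils.data.Dataset):
    """Stand-in dataset: variable-length sequences of mel frames with binary labels."""
    def __init__(self, n=8, n_mels=40):
        self.items = [
            (torch.randn(int(torch.randint(50, 300, (1,))), n_mels),
             int(torch.randint(0, 2, (1,))))
            for _ in range(n)
        ]
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        return self.items[i]

torch.manual_seed(0)                           # results vary strongly with the seed on small data
model = LSTMClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# With batch_size=1 each sequence keeps its own length, so no padding is needed.
loader = torch.utils.data.DataLoader(ToySet(), batch_size=1, shuffle=True)
for feats, label in loader:                    # feats: (1, n_frames, n_mels), label: (1,)
    optimizer.zero_grad()
    loss = criterion(model(feats), label)
    loss.backward()
    optimizer.step()
```

Because of the seed sensitivity mentioned above, it is worth repeating such a run over several seeds and reporting the spread rather than a single number.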
Thanks a lot for your help, I will try.
Hello, the audio-only results in the docs look great. Could you tell me which features you used and how the model was constructed?