This is done. CNNs didn't work. RNNs worked in a sense. They don't seem to able to generalize to unseen subject, can just do unseen times of the same subjects. Looks like a dead end. Closing this. I think we need to do a raw audio approach at somepoint
This is done. CNNs didn't work. RNNs worked in a sense. They don't seem to able to generalize to unseen subject, can just do unseen times of the same subjects. Looks like a dead end. Closing this. I think we need to do a raw audio approach at somepoint