Closed hungbie closed 5 years ago
Hi @hungbie, thank you for your kind suggestions. Yeah, true, it is not fully convolutional. If you can work out how the last layers have to be implemented, we can update the model.
It is possible to get it to work with variable input without the fully convolutional architecture. I will try to update the code for that, as it is already on my checklist.
I implemented it in PyTorch, but the overall approach is the same. Instead of the last maxpool and flatten, you do a convolution with the number of filters equal to num_classes and a filter size of (1,1,4) or whatever, so that your final output has a shape like (num_frames, num_classes, 1, 1), and then just reshape to remove the last 2 dimensions.
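A rough sketch of what such a head could look like in PyTorch is below. It is not the code from this repo or from the paper. The class name `ConvHead`, the 64 input channels, the 88 classes, and the assumed feature-map shape of (num_frames, channels, 1, 4) are all hypothetical values picked for illustration, and the kernel is written as (1, 4) for a 2D convolution rather than the (1,1,4) mentioned above.

```python
import torch
import torch.nn as nn


class ConvHead(nn.Module):
    """Hypothetical head replacing the final maxpool + flatten + dense layers.

    Assumes the preceding conv stack yields features of shape
    (num_frames, in_channels, 1, 4); adjust kernel_size to whatever
    spatial size is actually left at that point in the network.
    """

    def __init__(self, in_channels: int, num_classes: int, kernel_size=(1, 4)):
        super().__init__()
        # One filter per class; the kernel spans the remaining spatial extent,
        # so each frame collapses to a single value per class.
        self.classifier = nn.Conv2d(in_channels, num_classes, kernel_size=kernel_size)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        logits = self.classifier(features)  # (num_frames, num_classes, 1, 1)
        # Drop the two trailing singleton dimensions.
        return logits.squeeze(-1).squeeze(-1)  # (num_frames, num_classes)


# Illustrative sizes only: 64 channels and 88 classes are assumptions.
head = ConvHead(in_channels=64, num_classes=88)

# The same head accepts any number of frames.
for num_frames in (1, 7, 32):
    out = head(torch.randn(num_frames, 64, 1, 4))
    print(num_frames, tuple(out.shape))
```

Because nothing in the head depends on num_frames, the number of frames can change from call to call without rebuilding the model.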
This is a good attempt, especially at creating the generator, but there are many problems with it. One of the reasons it does not work with variable input is that it does not follow the author's fully convolutional architecture. Moreover, the author (who is not the owner of this repo) does not describe his architecture very well.