Some questions when reproducing your paper

Hi author, I am having trouble reproducing the 63.9% accuracy on DAiSEE reported in your paper. Could you please help me? It won't take much of your time.

I have added weighted sampling to the training DataLoader (not used in testing), and my TCN parameters are the same as in the paper. I see that your video clipping code extracts around 300 frames per video, whereas I only extract around 150. Could this be the main problem? Also, is it necessary to detect faces and crop them?
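To make the weighted sampling part concrete, here is a minimal sketch of the kind of per-sample weights I mean, assuming inverse class frequency weighting. The helper name `inverse_frequency_weights` and the label list are made up for illustration and are not taken from your repository. In PyTorch, a list like this would typically be passed to `torch.utils.data.WeightedRandomSampler`.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one sampling weight per sample: 1 / (size of that sample's class).

    Hypothetical helper for illustration; not part of the ResNet-TCN code.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


# Made-up labels standing in for the four engagement levels (0-3).
# The real class distribution in the dataset is different.
labels = [0, 1, 2, 2, 2, 3, 3, 3, 3, 3]
weights = inverse_frequency_weights(labels)

# Each class ends up with the same total weight, so rare classes are
# drawn about as often as frequent ones when sampling with replacement.
total_per_class = {
    c: sum(w for w, y in zip(weights, labels) if y == c) for c in set(labels)
}
```

If your setup weights the classes differently, please let me know, since that could also explain the gap.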
Is a specific optimization strategy needed, and should I use the SGD or Adam optimizer?
How many epochs are needed to reach 63.9% accuracy?
Is there anything else in your current code that needs to be changed? If so, please specify. Thank you very much for reading, and I look forward to hearing from you! :D

abedidev / ResNet-TCN