MichiganCOG / Gaze-Attention

Integrating Human Gaze into Attention for Egocentric Activity Recognition (WACV 2021)
MIT License

Reproducing results #4

Open faderani opened 3 years ago

faderani commented 3 years ago

Hi and thanks for the great work.

I'm having difficulty reproducing the results reported on the EGTEA Gaze+ dataset. Using your provided trained weights and following the code usage guide, I get these numbers on the different splits:

test_split1.txt : acc: 36.85, 49.21 / 0:22:27
test_split2.txt : acc: 47.65, 57.44 / 0:15:22
test_split3.txt : acc: 50.41, 60.14 / 0:15:07

How should I reproduce 69.73%?

I'm using the default parameters:

parser.add_argument('--mode', default='test', help='train | test')
parser.add_argument('--crop', type=int, default=224, help='for spatial cropping')
parser.add_argument('--trange', type=int, default=24, help='temporal range')
parser.add_argument('--stride', type=int, default=8, help='pooling stride for gaze prediction')
parser.add_argument('--b', type=int, default=1, help='batch size')
parser.add_argument('--wd', type=float, default=4e-5, help='weight decay')
parser.add_argument('--it1', type=int, default=8000, help='first decay point')
parser.add_argument('--it2', type=int, default=15000, help='second decay point')
parser.add_argument('--iters', type=int, default=18000, help='number of max iterations for training')
parser.add_argument('--lr', type=float, default=0.032, help='learning rate')
parser.add_argument('--ngpu', type=int, default=1, help='number of GPUs to use')
parser.add_argument('--eps', type=float, default=1000, help='epsilon for the gradient estimator')
parser.add_argument('--anneal', type=float, default=1e-3, help='anneal rate for epsilon')

parser.add_argument('--datapath', default='dataset', help='path to dataset')
parser.add_argument('--datasplit', type=int, default=1, help='data split for the cross validation')
parser.add_argument('--weight', default='weights/i3d_iga_best1_base.pt', help='path to the weight file for the base network')
parser.add_argument('--seed', type=int, default=1, help='random seed')
parser.add_argument('--test_sparse', action='store_true', help='whether to test sparsely for fast evaluation')

kylemin commented 3 years ago

I think the problem is the optical flow. For the optical flow frames, I used this repository: link. Building it produces libpydenseflow.so, which you can then use to extract the flow frames.

I hope it helps.
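
For reference, the gap between the numbers quoted in the question and the reported 69.73% can be quantified with a short script. This is only a sketch: the thread does not say what the two values after "acc:" measure, nor whether 69.73% is a per-split or split-averaged figure, so the names `parse_results`, `mean_first`, and `mean_second` are placeholders and the averaging is an assumption for illustration.

```python
import re

# Result lines copied from the question. What the two numbers after "acc:"
# measure is not stated in the thread, so they are treated as generic
# "first" and "second" values (an assumption of this sketch).
RESULTS = """\
test_split1.txt : acc: 36.85, 49.21 / 0:22:27
test_split2.txt : acc: 47.65, 57.44 / 0:15:22
test_split3.txt : acc: 50.41, 60.14 / 0:15:07
"""

# Matches "<split file> : acc: <first>, <second>" and ignores the trailing runtime.
LINE_RE = re.compile(r"(\S+)\s*:\s*acc:\s*([\d.]+),\s*([\d.]+)")


def parse_results(text):
    """Return a list of (split_name, first, second) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            rows.append(
                (match.group(1), float(match.group(2)), float(match.group(3)))
            )
    return rows


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


rows = parse_results(RESULTS)
mean_first = mean(row[1] for row in rows)
mean_second = mean(row[2] for row in rows)

REPORTED = 69.73  # figure quoted in the question

print(f"mean of first values:  {mean_first:.2f}")
print(f"mean of second values: {mean_second:.2f}")
print(f"gap to {REPORTED}: {REPORTED - mean_first:.2f} / {REPORTED - mean_second:.2f}")
```

Under either reading, every split falls well short of the reported figure, which points to a systematic input difference such as the flow frames and not to run-to-run noise.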