piergiaj / pytorch-i3d

Apache License 2.0

Evaluation problem #34

Closed yuanmu97 closed 5 years ago

yuanmu97 commented 5 years ago

According to the reported evaluation results:

Top classes and probabilities
227 1.0 41.813683 playing cricket
161 1.497162e-09 21.493988 hurling (sport)
48 3.8430797e-10 20.134106 catching or throwing baseball
50 1.54923e-10 19.22559 catching or throwing softball
153 1.1360122e-10 18.915356 hitting baseball
246 8.801088e-11 18.660122 playing tennis
237 2.441537e-11 17.377874 playing kickball
245 1.153184e-11 16.627773 playing squash or racquetball
297 6.1318776e-12 15.996162 shooting goal (soccer)
148 4.391727e-12 15.662385 hammer throw
143 2.2134183e-12 14.9772 golf putting
358 1.6307032e-12 14.671674 throwing discus
166 1.545616e-12 14.618085 javelin throw
256 7.6689886e-13 13.917261 pumping fist
298 5.192929e-13 13.527373 shot put
51 4.2681014e-13 13.331246 celebrating
3 2.7205254e-13 12.880902 applauding
357 1.8356944e-13 12.487498 throwing ball
94 1.6134419e-13 12.358446 dodgeball
349 1.1388308e-13 12.010079 tap dancing

But using the converted 'models/rgb_imagenet.pt', I got these top-20 logits and class indices:

(tensor([[14.4069, 14.3028,  8.4564,  8.0338,  7.8161,  6.5047,  6.4402,  6.4330,
           6.1487,  5.7266,  5.6314,  5.4535,  5.2553,  5.1976,  5.1108,  5.0649,
           5.0340,  5.0309,  4.8891,  4.7942]], device='cuda:0',
        grad_fn=<TopkBackward>),
 tensor([[148, 358, 298, 153, 166, 161, 150, 385, 201,  23, 227, 121, 212, 327,
          240, 378,  65, 297,  48, 246]], device='cuda:0'))

The test code I use is:

import numpy as np
import torch

from pytorch_i3d import InceptionI3d

i3d = InceptionI3d(400, in_channels=3)
i3d.load_state_dict(torch.load('models/rgb_imagenet.pt'))
i3d.eval()
i3d.cuda()

inp = np.load("v_CricketShot_g04_c01_rgb.npy")
inp = inp.reshape(1,3,79,224,224)
inp = torch.from_numpy(inp).cuda()
logits = torch.mean(i3d(inp), dim=2)
torch.topk(logits, 20)

What's the problem here?

Thanks!

yuanmu97 commented 5 years ago

The reference results are from https://github.com/deepmind/kinetics-i3d/blob/master/evaluate_sample.py

piergiaj commented 5 years ago

According to these lines: https://github.com/deepmind/kinetics-i3d/blob/master/evaluate_sample.py#L71-L73

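The numpy file is saved as [1, time, height, width, channels].

So when you do: inp = inp.reshape(1,3,79,224,224)

you are scrambling the input data: reshape only reinterprets the same flat buffer under new dimensions, it does not move the channel axis. Keep the array in its proper shape, [1, 79, 224, 224, 3], then transpose the axes to get the desired input shape, [1, 3, 79, 224, 224].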

Also, make sure the models are the same. Kinetics-400 and 600 have different labels, and the corresponding indices may be different between them.
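
The reshape-versus-transpose point above can be sketched with NumPy alone. This is a minimal illustration, not code from either repository: the helper name to_channels_first is hypothetical, and the tiny array stands in for the real [1, 79, 224, 224, 3] clip so the difference is easy to inspect.

```python
import numpy as np


def to_channels_first(clip):
    """Reorder [batch, time, height, width, channels]
    to [batch, channels, time, height, width].

    Hypothetical helper for illustration; not part of pytorch-i3d.
    """
    if clip.ndim != 5:
        raise ValueError("expected a 5-D array, got %d dimensions" % clip.ndim)
    # transpose moves whole axes; ascontiguousarray makes a contiguous copy
    return np.ascontiguousarray(np.transpose(clip, (0, 4, 1, 2, 3)))


# Tiny stand-in clip: 1 batch, 4 frames, 2x2 pixels, 3 channels.
clip = np.arange(1 * 4 * 2 * 2 * 3, dtype=np.float32).reshape(1, 4, 2, 2, 3)

fixed = to_channels_first(clip)           # axes moved, values stay with their pixel
scrambled = clip.reshape(1, 3, 4, 2, 2)   # same target shape, values reassigned

print(fixed.shape, scrambled.shape)
print(np.array_equal(fixed, scrambled))
```

Both arrays end up with the same shape, which is why the reshape version runs without an error and only shows up as wrong predictions. The transposed array can then be passed through torch.from_numpy as in the original snippet.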

yuanmu97 commented 5 years ago

Solved it! Thanks for your help~

GayatriPurandharT commented 4 years ago

@openyuanmu may I know how you reshaped the input?
My features have shape (1, 74, 224, 224, 3), and I get:
RuntimeError: Given groups=1, weight of size 64 3 7 7 7, expected input[1, 74, 229, 229, 9] to have 3 channels, but got 74 channels instead
Does it mean the input should be in the order [1, channels, height, width, time]? Thank you.

destousok commented 4 years ago

> Solved it! Thanks for your help~

Hey @yuanmu97! Did you get the same results as the reported evaluation results you mentioned? Even with the transpose, my results are a bit different.
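
When comparing against the reference listing, it may help to compare both the ranking and the softmax probabilities rather than raw logits alone, since the reference evaluate_sample.py prints softmax outputs next to the logits. The sketch below uses only the standard library; the helper names are hypothetical, the first list holds three logits copied from the reference output in this thread, and the second list is made-up data standing in for a slightly different run.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k):
    """Return (index, probability, logit) for the k largest logits."""
    probs = softmax(logits)
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    return [(i, probs[i], logits[i]) for i in order]


def same_ranking(a, b, k):
    """True if both logit lists rank the same k indices in the same order."""
    return [i for i, _, _ in top_k(a, k)] == [i for i, _, _ in top_k(b, k)]


# Three logits from the reference output above (top three classes only).
reference = [41.813683, 21.493988, 20.134106]
# Hypothetical logits from another run; made-up values for illustration.
mine = [41.80, 21.51, 20.12]

print(top_k(reference, 3))
print(same_ranking(reference, mine, 3))
```

Small logit differences that keep the same ranking are a different situation from the scrambled-input case earlier in the thread, where the top class itself changed.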

jun0wanan commented 3 years ago

> @openyuanmu may I know how you reshaped the input? My features are in shape: (1, 74, 224, 224, 3) I get: RuntimeError: Given groups=1, weight of size 64 3 7 7 7, expected input[1, 74, 229, 229, 9] to have 3 channels, but got 74 channels instead Does it mean it should be in the order of: [1,channels,height,width,time] ? Thank you.

Hi, have you solved this problem?

jun0wanan commented 3 years ago

> The reference results are from https://github.com/deepmind/kinetics-i3d/blob/master/evaluate_sample.py

Hi, can you tell me how you solved it? I really hope to get your reply.