mahmoodlab / HIPT

Hierarchical Image Pyramid Transformer - CVPR 2022 (Oral)

Got different results when running the example "Using the HIPT_4K API" two times #56

Closed by weitinging 1 year ago

weitinging commented 1 year ago

When I run the "Using the HIPT_4K API" demo twice, I get two different results. This means that the features extracted from the same image (HIPT_4K/image_demo/image_4k.png) differ between runs. Why does this happen?

The output of the first run:

tensor([[ 1.0272, -1.9435, 0.6787, -2.0988, -0.8154, 0.5248, 0.8696, -0.8446, 0.1395, 0.8047, 0.1398, -0.4195, 0.3957, -1.2475, 0.1240, 0.3093, 0.2297, -0.4640, -0.1228, 0.2283, 1.4605, -0.2860, 1.0855, -0.9648, 0.6011, -0.1053, -0.5265, -0.1709, -0.2928, 0.1758, 0.4308, -1.6282, 0.4507, 0.5317, 0.7081, 1.9234, 1.6555, -1.7919, 0.6333, 1.1621, -0.5198, 0.5400, 0.2892, 0.5696, 1.5454, 0.0902, 1.2077, 0.7093, 0.4484, -1.6740, 0.2767, 1.5463, -0.5062, -1.4946, -0.4388, -1.3272, -1.6025, -1.4270, -0.2920, -0.9610, -0.1785, -1.0690, -0.0909, 1.1373, 2.3152, 0.0128, -0.2553, -0.7900, 0.1847, -1.6724, 0.2292, -0.2205, 1.6045, 1.1768, -0.9727, 0.0040, 0.8964, -0.6493, -0.9621, 0.6649, -2.0764, 0.0941, 0.1114, 0.0874, 1.5188, -0.5708, -0.5191, 0.6575, -0.7133, 1.2263, 0.1106, -0.0203, -0.0630, 0.9492, 1.0342, 0.9832, -0.6675, -1.9364, 0.8946, -0.7704, -0.7437, 1.0926, 1.2189, 2.8511, -0.2247, 2.0800, -0.8410, -1.7034, 1.8220, -1.0245, -0.6447, 0.0150, 0.1905, 0.8309, -0.0789, -0.5855, 0.3611, -0.4197, -1.1567, 1.6380, 1.0924, 0.8179, 0.2739, 0.3366, 1.8718, -1.4705, -0.1677, 0.0622, 0.3796, -0.3879, 0.0889, -0.7625, -1.5185, -0.3900, 0.0644, -0.8013, 0.7280, -2.2695, -0.2772, -0.1249, -0.7562, 0.1551, 1.2951, 1.4519, -0.2379, 0.0507, 0.0469, 0.9950, -0.7161, -0.5830, -0.0865, 0.1499, 1.6882, -1.0193, -0.1839, 2.0649, 1.1826, -1.4405, -0.3786, -0.3022, -0.0145, -0.1275, 1.5060, 0.0552, -0.9059, -0.4446, 0.5374, -0.7501, -1.3557, 0.9116, -1.3676, 1.0581, -0.7810, -0.4877, 0.7617, 0.0312, -1.3845, 0.1594, -0.9211, 0.4436, 0.5221, 0.1048, -0.0579, 1.9291, -0.3553, -1.7979, 0.2487, 0.8929, -2.7074, -1.5217, 0.0345, -1.0574]], device='cuda:0')

The output of the second run:

tensor([[-0.8617, 1.2333, -0.3320, 0.1566, 0.3043, -0.1035, -1.7042, -1.0360,
0.0841, 1.2058, 0.6106, 0.5801, -1.1089, -0.1767, 0.7558, 0.8873, -0.5098, -0.6752, -0.3064, 0.1744, 1.3652, -1.3844, -0.5107, 1.0535, -1.5274, 0.8271, 0.7165, -0.6407, 0.5759, 0.6355, -1.1873, 0.4897, 1.3801, -0.2763, -0.4875, -1.4082, 1.1015, 0.9396, 1.0487, -0.0815, 1.0360, 2.4232, 0.5266, 1.2903, 0.9402, -0.9511, 0.4216, -1.0932, 1.1696, 1.1283, -0.5590, 0.5287, -1.1801, -0.2546, 0.0378, -0.2582, -1.7058, -1.7925, 2.3835, -0.2244, -0.7829, -0.5465, 1.1077, -0.7507, 0.9361, 0.2205, 0.1639, 0.4639, 0.1691, -0.1604, 1.3497, -0.8076, 0.6755, 0.1011, -1.6349, -0.0602, -0.9681, 1.3290, 0.4226, 0.7562, -1.4071, -0.7310, -1.1855, 2.0997, 1.2576, -1.2203, 0.0806, 0.0747, 0.5141, -0.6748, 1.0230, 0.0330, -0.0424, 0.2749, 0.9671, 0.6085, 0.2992, -1.6514, -0.7195, 2.0988, -1.3269, 1.8263, -0.3118, -0.4210, -1.1496, -0.3970, -0.4480, 1.5925, 1.1100, -0.5222, 0.9687, -1.2781, -1.3437, -1.2233, 1.8803, -0.6620, -0.3003, -0.5592, -0.8920, -0.5249, -1.0030, -1.0557, -0.3928, 0.4380, 2.4794, 1.7709, 1.7506, -0.9539, 0.7304, -1.2020, -0.1607, -0.6108, -0.5856, 0.6002, -0.5986, -1.9799, 0.9542, -0.9195, 2.3000, 0.7303, 0.1312, -2.5691, 0.4511, -1.2753, -0.3263, 0.4889, 0.4481, -2.3728, 1.1700, -0.4633, 0.5822, -0.1131, -1.8267, 0.3590, 1.0784, -0.5423, -0.7253, 0.4574, -0.0575, 1.4478, -0.1208, -0.4602, -1.0904, -0.8529, -1.7636, 0.9147, -0.1743, 0.8406, 0.6429, 0.9972, 0.5170, 1.1177, 0.3966, -1.3500, -0.4085, -0.0680, 0.2702, -1.0105, -0.2218, -0.3109, 0.2562, -1.3499, -0.5872, 0.5333, -0.8533, -0.0155, 0.2798, 0.9288, -1.2377, -0.1704, 0.7664, -0.4247]], device='cuda:0')

Richarizardd commented 1 year ago

Hi @weitinging - can you check that you loaded the model weights correctly?

weitinging commented 1 year ago

Thank you for your suggestion! The model weights didn't load correctly. Thank you!
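
Note for later readers: when the weights fail to load, the model keeps its random initialization, which would explain why the two runs above produced unrelated feature vectors. One possible cause (an assumption, not confirmed in this thread) is a wrong or incomplete checkpoint path. A minimal sketch of a pre-flight check follows. The helper name `find_missing_checkpoints` is hypothetical, the file names are taken from the log output quoted later in this thread, and the directory passed in is an assumption that should be adjusted to the local setup.

```python
from pathlib import Path

# File names as they appear in the demo's log output in this thread.
EXPECTED_CHECKPOINTS = ("vit256_small_dino.pth", "vit4k_xs_dino.pth")


def find_missing_checkpoints(checkpoint_dir, names=EXPECTED_CHECKPOINTS):
    """Return the names of checkpoint files that are absent or empty.

    Hypothetical helper, not part of the HIPT code base.
    """
    root = Path(checkpoint_dir)
    missing = []
    for name in names:
        path = root / name
        # A zero-byte file usually indicates a failed or partial download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed location relative to the repository root; adjust as needed.
    missing = find_missing_checkpoints("HIPT_4K/Checkpoints")
    if missing:
        print("Missing or empty checkpoints:", ", ".join(missing))
    else:
        print("All expected checkpoint files are present.")
```

Running a check like this before constructing the model makes a path problem visible immediately, rather than showing up later as run-to-run differences in the features.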

Qing1Zhong commented 1 year ago

The correct output is:

tensor([[ 0.8896, -2.1130,  0.4011,  1.9388,  2.2679, -0.2919, -2.8318,  3.3083,
         -2.5549, -1.0718,  2.4532,  0.3009, -2.7087,  1.0475,  0.4862, -0.9086,
         -0.6283, -1.4109, -1.7757, -0.4216, -2.2767, -0.0307,  3.0037, -0.7022,
         -2.2229, -2.5973,  4.2466, -2.3519,  1.0857, -0.5460, -2.3129,  2.3446,
         -3.0198, -2.6937,  1.9349, -0.4484,  0.0817, -0.6997, -0.2162, -0.9967,
          2.8000, -5.1581,  2.1064, -0.2916,  1.1988,  0.3805,  4.4717,  4.1056,
          0.2514, -1.3006, -4.8284, -0.1595,  1.9322,  1.7319, -6.2168, -3.1303,
         -2.8676,  3.3709,  0.2881, -2.3995,  2.5332,  1.3674, -0.2954, -0.7680,
          2.0302,  1.5359,  1.7415,  3.3354,  2.2949,  1.5521, -0.8878, -1.7468,
         -2.7638, -2.0117, -0.6663, -3.2415, -0.0986, -0.0882,  2.2837,  4.6560,
         -0.2273,  1.5000,  6.0354,  2.5131, -0.9898, -4.1885, -2.4678,  0.2505,
         -1.7514,  0.2658,  1.1252, -3.6720, -1.4542, -3.0706, -0.1247,  2.9184,
          3.1167,  2.7436, -5.0488, -0.0320, -0.1860, -0.1426,  0.3627,  0.0395,
         -0.6932,  0.4822,  1.9542, -1.8605,  1.4506,  1.4573,  1.0065,  1.4675,
          2.7202,  3.6223,  0.3807,  0.4021, -3.9630,  0.0964, -1.6146, -1.2587,
         -1.3076,  1.1944, -1.5511,  4.7308, -0.0444,  4.8179,  2.0997,  1.1377,
          1.3265,  0.2599,  2.8389, -4.3338,  4.6401,  1.6044,  2.7752,  2.9454,
         -1.8182, -2.2276, -1.6382, -1.5304,  2.0726, -0.6284,  0.8725, -4.1951,
          1.2878, -0.0490, -1.1738,  0.3888,  1.4261,  1.8519,  3.9931,  1.1734,
         -2.4811,  0.5972, -3.3668, -0.0365, -2.2376,  3.1537,  3.0984,  2.0863,
         -1.1236, -0.7329, -0.9192, -3.4123, -1.0592, -1.0717, -2.1983, -3.0891,
         -0.2500, -3.1052, -0.3217, -0.0544,  6.5555,  3.3587, -2.7746, -2.2714,
          2.2318, -2.9227,  2.5831, -4.2082,  2.9219, -0.4439, -2.7881,  0.6900,
         -0.7225, -3.2197,  0.5538, -0.5984,  0.9696, -2.2826, -0.3154,  2.4052]],
       device='cuda:5')

?

And I noticed that there is some extra information after running the demo. Is this normal?

Take key teacher in provided checkpoint dict
Pretrained weights found at /home/xx/HIPT/HIPT_4K/Checkpoints/vit256_small_dino.pth and loaded with msg: _IncompatibleKeys(missing_keys=[], unexpected_keys=['head.mlp.0.weight', 'head.mlp.0.bias', 'head.mlp.2.weight', 'head.mlp.2.bias', 'head.mlp.4.weight', 'head.mlp.4.bias', 'head.last_layer.weight_g', 'head.last_layer.weight_v'])
# of Patches: 196
Take key teacher in provided checkpoint dict
Pretrained weights found at /home/xx/HIPT/HIPT_4K/Checkpoints/vit4k_xs_dino.pth and loaded with msg: _IncompatibleKeys(missing_keys=[], unexpected_keys=['head.mlp.0.weight', 'head.mlp.0.bias', 'head.mlp.2.weight', 'head.mlp.2.bias', 'head.mlp.4.weight', 'head.mlp.4.bias', 'head.last_layer.weight_g', 'head.last_layer.weight_v'])

@Richarizardd @weitinging Looking forward to your reply!🥺🥺🥺🥺🥺🥺

Richarizardd commented 1 year ago

Hi @Qing1Zhong - confirming that this is the answer!

Regarding the extra information - yes, this is normal. The unexpected keys correspond to the weights of the head (FC layers) used in DINO, and can be safely ignored for feature extraction.
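
Note for readers: the log message above has the form PyTorch reports for a non-strict `load_state_dict` call, listing `missing_keys` and `unexpected_keys`. A small sketch of how one might check such a result automatically is given below. The helper `head_keys_only` is hypothetical, the `head.` prefix rule is an assumption based on the keys in the log above, and the key `blocks.0.attn.qkv.weight` is an invented example of a backbone weight used only to show the failing case.

```python
def head_keys_only(missing_keys, unexpected_keys, ignorable_prefix="head."):
    """Return True if no weight is missing and every skipped key
    starts with ignorable_prefix.

    Hypothetical helper, not part of the HIPT code base.
    """
    if missing_keys:
        # A missing key means part of the model kept its random initialization.
        return False
    return all(key.startswith(ignorable_prefix) for key in unexpected_keys)


# Keys copied from the log message quoted in this thread.
unexpected = [
    "head.mlp.0.weight", "head.mlp.0.bias",
    "head.mlp.2.weight", "head.mlp.2.bias",
    "head.mlp.4.weight", "head.mlp.4.bias",
    "head.last_layer.weight_g", "head.last_layer.weight_v",
]

print(head_keys_only([], unexpected))  # True
# Invented backbone key, for illustration only.
print(head_keys_only(["blocks.0.attn.qkv.weight"], unexpected))  # False
```

An empty `missing_keys` list is the important part: it indicates that every parameter the feature extractor needs was found in the checkpoint.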

Qing1Zhong commented 1 year ago

Thank you sooooo much for your prompt and detailed response!!! @Richarizardd I believe your project holds significant value for the advancement of this field, and it has provided me with numerous opportunities to learn and deepen my understanding of the related technologies. I will continue to follow your project and look forward to your future breakthroughs. ☺☺☺

scjjb commented 1 year ago

I've had a lot of issues with getting the feature extractor working perfectly, and most of the time I didn't realise it wasn't working. I think it would be worth having a very basic file that runs this test and checks that the output matches the expected output. (I will try to get round to making this a pull request after MICCAI.)
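
A minimal sketch of what such a check could look like, written in plain Python so it runs without a GPU. The helper `features_match` and the tolerance of `1e-2` are assumptions, not something agreed in this thread. The reference values are the first eight entries of the output confirmed above, and `bad_run` holds the first eight entries of the first run posted at the top of the issue.

```python
import math

# First eight values of the output confirmed as correct in this thread.
EXPECTED_HEAD = [0.8896, -2.1130, 0.4011, 1.9388, 2.2679, -0.2919, -2.8318, 3.3083]


def features_match(actual, expected=EXPECTED_HEAD, atol=1e-2):
    """Check that the leading values of actual lie within atol of expected.

    Hypothetical helper; atol is an assumed tolerance.
    """
    if len(actual) < len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=0.0, abs_tol=atol)
        for a, e in zip(actual, expected)
    )


# Leading values of the first run reported at the top of this issue,
# where the weights had not loaded correctly.
bad_run = [1.0272, -1.9435, 0.6787, -2.0988, -0.8154, 0.5248, 0.8696, -0.8446]

print(features_match(EXPECTED_HEAD))  # True
print(features_match(bad_run))        # False
```

With a PyTorch tensor `out` of shape `[1, 192]`, the leading values could be obtained with `out[0, :8].cpu().tolist()` before calling the check; `torch.allclose` offers a comparable tolerance-based comparison directly on tensors.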

scjjb commented 1 year ago

I am working across two computers. On one I get exactly the same "out" tensor as is listed here; on the other I get a very slightly different version. Any idea what's going on? I've tried re-downloading the checkpoints, so it shouldn't be an error there.

out: tensor([[ 0.8901, -2.1130, 0.4010, 1.9391, 2.2683, -0.2924, -2.8322, 3.3090, -2.5551, -1.0719, 2.4530, 0.3001, -2.7085, 1.0476, 0.4853, -0.9078, -0.6284, -1.4104, -1.7757, -0.4216, -2.2769, -0.0306, 3.0028, -0.7019, -2.2226, -2.5967, 4.2470, -2.3519, 1.0854, -0.5461, -2.3131, 2.3450, -3.0195, -2.6944, 1.9347, -0.4483, 0.0819, -0.6999, -0.2159, -0.9963, 2.7993, -5.1584, 2.1064, -0.2914, 1.1986, 0.3800, 4.4713, 4.1057, 0.2512, -1.3006, -4.8280, -0.1597, 1.9322, 1.7321, -6.2177, -3.1316, -2.8671, 3.3699, 0.2884, -2.3999, 2.5323, 1.3677, -0.2951, -0.7687, 2.0298, 1.5351, 1.7419, 3.3338, 2.2950, 1.5514, -0.8879, -1.7470, -2.7638, -2.0114, -0.6670, -3.2429, -0.0984, -0.0868, 2.2835, 4.6557, -0.2279, 1.5002, 6.0360, 2.5131, -0.9906, -4.1887, -2.4668, 0.2492, -1.7504, 0.2661, 1.1259, -3.6726, -1.4538, -3.0711, -0.1249, 2.9185, 3.1158, 2.7438, -5.0496, -0.0321, -0.1857, -0.1431, 0.3632, 0.0397, -0.6929, 0.4820, 1.9547, -1.8596, 1.4506, 1.4571, 1.0061, 1.4691, 2.7191, 3.6235, 0.3801, 0.4017, -3.9636, 0.0968, -1.6141, -1.2581, -1.3062, 1.1943, -1.5505, 4.7309, -0.0438, 4.8188, 2.0999, 1.1389, 1.3273, 0.2594, 2.8391, -4.3342, 4.6405, 1.6041, 2.7758, 2.9466, -1.8180, -2.2278, -1.6380, -1.5310, 2.0725, -0.6281, 0.8723, -4.1952, 1.2864, -0.0483, -1.1732, 0.3888, 1.4260, 1.8522, 3.9937, 1.1726, -2.4812, 0.5963, -3.3662, -0.0369, -2.2383, 3.1534, 3.0978, 2.0859, -1.1231, -0.7329, -0.9191, -3.4114, -1.0590, -1.0714, -2.1985, -3.0889, -0.2495, -3.1038, -0.3221, -0.0546, 6.5553, 3.3586, -2.7741, -2.2713, 2.2310, -2.9220, 2.5824, -4.2078, 2.9219, -0.4442, -2.7880, 0.6890, -0.7224, -3.2200, 0.5538, -0.5983, 0.9700, -2.2821, -0.3150, 2.4059]], device='cuda:0')
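
Note for readers: the size of the discrepancy can be quantified rather than judged by eye. The sketch below compares the first eight values of the confirmed output with the first eight values of the output above. The helper `max_abs_diff` is hypothetical. Differences of this magnitude are often attributed to floating-point behaviour varying across GPUs, drivers, or library versions, but that explanation is an assumption here, since the thread does not establish the cause.

```python
# First eight values of the output confirmed earlier in this thread.
reference = [0.8896, -2.1130, 0.4011, 1.9388, 2.2679, -0.2919, -2.8318, 3.3083]
# First eight values of the output from the second device, posted above.
other_device = [0.8901, -2.1130, 0.4010, 1.9391, 2.2683, -0.2924, -2.8322, 3.3090]


def max_abs_diff(a, b):
    """Return the largest element-wise absolute difference between a and b.

    Hypothetical helper for illustration.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


diff = max_abs_diff(reference, other_device)
print(round(diff, 4))  # 0.0007
```

A gap below one thousandth on values of order one is far smaller than the gap between a loaded and an unloaded model, where the leading values posted at the top of this issue do not resemble the reference at all.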