yuhuixu1993 / PC-DARTS

PC-DARTS: Partial Channel Connections for Memory-Efficient Differentiable Architecture Search

search accuracy for imagenet #29

Closed kunalmessi10 closed 4 years ago

kunalmessi10 commented 4 years ago

I am getting 31% validation accuracy after searching directly on ImageNet. Is that the same as what you got? If not, can you tell me what validation accuracy I should expect at the end of train_search_imagenet?

yuhuixu1993 commented 4 years ago

@kunalmessi10, it is almost the same. Accuracy during search is not that important; you may evaluate the searched architecture.

kunalmessi10 commented 4 years ago

By "same" do you mean the same as 31%?

yuhuixu1993 commented 4 years ago

@kunalmessi10, yes, though not exactly the same.

kunalmessi10 commented 4 years ago

Okay, close enough is fine with me. Is the searched architecture for ImageNet that you report in your paper based on random sampling of channels or on channel shuffle in train_search_imagenet.py? Also, the architecture the search output didn't have any skip connections; any particular reason for that?

kunalmessi10 commented 4 years ago

Also, one more question: is it possible to search on a subset of ImageNet that has fewer classes but the same number of images per class?

yuhuixu1993 commented 4 years ago

@kunalmessi10, hi, the result is based on channel shuffle. Such a situation (no skip_connect in the searched cell) sometimes happens. You can use the skip-connect regulation from P-DARTS or run the search more times; we are also working on solving this problem. Yes, I think it is OK to search on fewer classes.
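
For context on the channel shuffle mentioned above: PC-DARTS sends only 1/K of the channels through the mixed operations, concatenates the result with the untouched channels, and then shuffles the channel order so different channels get sampled across iterations. A minimal numpy sketch of the shuffle step itself (shapes and the group count here are illustrative, not the repo's code):

```python
import numpy as np

def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave channels across groups (ShuffleNet-style).

    x has shape (N, C, H, W); C must be divisible by `groups`.
    """
    n, c, h, w = x.shape
    assert c % groups == 0
    # (N, groups, C//groups, H, W) -> swap the two group axes -> flatten back
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(n, c, h, w)

# Toy example: 4 channels, 2 groups -> channel order 0,1,2,3 becomes 0,2,1,3
x = np.arange(4, dtype=float).reshape(1, 4, 1, 1)
shuffled = channel_shuffle(x, groups=2)
```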

kunalmessi10 commented 4 years ago

Okay, thanks!

kunalmessi10 commented 4 years ago

I'm guessing P-DARTS is also from your research group; can you point me to the skip-connect regulation part in that paper's code?

yuhuixu1993 commented 4 years ago

@kunalmessi10, see https://github.com/chenxin061/pdarts/blob/05addf3489b26edcf004fc4005bbc110b56e0075/train_search.py#L407, though that part of the code cannot be used directly. In the P-DARTS situation there are too many skip_connects, so they delete the skips with the lowest probability. In our case it is easier: you can choose the two skip_connects with the largest probabilities among the chosen edges and use them to replace the original operations. For example, if the chosen edges are (1,0), (1,2), (1,3), (1,4) among these 8 edges, you choose the two largest skips and replace the corresponding operations. I will add this code in a future version.
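
A standalone sketch of that replacement rule (the edge list and skip_connect alpha values below are invented for illustration; real code would read them from the model's architecture parameters):

```python
# Among the chosen edges of a cell, take the two with the highest skip_connect
# probability and swap their operation for skip_connect.

def add_two_skips(edges, skip_probs):
    """edges: list of (op_name, input_node); skip_probs: the skip_connect
    alpha for each edge, in the same order. Returns a new edge list with the
    two highest-probability edges replaced by skip_connect."""
    ranked = sorted(range(len(edges)), key=lambda i: skip_probs[i], reverse=True)
    chosen = set(ranked[:2])
    return [('skip_connect', node) if i in chosen else (op, node)
            for i, (op, node) in enumerate(edges)]

normal = [('sep_conv_3x3', 1), ('sep_conv_3x3', 0),
          ('sep_conv_5x5', 1), ('sep_conv_5x5', 0)]
skip_alpha = [0.04, 0.02, 0.09, 0.03]  # made-up values
print(add_two_skips(normal, skip_alpha))
```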

kunalmessi10 commented 4 years ago

Does adding a drop path after skip connect, as described in P-DARTS, help?


yuhuixu1993 commented 4 years ago

> Does adding a drop path after skip connect, as described in P-DARTS, help?

I added skip connections just to follow the mobile setting (<600M FLOPs). Skip connections sometimes help the performance of the network, and a network without them may be harder to train.
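
For reference, drop path as used in DARTS-style training zeroes an entire path per sample with probability p and rescales the survivors so the expected value is unchanged. A numpy sketch (shapes and probabilities are illustrative):

```python
import numpy as np

def drop_path(x: np.ndarray, drop_prob: float, rng=None) -> np.ndarray:
    """Zero each sample's whole path with probability drop_prob and rescale
    survivors by 1/keep_prob so the expectation stays the same."""
    if drop_prob <= 0.0:
        return x
    rng = rng or np.random.default_rng()
    keep_prob = 1.0 - drop_prob
    # One Bernoulli mask per sample, broadcast over (C, H, W)
    mask = rng.random((x.shape[0], 1, 1, 1)) < keep_prob
    return x * mask / keep_prob

x = np.ones((8, 3, 2, 2))
out = drop_path(x, drop_prob=0.5, rng=np.random.default_rng(0))
# Each sample is either zeroed or scaled to 1/keep_prob = 2.0
```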

kunalmessi10 commented 4 years ago

Okay, thanks for the help. I'll try what you suggested.

yuhuixu1993 commented 4 years ago

@kunalmessi10, I hope you find architectures you are satisfied with. Could you please share the architecture you found? I just want to check that the code runs OK.

kunalmessi10 commented 4 years ago

genotype = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_5x5', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 3), ('sep_conv_3x3', 2), ('sep_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('dil_conv_5x5', 2), ('dil_conv_5x5', 3), ('sep_conv_3x3', 1), ('sep_conv_3x3', 4), ('sep_conv_5x5', 1)], reduce_concat=range(2, 6))

kunalmessi10 commented 4 years ago

This is the architecture I got.

normal_alphas ---> tensor([[0.0428, 0.1214, 0.0811, 0.0891, 0.2067, 0.1785, 0.1338, 0.1466], [0.0387, 0.1426, 0.0804, 0.1021, 0.1719, 0.1574, 0.1530, 0.1540], [0.0546, 0.1050, 0.0971, 0.0924, 0.1706, 0.1910, 0.1396, 0.1497], [0.0445, 0.1412, 0.1362, 0.1258, 0.1327, 0.1574, 0.1360, 0.1261], [0.0649, 0.1036, 0.1052, 0.0952, 0.1672, 0.1853, 0.1218, 0.1568], [0.0647, 0.1160, 0.1005, 0.0918, 0.1949, 0.1668, 0.1116, 0.1538], [0.0577, 0.1134, 0.0930, 0.1123, 0.1778, 0.1929, 0.1167, 0.1362], [0.0753, 0.0922, 0.0894, 0.1106, 0.1702, 0.1861, 0.1513, 0.1249], [0.0662, 0.0930, 0.0972, 0.0918, 0.1841, 0.1787, 0.1170, 0.1720], [0.0658, 0.1119, 0.0828, 0.0787, 0.2256, 0.2250, 0.1110, 0.0992], [0.0671, 0.1051, 0.0855, 0.0923, 0.1946, 0.1994, 0.1088, 0.1472], [0.0633, 0.1241, 0.1018, 0.0827, 0.2019, 0.1626, 0.1299, 0.1337], [0.0662, 0.1256, 0.0982, 0.0829, 0.1919, 0.1486, 0.1255, 0.1612], [0.0576, 0.1128, 0.0855, 0.0716, 0.1836, 0.1938, 0.1781, 0.1170]], device='cuda:0', grad_fn=)

reduce_alphas ---> tensor([[0.1187, 0.1201, 0.0942, 0.1089, 0.2002, 0.1300, 0.0951, 0.1328], [0.1623, 0.1293, 0.1215, 0.1230, 0.1330, 0.1203, 0.1079, 0.1027], [0.1003, 0.1067, 0.0917, 0.0978, 0.2270, 0.1542, 0.0797, 0.1427], [0.0974, 0.1663, 0.1186, 0.1336, 0.1378, 0.1487, 0.0811, 0.1165], [0.1370, 0.1187, 0.0962, 0.1174, 0.1279, 0.1356, 0.1009, 0.1663], [0.1018, 0.1252, 0.1185, 0.1088, 0.1425, 0.1502, 0.1512, 0.1018], [0.1073, 0.1334, 0.1000, 0.1373, 0.1797, 0.1576, 0.0918, 0.0929], [0.1223, 0.1598, 0.1124, 0.1403, 0.1138, 0.1180, 0.1129, 0.1204], [0.1164, 0.1137, 0.0890, 0.1190, 0.1332, 0.1016, 0.1291, 0.1980], [0.0977, 0.1114, 0.1015, 0.1171, 0.1864, 0.1777, 0.1089, 0.0993], [0.0909, 0.1153, 0.0887, 0.0997, 0.1779, 0.1837, 0.1229, 0.1209], [0.1042, 0.1514, 0.1091, 0.1285, 0.1254, 0.1521, 0.1111, 0.1183], [0.1044, 0.1176, 0.0911, 0.1128, 0.1139, 0.1846, 0.1261, 0.1495], [0.0951, 0.0986, 0.0768, 0.0992, 0.2042, 0.1756, 0.1103, 0.1402]], device='cuda:0', grad_fn=)

kunalmessi10 commented 4 years ago

From these alphas, can you tell what my final arch would be after applying the skip-connect trick you mentioned?
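
For context, a genotype is derived from such an alpha matrix the same way as in DARTS: the 14 rows are the edges (2+3+4+5 for intermediate nodes 2..5), the columns are assumed to follow the standard DARTS PRIMITIVES order, and each node keeps its two strongest incoming edges together with their best non-'none' op. A simplified sketch of that parsing (not the repo's exact code; the alpha values in the example are made up):

```python
# Simplified DARTS-style genotype parsing: for each intermediate node, rank its
# incoming edges by their strongest non-'none' op weight, keep the top two, and
# record (best_op, input_node_index) for each kept edge.

PRIMITIVES = ['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect',
              'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5']

def parse(alphas, steps=4):
    gene, start = [], 0
    for i in range(steps):
        n = 2 + i                          # node i+2 has 2+i incoming edges
        rows = alphas[start:start + n]

        def edge_strength(j):
            return max(w for op, w in zip(PRIMITIVES, rows[j]) if op != 'none')

        for j in sorted(range(n), key=edge_strength, reverse=True)[:2]:
            ops = [(w, op) for op, w in zip(PRIMITIVES, rows[j]) if op != 'none']
            gene.append((max(ops)[1], j))  # strongest non-'none' op on edge j
        start += n
    return gene

# Tiny made-up alpha matrix for the first two nodes (2 + 3 = 5 edges)
a = [[0, 0, 0, 0, 0.9, 0, 0, 0],
     [0, 0, 0, 0.8, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0.7, 0, 0],
     [0, 0, 0, 0, 0.1, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0.6, 0]]
print(parse(a, steps=2))
```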

yuhuixu1993 commented 4 years ago

@kunalmessi10, it seems ok, thanks.

kunalmessi10 commented 4 years ago

Can you tell me the new genotype after replacing with skip connects as you suggested? This architecture is training fine so far, though.

yuhuixu1993 commented 4 years ago

@kunalmessi10, you can check if it is right. Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('skip_connect', 1), ('sep_conv_5x5', 0), ('skip_connect', 1), ('sep_conv_3x3', 3), ('sep_conv_3x3', 2), ('sep_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('dil_conv_5x5', 2), ('dil_conv_5x5', 3), ('sep_conv_3x3', 1), ('sep_conv_3x3', 4), ('sep_conv_5x5', 1)], reduce_concat=range(2, 6))

kunalmessi10 commented 4 years ago

Should we not add skip connect in the reduction cell?

yuhuixu1993 commented 4 years ago

@kunalmessi10, I think the reduction cell is OK without skip connections.

kunalmessi10 commented 4 years ago

Okay, I'll try that.