qiaoyu1002 opened 2 years ago
Dear authors,

Could you please check the log below and tell me where the problem is? My program gets stuck at the point marked in the log (the `transpose` kernel launch with `xyswap`).
arguments: 1 0x7ffede2239ec 0.010 0.10 0 58 128 3 1234 1 1 3 0x555972b71bb0 0x5559787dcfe0 0x555953d49250 (nil)
reassignments threshold: 0
yinyang groups: 0
[0] dest: 0x7f751b200000 - 0x7f751b207400 (29696)
[0] device_centroids: 0x7f751b207400 - 0x7f751b207a00 (1536)
[0] device_assignments: 0x7f751b207a00 - 0x7f751b207ae8 (232)
[0] device_assignments_prev: 0x7f751b207c00 - 0x7f751b207ce8 (232)
[0] device_ccounts: 0x7f751b207e00 - 0x7f751b207e0c (12)
GPU #0 memory: used 1662779392 bytes (3.3%), free 49387536384 bytes, total 51050315776 bytes
GPU #0 has 49152 bytes of shared memory per block
transposing the samples...
transpose <<<(4, 2), (32, 8)>>> 58, 128
performing kmeans++...
kmeans++: dump 58 128 0x555973d8eab0
kmeans++: dev #0: 0x7f751b200000 0x7f751b207400 0x7f751b207a00
step 1
[0] dev_dists: 0x7f751b208000 - 0x7f751b208040 (64)
step 2
[0] dev_dists: 0x7f751b208000 - 0x7f751b208040 (64)
done
too few clusters for this yinyang_t => Lloyd
plans: [(0, 58)]
planc: [(0, 3)]
iteration 1: 58 reassignments
iteration 2: 5 reassignments
iteration 3: 3 reassignments
iteration 4: 3 reassignments
iteration 5: 3 reassignments
iteration 6: 1 reassignments
iteration 7: 4 reassignments
iteration 8: 8 reassignments
iteration 9: 3 reassignments
iteration 10: 4 reassignments
iteration 11: 2 reassignments
iteration 12: 1 reassignments
iteration 13: 1 reassignments
iteration 14: 0 reassignments
return kmcudaSuccess
arguments: 1 0x7ffede2239ec 0.010 0.10 0 87 128 3 1234 1 1 3 0x555977d4d8c0 0x5559778afe80 0x555953dbc9d0 (nil)
reassignments threshold: 0
yinyang groups: 0
[0] dest: 0x7f751b200000 - 0x7f751b20ae00 (44544)
[0] device_centroids: 0x7f751b20ae00 - 0x7f751b20b400 (1536)
[0] device_assignments: 0x7f751b20b400 - 0x7f751b20b55c (348)
[0] device_assignments_prev: 0x7f751b20b600 - 0x7f751b20b75c (348)
[0] device_ccounts: 0x7f751b20b800 - 0x7f751b20b80c (12)
GPU #0 memory: used 1660682240 bytes (3.3%), free 49389633536 bytes, total 51050315776 bytes
GPU #0 has 49152 bytes of shared memory per block
transposing the samples...
transpose <<<(4, 3), (32, 8)>>> 87, 128
performing kmeans++...
kmeans++: dump 87 128 0x5559540f44c0
kmeans++: dev #0: 0x7f751b200000 0x7f751b20ae00 0x7f751b20b400
step 1
[0] dev_dists: 0x7f751b20ba00 - 0x7f751b20ba40 (64)
step 2
[0] dev_dists: 0x7f751b20ba00 - 0x7f751b20ba40 (64)
done
too few clusters for this yinyang_t => Lloyd
plans: [(0, 87)]
planc: [(0, 3)]
iteration 1: 87 reassignments
iteration 2: 9 reassignments
iteration 3: 5 reassignments
iteration 4: 7 reassignments
iteration 5: 3 reassignments
iteration 6: 3 reassignments
iteration 7: 1 reassignments
iteration 8: 3 reassignments
iteration 9: 1 reassignments
iteration 10: 4 reassignments
iteration 11: 3 reassignments
iteration 12: 2 reassignments
iteration 13: 0 reassignments
return kmcudaSuccess
arguments: 1 0x7ffede2239ec 0.010 0.10 0 25 128 3 1234 1 1 3 0x555978938ac0 0x555978947750 0x5559540c4d30 (nil)
reassignments threshold: 0
yinyang groups: 0
[0] dest: 0x7f751b200000 - 0x7f751b203200 (12800)
[0] device_centroids: 0x7f751b203200 - 0x7f751b203800 (1536)
[0] device_assignments: 0x7f751b203800 - 0x7f751b203864 (100)
[0] device_assignments_prev: 0x7f751b203a00 - 0x7f751b203a64 (100)
[0] device_ccounts: 0x7f751b203c00 - 0x7f751b203c0c (12)
GPU #0 memory: used 1658585088 bytes (3.2%), free 49391730688 bytes, total 51050315776 bytes
GPU #0 has 49152 bytes of shared memory per block
transposing the samples...
transpose <<<(4, 1), (32, 8)>>> 25, 128
performing kmeans++...
kmeans++: dump 25 128 0x555973d97f80
kmeans++: dev #0: 0x7f751b200000 0x7f751b203200 0x7f751b203800
step 1
[0] dev_dists: 0x7f751b203e00 - 0x7f751b203e40 (64)
step 2
[0] dev_dists: 0x7f751b203e00 - 0x7f751b203e40 (64)
done
too few clusters for this yinyang_t => Lloyd
plans: [(0, 25)]
planc: [(0, 3)]
iteration 1: 25 reassignments
iteration 2: 2 reassignments
iteration 3: 3 reassignments
iteration 4: 1 reassignments
iteration 5: 1 reassignments
iteration 6: 2 reassignments
iteration 7: 1 reassignments
iteration 8: 0 reassignments
return kmcudaSuccess
arguments: 1 0x7ffede2239ec 0.010 0.10 0 4 128 3 1234 1 1 3 0x5559771795d0 0x555978956050 0x555895ca28c0 (nil)
reassignments threshold: 0
yinyang groups: 0
[0] dest: 0x7f751b200000 - 0x7f751b200800 (2048)
[0] device_centroids: 0x7f751b200800 - 0x7f751b200e00 (1536)
[0] device_assignments: 0x7f751b200e00 - 0x7f751b200e10 (16)
[0] device_assignments_prev: 0x7f751b201000 - 0x7f751b201010 (16)
[0] device_ccounts: 0x7f751b201200 - 0x7f751b20120c (12)
GPU #0 memory: used 1656487936 bytes (3.2%), free 49393827840 bytes, total 51050315776 bytes
GPU #0 has 49152 bytes of shared memory per block
transposing the samples...
transpose <<<(4, 1), (32, 8)>>> 4, 128
performing kmeans++...
kmeans++: dump 4 128 0x555954856b10
kmeans++: dev #0: 0x7f751b200000 0x7f751b200800 0x7f751b200e00
step 1
[0] dev_dists: 0x7f751b201400 - 0x7f751b201440 (64)
step 2
[0] dev_dists: 0x7f751b201400 - 0x7f751b201440 (64)
done
too few clusters for this yinyang_t => Lloyd
plans: [(0, 4)]
planc: [(0, 3)]
iteration 1: 4 reassignments
iteration 2: 0 reassignments
return kmcudaSuccess
 4%|████▌     | 2/50 [00:25<11:21, 14.20s/it]
| Global Training Round : 3 |
arguments: 1 0x7ffede2239ec 0.010 0.10 0 232 128 3 1234 1 1 3 0x5559736ccd30 0x555977a18c90 0x555954115a90 (nil)
reassignments threshold: 2
yinyang groups: 0
[0] *dest: 0x7f751b200000 - 0x7f751b21d000 (118784)
[0] device_centroids: 0x7f751b21d000 - 0x7f751b21d600 (1536)
[0] device_assignments: 0x7f751b21d600 - 0x7f751b21d9a0 (928)
[0] device_assignments_prev: 0x7f751b21da00 - 0x7f751b21dda0 (928)
[0] device_ccounts: 0x7f751b21de00 - 0x7f751b21de0c (12)
GPU #0 memory: used 1654390784 bytes (3.2%), free 49395924992 bytes, total 51050315776 bytes
GPU #0 has 49152 bytes of shared memory per block
transposing the samples...
transpose <<<(8, 4), (8, 32)>>> 232, 128, xyswap   =================> stuck here
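For reference, the clustering runs above go through the libKMCUDA Python binding inside a training loop (50 global rounds, per the progress bar). Below is a minimal sketch of the call pattern, not my exact code: `sample_batch` is a random stand-in for my real data, while the shapes and keyword values are taken from the `arguments:` line of the hanging call (232 samples x 128 features, 3 clusters, tolerance 0.01, yinyang_t 0.1, seed 1234).

```python
import numpy as np
from libKMCUDA import kmeans_cuda

# Stand-in for one batch of real samples; the hanging call processes
# 232 samples x 128 features (see the log above). kmcuda requires float32.
sample_batch = np.random.rand(232, 128).astype(np.float32)

# Keyword values mirror the "arguments:" line of the stuck invocation.
centroids, assignments = kmeans_cuda(
    sample_batch,
    3,               # number of clusters
    tolerance=0.01,
    yinyang_t=0.1,
    seed=1234,
    verbosity=2,     # debug-level trace, producing output like the log above
)
```

The four earlier calls with the same pattern (58, 87, 25, and 4 samples) all return `kmcudaSuccess`; only the 232-sample call, the first with a nonzero reassignments threshold and the `xyswap` transpose layout, never gets past the transpose kernel.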
Thanks in advance for your analysis. If you need any more details, please let me know.