Closed zhongpeixiang closed 4 years ago
Thanks for the interest.
resource.txt is the processed ConceptNet, following the CCM work. The conversation dataset is also based on the CCM work, extended to cover two-hop concepts and the relations between one-hop concepts.
The current code and data in this repo are temporary placeholders for the camera-ready version. We will release the full data and code soon.
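For readers unfamiliar with the terminology, here is a minimal sketch of what one-hop and two-hop concept expansion over a set of knowledge triples might look like. It is not the authors' preprocessing code: the toy triples, the function names (`build_neighbors`, `expand`, `relations_between`), and the choice to treat edges as undirected and to exclude the post's own concepts from the one-hop set are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy (head, relation, tail) triples. Hypothetical values, not taken from resource.txt.
TRIPLES = [
    ("battle", "RelatedTo", "fight"),
    ("battle", "IsA", "conflict"),
    ("fight", "RelatedTo", "conflict"),
    ("fight", "RelatedTo", "war"),
    ("conflict", "RelatedTo", "war"),
]


def build_neighbors(triples):
    """Map each concept to its neighbours, treating edges as undirected (assumption)."""
    neighbors = defaultdict(set)
    for head, _relation, tail in triples:
        neighbors[head].add(tail)
        neighbors[tail].add(head)
    return neighbors


def expand(post_tokens, neighbors):
    """Return (one_hop, only_two) concept sets for the tokens of a post."""
    seeds = {tok for tok in post_tokens if tok in neighbors}
    one_hop = set()
    for seed in seeds:
        one_hop |= neighbors[seed]
    one_hop -= seeds  # assumption: the post's own concepts are not counted as one-hop
    two_hop = set()
    for concept in one_hop:
        two_hop |= neighbors[concept]
    # Concepts reachable only at the second hop.
    only_two = two_hop - one_hop - seeds
    return one_hop, only_two


def relations_between(concepts, triples):
    """Triples whose head and tail both lie inside the given concept set."""
    return [t for t in triples if t[0] in concepts and t[2] in concepts]


if __name__ == "__main__":
    nbrs = build_neighbors(TRIPLES)
    one_hop, only_two = expand(["the", "battle"], nbrs)
    print(sorted(one_hop), sorted(only_two))
    print(relations_between(one_hop, TRIPLES))
```

In this toy graph, "war" is reachable from "battle" only through a one-hop concept, so it ends up in the second set, and the fight/conflict triple is a relation between two one-hop concepts.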
Hi, thanks for sharing the code!
Could you share some examples from trainset.txt? One or two items is all I need, so that I can build the data in the correct format myself. Thanks!
Thanks for sharing the code! I'm also interested in the dataset and in how it is processed from the CCM dataset into the one you're using now. So the question is: when will the processing steps and the finished dataset be made public? Thanks!
Thanks for the interest! Below is an example of one data point: {"all_entities_one_hop": [515, 3078, 21004, 4627, 21012, 15892, 19477, 6680, 3609, 6174, 6175, 14371, 21027, 12840, 7723, 1580, 14379, 7214, 17969, 22580, 4661, 17462, 15415, 69, 13384, 18507, 5709, 14928, 15441, 82, 2128, 84, 15449, 11865, 3679, 18017, 1637, 18023, 6253, 5243, 20603, 8317, 18049, 16003, 18052, 10888, 6281, 9361, 12946, 5265, 5270, 1688, 2712, 10907, 16042, 17582, 6318, 3249, 7859, 10423, 14522, 7357, 18628, 11463, 10954, 6349, 1741, 6353, 19669, 14550, 4824, 12514, 13027, 9956, 229, 14568, 9962, 18156, 14062, 18160, 11508, 19702, 4856, 7423, 6913, 7435, 7951, 14611, 16664, 1818, 9724, 11549, 22304, 5921, 5412, 21808, 21297, 13111, 11582, 20289, 14659, 15172, 11083, 5451, 17234, 6491, 12126, 7518, 12645, 15209, 12142, 5999, 7536, 14704, 2935, 19832, 5500, 15229, 19844, 8586, 398, 11152, 15259, 12705, 15779, 11171, 12218, 16826, 8124, 17342, 1471, 20928, 17345, 18368, 457, 8139, 2001, 8151, 8158, 11234, 13796, 20457, 7664, 2037, 4604], "post_triples": [1, 0, 0, 0, 0, 2, 3, 0, 0, 4, 0, 0, 0, 0, 5, 0, 6, 7, 0, 0, 0, 1, 0, 8, 0, 0, 0, 0, 0, 9, 0, 0, 10], "post": ["hh", "can", "be", "used", "to", "put", "pressure", "on", "your", "opponent", ".", "so", "not", "just", "pissing", "off", "stronger", "rivals", ",", "if", "your", "hh", "is", "strong", "enough", "you", "can", "make", "them", "weak", "in", "the", "battle"], "response": ["like", "taking", "a", "person", "who", "outclasses", "you", ",", "then", "bringing", "them", "down", "to", "your", "level", "?", "maybe", "that", "'s", "how", "he", "takes", "out", "mihawk", "?"], "match_response_index_one_hop": [-1, -1, -1, 12218, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1], "only_two": [5174, 12192, 3075, 5036, 17221, 2432, 742, 5651, 1807, 12094, 9752, 1086, 13846, 5800, 13031, 21730, 1704, 1917, 8809, 2575, 15904, 14904, 4607, 14505, 20754, 7547, 2422, 21235, 13684, 6391, 19705, 4999, 13335, 2238, 11640, 
7737, 10419, 13160, 8513, 13505, 7831, 701, 6647, 20650, 13639, 8875, 11583, 4515, 5013, 8428, 15338, 19557, 21002, 2192, 6314, 8928, 20643, 1038, 20303, 9881, 12529, 9907, 8545, 13861, 13385, 7037, 7537, 7284, 16900, 385, 5477, 7939, 3571, 12511, 10886, 19958, 9023, 15649, 8699, 6805, 1974, 21901, 19390, 20572, 20571, 4866, 15044, 4402, 14723, 21446, 19900, 14538, 13903, 18042, 6717, 22025, 15394, 6387, 12417, 2726], "match_response_index_only_two": [-1, 15394, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 4402, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1], "all_triples_one_hop": [30722, 63497, 34825, 20494, 22552, 75802, 4129, 16430, 51249, 100402, 114744, 79934, 84037, 55372, 24653, 45141, 18526, 14432, 96352, 22636, 43121, 41080, 102524, 92285, 84098, 96387, 82054, 43142, 67722, 88205, 41101, 2201, 108702, 116895, 110755, 49316, 104617, 14514, 189, 28862, 118980, 18628, 43210, 22731, 30926, 106703, 4313, 8410, 108770, 57571, 41186, 18667, 41197, 4335, 108784, 104689, 106749, 47363, 12549, 100615, 102664, 84232, 53522, 20756, 84245, 80154, 110890, 14635, 92462, 14638, 88366, 33076, 86329, 22844, 67907, 26948, 86340, 14668, 22875, 88414, 78180, 65896, 80233, 57704, 14696, 65900, 31084, 113004, 61809, 6519, 20859, 117121, 63873, 72067, 63880, 405, 86436, 104870, 39344, 108983, 55739, 115133, 92627, 92639, 68076, 59889, 100850, 61939, 100856, 43516, 109059, 84488, 12814, 88591, 59923, 119321, 55835, 43558, 23079, 86575, 49728, 45645, 66126, 113233, 8786, 66142, 80488, 51817, 55916, 113262, 57968, 633, 82555, 29319, 14984, 111246, 101009, 6803, 68243, 49819, 55964, 98975, 23201, 55971, 53927, 25263, 47810, 49859, 31437, 43726, 45779, 25302, 117469, 27358, 88802, 8934, 45799, 90856, 117481, 82666, 92911, 68337, 97021, 47875, 29445, 62213, 109317, 41753, 72479, 76576, 117539, 64293, 64296, 113452, 9009, 37681, 35636, 97076, 60215, 9016, 109369, 45886, 35646, 60223, 23361, 109377, 41795, 113478, 88906, 66383, 86866, 101202, 853, 47958, 11094, 11096, 115544, 35672, 858, 
56172, 97136, 29554, 54131, 25458, 95098, 62331, 27523, 103306, 2957, 9105, 117650, 43927, 76696, 109469, 115614, 50079, 35752, 64430, 84912, 66482, 97203, 58302, 54216, 25545, 97226, 50125, 46029, 99279, 109521, 58323, 58330, 93153, 56292, 29672, 1002, 35819, 41966, 113660, 113664, 3084, 7184, 70672, 103446, 37912, 46105, 3099, 1051, 23581, 5168, 82999, 68672, 33861, 7244, 101454, 56398, 50257, 119894, 101463, 44120, 58461, 13412, 70757, 3173, 21610, 56442, 19580, 60544, 89224, 11402, 7307, 11408, 7329, 48291, 74917, 19625, 27819, 36011, 103598, 111791, 64688, 29874, 93365, 60601, 19642, 105660, 113858, 115907, 1224, 38093, 101583, 117967, 5331, 70868, 27865, 29915, 48348, 56541, 5342, 79074, 70886, 97510, 81126, 15593, 27885, 7409, 19699, 93433, 64764, 25853, 38161, 15633, 32042, 64811, 81194, 91441, 13622, 89403, 56644, 77131, 25938, 11606, 109914, 58718, 109920, 120167, 87402, 15727, 30070, 40310, 62847, 40322, 97668, 5512, 109962, 7569, 26002, 17812, 58777, 21916, 50589, 112033, 73128, 81325, 56753, 112058, 3514, 95678, 48581, 107974, 42443, 52683, 99791, 3535, 9679, 42452, 17878, 52710, 95721, 89579, 36346, 15869, 62975, 3588, 58887, 95751, 77329, 44561, 9748, 93729, 30243, 77348, 85550, 81459, 67124, 97848, 63034, 9793, 17986, 7746, 38468, 87617, 34369, 75332, 71240, 40521, 77391, 13907, 9818, 38497, 11885, 5743, 46708, 87672, 116345, 28284, 89724, 30341, 46730, 11916, 104076, 3726, 97940, 63125, 75436, 30388, 54967, 89796, 28358, 85704, 42698, 61133, 79567, 118485, 52950, 18135, 20189, 52975, 1778, 67317, 28408, 55035, 83708, 75516, 26384, 20240, 75537, 104216, 104217, 104224, 44834, 85798, 73513, 57130, 112426, 48939, 48947, 44852, 83768, 94012, 57149, 104260, 34629, 73542, 108363, 112463, 22356, 104277, 42838, 112472, 81752, 30554, 61274, 48993, 98146, 16228, 120679, 69480, 5993, 104299, 3948, 71533, 36718, 51063, 87930, 30589, 94080, 63371, 22414, 110479, 18322, 20373, 26521, 1946, 112544, 32681, 10155, 24496, 38834, 96179, 116660, 92094, 34751, 83905, 
104387, 40901, 4042, 81868, 88012, 100305, 14291, 36823, 112600, 79833, 112608, 118754, 94180, 49126, 22504, 12265, 22511, 96246, 86008, 57339], "one_two_triple": [[], [120480, 115331, 39976, 99278, 63663, 15699, 47833, 90685], [59010, 60579, 107336, 56078, 114289, 31674, 83228], [117473], [94928, 105050, 34427, 120206], [100703, 53534, 56207], [], [74976, 3215, 67218, 27356, 42271], [76513, 22604, 40696, 111672, 108857, 27196], [100416, 77568, 32579, 22214, 69961, 5356, 82992, 110003, 105430, 14873, 45723], [93794, 9859, 79012, 88997, 66567, 9769, 65131, 13004, 28334, 68879, 111983, 371, 52985, 88121], [112226], [93953, 64611, 105028, 22851, 72913, 15956, 48437, 45914, 94523], [], [], [33604, 33447, 65897, 75083, 47537, 89107, 42836, 80564, 55703, 27545, 48378, 56572], [65283, 5704, 77483, 20621, 86355, 23962, 84123, 117567], [8135, 57832, 86252, 95186, 22782], [46241, 92801, 136, 120841, 110415, 37651, 113404], [73211, 86510], [85541, 32909, 18324, 87158, 5470], [113336], [81059, 108548, 4678, 19499, 80141, 47893, 82165, 33625, 26525, 24543], [57386, 94543], [81927, 50663, 72111, 68764, 86807, 95484, 22302], [78566, 106428, 42738, 107835, 11132], [], [117696, 3046, 84231, 100136, 111050, 3436, 16910, 1360, 87039], [90002, 33283], [15520, 81472, 68610, 30149, 103437, 108217, 106266], [82161, 83438], [62574], [71748], [79201, 71873, 116131, 71144, 29641, 105384, 14379, 70700, 37641, 117391, 12402, 72374, 40378, 21853, 76989], [37444], [30977, 8129, 12675, 65060, 60547, 34597, 29801, 99375, 113524, 81274, 98238], [108767], [60769, 54972], [15392, 107651, 13790], [], [108978, 53446], [], [98691, 52197, 110955, 110956, 21870, 41711, 109071, 70062, 63670], [], [102248, 24490, 73933, 8240, 33946, 64127], [36256, 38817, 41923, 13575, 113962, 103692, 86546, 92441, 22202, 66747], [23299, 68585, 56590, 52982, 60958], [89113], [28200, 39814], [], [116857, 78721, 64342, 6278], [75363, 100451, 48230, 39270, 7944, 15722, 73099, 36043, 23888, 43609, 93850, 57502], [44065, 
111842, 77922, 76485, 32301, 64464, 112531, 77399], [8542, 49443, 22630, 19692, 44461, 46096, 99710], [54304, 91877, 42089, 94218, 78100, 17368, 8797], [71106, 9886, 103621, 71406], [92389], [27421], [97372, 109613], [], [59649, 32961, 22980, 42021, 42983, 109738, 682, 36180, 118518, 20824, 104250, 106141, 111615], [76946], [21780], [114146, 108483, 14191, 90481, 1973, 6265], [43973, 78186, 17997, 104690, 55800, 21818], [22750, 60268, 42381, 106997, 6269, 42334], [62434, 50471, 86488, 85307, 86104], [91058, 118988, 53054], [81833, 105153, 90974, 71519], [4382, 113985, 34083, 72359, 29227, 111231, 76561, 101365, 108214, 111487, 52734, 9343], [94790, 49256, 55304, 99372, 9070, 116528, 100626, 46108, 34237, 22623], [85219, 95845, 79309, 106102, 113498, 68959], [72129, 64398, 89486, 8401, 78771, 4791, 94680, 62780], [45243, 70836, 24884, 21307, 51485, 60543], [], [], [66904, 116121, 120591, 55599], [97974], [], [84137], [92640, 48365], [], [47475], [], [120816, 99260], [], [102499, 28488, 1385, 12403, 63926, 3869], [85470, 112350], [110051, 31812, 71560, 40369, 106616, 81181], [45636, 41510, 74634, 69136, 9590], [97223], [], [7417, 73746], [], [38179, 60051, 102725], [114599, 5033, 70602, 56493, 76111, 5043, 30932, 75989, 17114, 116767], [57921, 90915, 51395, 5607, 81052, 66301, 65535], [18632, 94041, 6707], [65665, 35851, 66092, 18351, 54069], [100034, 51460, 95797, 4908], [95681, 48886, 88479], [10469], [24822], [13605, 5961, 67660, 16571, 71025, 24305, 7891, 7315, 120791, 46297, 73147, 31806], [5348, 72235, 335, 65975, 98137, 33145, 100442, 68413, 31839], [60852, 37734, 30351], [], [72965, 7504, 109045, 45589, 13337, 109437], [75145], [24837, 102129, 15668, 37559, 25880, 2367], [3363, 42308, 110777, 62125, 15476, 29399, 64889, 88123], [], [98037, 104582, 97855], [], [3843, 32398, 114225, 70297, 35128, 61177, 7288, 64349], [23294], [13761, 111490, 5091, 100168, 71848, 94314, 18800, 36307, 113108, 44597, 5080, 58367], [68829, 88165, 98543], [], [], [88763, 48807], 
[51075, 75997], [], [102566], [104965, 49349, 95562, 24620, 113790, 85246], [100497], [], [119832, 5187, 11372, 8773], [51602], [43715, 18379, 43285, 58554, 75101], [119360, 8468, 71391], [80514, 104003, 52615, 112954, 108173, 70829, 37148, 79125, 55000, 116954, 22235, 35996], [106676, 15526], [109540, 68328, 113332, 114388, 79737], [34480, 52665, 33832, 24269], [15563, 71441, 22674, 66331, 78556], [45562, 47027, 84935], [], [34537, 19247, 33968, 66033, 32118, 53180], [27867, 87742], [], [3547, 7092, 105503], [106498, 41635]]}
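For anyone building their own data in this format, here is a minimal sketch of a sanity check based on the data point above. It assumes trainset.txt stores one JSON object per line, which is not confirmed in this thread. The alignment rules it checks are only what can be read off the example: `post_triples` has one entry per `post` token, and both `match_response_index_*` lists have one entry per `response` token. The helper names and the toy record are hypothetical.

```python
import json

# Keys that appear in the example data point above.
EXPECTED_KEYS = {
    "post", "response", "post_triples",
    "all_entities_one_hop", "all_triples_one_hop",
    "match_response_index_one_hop",
    "only_two", "match_response_index_only_two",
    "one_two_triple",
}


def parse_record(line):
    """Parse one line as JSON (assumed layout) and check that all keys exist."""
    record = json.loads(line)
    missing = EXPECTED_KEYS - set(record)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return record


def check_alignment(record):
    """Return the names of per-token fields whose length does not match their tokens."""
    problems = []
    if len(record["post_triples"]) != len(record["post"]):
        problems.append("post_triples")
    for key in ("match_response_index_one_hop", "match_response_index_only_two"):
        if len(record[key]) != len(record["response"]):
            problems.append(key)
    return problems


# A tiny made-up record with the same shape; the IDs are placeholders, not real indices.
toy_line = json.dumps({
    "post": ["hh", "is", "strong"],
    "response": ["maybe", "?"],
    "post_triples": [1, 0, 2],
    "all_entities_one_hop": [515, 3078],
    "all_triples_one_hop": [30722, 63497],
    "match_response_index_one_hop": [-1, -1],
    "only_two": [5174],
    "match_response_index_only_two": [-1, -1],
    "one_two_triple": [[], [120480]],
})

if __name__ == "__main__":
    print(check_alignment(parse_record(toy_line)))
```

An empty list from `check_alignment` means the per-token fields line up. What the integer IDs refer to (presumably indices into the vocabularies built from resource.txt) is not stated here, so the sketch does not validate them.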
Thanks for the interest!
Due to Reddit's policy, we are not able to share the dataset in this repo. However, I'll post an email address in the repo so that anyone interested can request the data by email.
We are cleaning up the code and data, and this repo will be updated after ACL 2020.
I didn't see your email address. Will it be posted after ACL 2020? Can I then email you to request the data?
Yes, I'll post an email address in the repo once it's ready. You can then request the data by email.
Thank you very much for sharing your work!
I would like to apply your model to another dataset. May I know how a raw conversation dataset is processed into resource.txt, trainset.txt, etc.?
I'm aware that your dataset is the same as the CCM dataset, but the CCM authors didn't explain this.
Thanks, Peixiang