If you are using mosei_senti_data.pkl and want to retrieve the raw text by matching IDs against mosei.hdf5, consider using the following script to preprocess the data.
import pickle

import numpy as np
from tqdm import tqdm

# Load the original pickle.
with open('data/mosei_senti_data.pkl', 'rb') as f:
    file1 = pickle.load(f)
data = file1['test']['id']

# Keep the first element of each ID (the key) and append a running index per key.
modified_data = []
counters = {}
for element in tqdm(data, desc="Processing elements"):
    key = element[0]
    if key not in counters:
        counters[key] = 0
    modified_data.append(f"{key}[{counters[key]}]")
    counters[key] += 1
file1['test']['id'] = np.array(modified_data)

# Save the updated data under a new file name.
with open('data/mosei_new.pkl', 'wb') as f:
    pickle.dump(file1, f)
print('all done!')
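
The renaming step in the script above can be checked on toy data before running it on the real file. The sketch below is only an illustration: the helper name `number_segments` and the sample rows are hypothetical, and it assumes each ID entry is a sequence whose first field is the video key, as in the script.

```python
import numpy as np


def number_segments(ids):
    """Map each row to 'key[k]', where k counts earlier rows with the same key."""
    counters = {}
    renamed = []
    for element in ids:
        key = element[0]          # first field is assumed to be the video key
        k = counters.get(key, 0)  # how many times this key has been seen so far
        renamed.append(f"{key}[{k}]")
        counters[key] = k + 1
    return np.array(renamed)


# Hypothetical sample rows, not taken from the dataset.
sample = [("vidA", "0.0", "1.5"), ("vidB", "0.0", "2.0"), ("vidA", "1.5", "3.0")]
result = number_segments(sample)
print(result)
```

Note that the index is assigned in order of appearance, so the resulting IDs only line up with mosei.hdf5 if the segments of each video are stored in the same order in both files.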