import json

# Load the demo document; the context manager closes the file handle.
with open('data/demo.json', 'r', encoding='utf-8') as f:
    sample = json.load(f)
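# Number of sentences in the document.
print(len(sample['sentences']))
# Full text with the first and last token of every sentence dropped.
print(''.join([''.join(sent[1:-1]) for sent in sample['sentences']]))
# Token-level lookups: token 410, then tokens 388-403, after joining and re-splitting on spaces.
print(''.join([' '.join(sent) for sent in sample['sentences']]).split(' ')[410:411])
print(''.join([' '.join(sent) for sent in sample['sentences']]).split(' ')[388:404])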
# print(''.join([''.join(sent[1:-1]) for sent in sample['sentences']])[54:56])
# print(''.join([''.join(sent[1:-1]) for sent in sample['sentences']])[1244:1247])
# print(''.join([''.join(sent[1:-1]) for sent in sample['sentences']])[1292:1295])
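Hello, I have a question: each list in clusters should contain the mentions of one coreferent entity (that is, share the same coreference index), right? But when I use those indices to look up the spans in the original document, they do not seem to match.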
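One possible source of the mismatch, offered as a sketch rather than a confirmed diagnosis: the token-level lookups in the script join the sentences with an empty string and then split on spaces, which fuses the last token of each sentence with the first token of the next and shifts every later index. The example below assumes that each cluster is a list of [start, end] spans with an inclusive end, indexed into the flattened token list. That layout is an assumption about the data format, and the `toy` document and the helper names `flatten_tokens` and `cluster_mentions` are hypothetical.

```python
from itertools import chain


def flatten_tokens(sentences):
    """Concatenate per-sentence token lists into one document-level list."""
    return list(chain.from_iterable(sentences))


def cluster_mentions(sentences, clusters):
    """Return the mention text for every span in every cluster.

    Assumes each span is [start, end] with an inclusive end index into the
    flattened token list (an assumption, not confirmed by the dataset).
    """
    tokens = flatten_tokens(sentences)
    return [[' '.join(tokens[start:end + 1]) for start, end in cluster]
            for cluster in clusters]


# Hypothetical toy document, not taken from data/demo.json.
toy = {
    'sentences': [['[CLS]', 'Alice', 'saw', 'Bob', '[SEP]'],
                  ['[CLS]', 'She', 'waved', '[SEP]']],
    'clusters': [[[1, 1], [6, 6]]],
}

# Lookup against the flattened token list.
mentions = cluster_mentions(toy['sentences'], toy['clusters'])
print(mentions)

# Join-then-split lookup, as in the script: sentence boundaries fuse into
# a single token, so the list is one token shorter per boundary.
joined = ''.join(' '.join(sent) for sent in toy['sentences']).split(' ')
print(len(flatten_tokens(toy['sentences'])), len(joined))
print(joined[6])
```

If the flattened lookup returns the expected mentions while the join-then-split lookup is off by a growing number of tokens, the offset comes from the lookup code and not from the cluster indices themselves.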
For example, here is one case. The expected answer is:
Output: