I have the following example:
Sentence A: a # 9.8 m deficit recorded for 2014/15 at an essex hospital is to be investigated by a health service watchdog.
Sentence B: A £9.8m deficit recorded for 2014/15 at an Essex hospital is to be investigated by a health service watchdog.
When I run the following:
import simalign
myaligner = simalign.SentenceAligner(token_type="word")
aligns = myaligner.get_word_aligns(sentence_A, sentence_B)['itermax']
This produces alignments of the form: [(0, 0), (2, 1), (4, 2), (5, 3), (6, 4), (7, 5), (8, 6), (9, 7), (10, 8), (11, 9), (12, 10), (13, 11), (14, 12), (15, 13), (16, 14), (17, 15), (18, 16), (19, 17), (20, 18)]
I cannot figure out how you then produce a matching of the form: [(0, 0), (1, 1), (2, 1), (3, 1), (4, 2), (5, 3), (6, 4), (7, 5), (8, 6), (9, 7), (10, 8), (11, 9), (12, 10), (13, 11), (14, 12), (15, 13), (16, 14), (17, 15), (18, 16), (19, 17), (20, 18)]
The interactive website appears to do this in order to produce its graphs, but I cannot find where anything of this form happens in the code provided.
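To make the gap concrete, here is a small check in plain Python that compares the two lists above. It does not call simalign at all. The variable names are my own, and I am assuming whitespace tokenization for both sentences, which is consistent with the indices in the output but may not be exactly what the library does internally.

```python
# Plain-Python comparison of the two alignment lists; no simalign calls.
# Assumption: both sentences are tokenized by splitting on whitespace.
sentence_a = ("a # 9.8 m deficit recorded for 2014/15 at an essex hospital "
              "is to be investigated by a health service watchdog.")
sentence_b = ("A £9.8m deficit recorded for 2014/15 at an Essex hospital "
              "is to be investigated by a health service watchdog.")

src_tokens = sentence_a.split()
trg_tokens = sentence_b.split()

# The 'itermax' output I get, and the matching I would like to obtain.
itermax = [(0, 0), (2, 1)] + [(i, i - 2) for i in range(4, 21)]
wanted = [(0, 0), (1, 1), (2, 1), (3, 1)] + [(i, i - 2) for i in range(4, 21)]

# Source positions that have no link at all in the 'itermax' output.
aligned_src = {i for i, _ in itermax}
unaligned_src = [i for i in range(len(src_tokens)) if i not in aligned_src]

# Pairs present in the wanted matching but absent from 'itermax'.
missing_pairs = sorted(set(wanted) - set(itermax))

print([(i, src_tokens[i]) for i in unaligned_src])  # [(1, '#'), (3, 'm')]
print(missing_pairs)                                # [(1, 1), (3, 1)]
print(trg_tokens[1])                                # £9.8m
```

So the only difference is that the source tokens "#" and "m", which 'itermax' leaves unaligned, are also linked to the target token "£9.8m". That extra step is the one I cannot locate.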
Thanks in advance!