Closed alexanderkoller closed 5 years ago
Am I reading the code correctly that we can currently only have one TokenRange for each node in an amconll file?
This is limiting for UCCA, where we see e.g. the following annotation for 700009:
{"id": "700009", "flavor": 1, "framework": "ucca", "version": 0.9, "time": "2019-05-18 (06:35)", "input": "It included \"All Along the Watchtower\", with lyrics derived from the Book of Isaiah (21:5-9). The song was later recorded by Jimi Hendrix, whose version Dylan later acknowledged as definitive.", "tops": [37], "nodes": [{"id": 0, "anchors": [{"from": 0, "to": 2}]}, {"id": 1, "anchors": [{"from": 3, "to": 11}]}, {"id": 2, "anchors": [{"from": 12, "to": 13}]}, {"id": 3, "anchors": [{"from": 13, "to": 16}, {"from": 17, "to": 22}, {"from": 23, "to": 26}, {"from": 27, "to": 37}]}, ...
If I want to change this to support multiple token ranges per node, is there something that I should pay specific attention to?
@namednil @mariomgmn
Yes, that's correct. Maybe throw an exception in ranges() if one word has multiple token ranges?
ranges()
Not needed after all - mtool handles this gracefully.
Am I reading the code correctly that we can currently only have one TokenRange for each node in an amconll file?
This is limiting for UCCA, where we see e.g. the following annotation for 700009:
If I want to change this to support multiple token ranges per node, is there something that I should pay specific attention to?
@namednil @mariomgmn