UniversalDependencies / UD_English-GUM

Missing multi-word token ranges #69

Closed rhdunn closed 11 months ago

rhdunn commented 11 months ago

The following sentences contain tokens that don't have multi-word token range annotations:

ERROR: Sentence GUM_conversation_grounded-10 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_conversation_grounded-48 token 10 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_grounded-51 token 15 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_grounded-61 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_conversation_grounded-65 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_conversation_grounded-66 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_conversation_grounded-110 token 5 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_grounded-115 token 11 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_grounded-121 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_conversation_grounded-140 token 3 -- multi-word continuation without a multi-word token range for 'what]['re'
ERROR: Sentence GUM_conversation_risk-95 token 11 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_risk-118 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_interview_gaming-19 token 4 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_interview_gaming-22 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_speech_inauguration-18 token 16 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_speech_inauguration-32 token 37 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_portland-18 token 14 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_portland-18 token 33 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_portland-21 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_vlog_portland-24 token 15 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_portland-30 token 4 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_portland-30 token 11 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_portland-36 token 8 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_portland-36 token 12 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_portland-43 token 8 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_vlog_portland-43 token 16 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_vlog_radiology-19 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_vlog_radiology-27 token 13 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_overalls-40 token 13 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_lambada-11 token 9 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_lambada-12 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_retirement-9 token 15 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_retirement-11 token 4 -- multi-word continuation without a multi-word token range for 'ought][a'
ERROR: Sentence GUM_conversation_retirement-52 token 15 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_london-9 token 42 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_london-18 token 27 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_london-21 token 27 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_vlog_studying-2 token 10 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_studying-11 token 22 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_studying-23 token 30 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_vlog_studying-31 token 4 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_studying-40 token 8 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_vlog_studying-43 token 11 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_voyage_vavau-20 token 13 -- multi-word continuation without a multi-word token range for 'º][C'
ERROR: Sentence GUM_bio_marbles-17 token 17 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_conversation_atoms-4 token 7 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_atoms-15 token 11 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_atoms-42 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_conversation_atoms-80 token 4 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_atoms-139 token 16 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_atoms-142 token 12 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_atoms-146 token 11 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_blacksmithing-3 token 4 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_conversation_blacksmithing-7 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_conversation_blacksmithing-38 token 21 -- multi-word continuation without a multi-word token range for 'kind][a'
ERROR: Sentence GUM_conversation_blacksmithing-54 token 10 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_blacksmithing-54 token 30 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_blacksmithing-62 token 4 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_blacksmithing-79 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_conversation_blacksmithing-82 token 5 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_blacksmithing-94 token 21 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_blacksmithing-95 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_conversation_blacksmithing-105 token 3 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_conversation_court-47 token 23 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_court-51 token 3 -- multi-word continuation without a multi-word token range for 'do][n-'
ERROR: Sentence GUM_conversation_court-64 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_conversation_court-66 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_conversation_court-67 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_court-81 token 6 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_court-89 token 20 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_conversation_erasmus-22 token 21 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_conversation_family-42 token 13 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_family-52 token 3 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_family-61 token 8 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_family-73 token 5 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_family-114 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_family-157 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_conversation_gossip-31 token 21 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_gossip-74 token 30 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_gossip-93 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_conversation_gossip-94 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_conversation_gossip-98 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_conversation_gossip-116 token 2 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_gossip-135 token 4 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_scientist-5 token 12 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_conversation_scientist-7 token 69 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_scientist-7 token 74 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_scientist-37 token 17 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_scientist-37 token 24 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_scientist-42 token 3 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_scientist-42 token 8 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_vet-8 token 5 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_conversation_vet-23 token 4 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_vet-24 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_conversation_vet-77 token 6 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_conversation_vet-79 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_conversation_zero-29 token 11 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_zero-63 token 18 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_conversation_zero-103 token 8 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_fiction_frankenstein-75 token 16 -- multi-word continuation without a multi-word token range for 'a][while'
ERROR: Sentence GUM_fiction_pag-91 token 3 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_fiction_pag-91 token 6 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_fiction_pag-92 token 3 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_fiction_pag-102 token 5 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_fiction_pag-107 token 17 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_fiction_pag-108 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_fiction_pag-110 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_fiction_rose-27 token 3 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_fiction_wedding-30 token 30 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_interview_ants-54 token 16 -- multi-word continuation without a multi-word token range for 'petri][dishes'
ERROR: Sentence GUM_interview_cocktail-58 token 18 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_interview_dungeon-42 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_interview_herrick-13 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_interview_mckenzie-36 token 40 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_interview_messina-37 token 7 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_interview_messina-42 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_interview_shalev-27 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_interview_shalev-45 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_interview_stardust-23 token 8 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_interview_stardust-33 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_interview_stardust-58 token 39 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_speech_data-11 token 20 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_speech_data-17 token 17 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_speech_data-31 token 7 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_speech_floyd-33 token 35 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_speech_trump-34 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_speech_trump-37 token 3 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_speech_trump-41 token 8 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_speech_trump-42 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_speech_trump-43 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_speech_trump-44 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_speech_trump-44 token 13 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_speech_trump-45 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_speech_trump-46 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_speech_trump-47 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_speech_trump-49 token 13 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_speech_trump-52 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_speech_trump-68 token 14 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_speech_trump-71 token 9 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_speech_trump-74 token 14 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_speech_trump-84 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_speech_trump-108 token 7 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_speech_trump-109 token 7 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_textbook_artwork-4 token 16 -- multi-word continuation without a multi-word token range for '司空][图'
ERROR: Sentence GUM_textbook_artwork-5 token 17 -- multi-word continuation without a multi-word token range for '谿山][琴况'
ERROR: Sentence GUM_vlog_appearance-20 token 4 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_appearance-35 token 9 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_college-5 token 23 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_college-27 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_vlog_college-34 token 5 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_college-37 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_vlog_college-58 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_vlog_college-59 token 9 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_college-64 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_vlog_college-70 token 4 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_college-81 token 7 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_college-85 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_vlog_college-88 token 3 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_college-97 token 6 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_covid-3 token 26 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_exams-84 token 4 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_hair-27 token 10 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_vlog_hair-27 token 49 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_vlog_hiking-3 token 11 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-6 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_vlog_hiking-12 token 4 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_hiking-13 token 9 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_vlog_hiking-13 token 26 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_hiking-14 token 17 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_hiking-18 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_vlog_hiking-18 token 23 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-19 token 6 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-19 token 40 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-21 token 5 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-22 token 14 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-23 token 16 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-24 token 12 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-36 token 18 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-36 token 33 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-36 token 36 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-37 token 19 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-37 token 24 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-64 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_vlog_hiking-70 token 4 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_hiking-70 token 21 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_lipstick-40 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_vlog_lipstick-46 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_vlog_lipstick-74 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_vlog_pizzeria-5 token 31 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_pizzeria-8 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_vlog_pizzeria-11 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_vlog_pizzeria-13 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_vlog_pizzeria-16 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_vlog_pizzeria-51 token 8 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_pizzeria-60 token 2 -- multi-word continuation without a multi-word token range for 'We]['re'
ERROR: Sentence GUM_vlog_pizzeria-60 token 7 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_pizzeria-81 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_vlog_pizzeria-102 token 10 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_pizzeria-109 token 5 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_vlog_pizzeria-112 token 2 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_vlog_pregnant-14 token 5 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_pregnant-14 token 9 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_pregnant-15 token 8 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_pregnant-31 token 14 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_pregnant-51 token 11 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_vlog_wine-10 token 19 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_wine-10 token 39 -- multi-word continuation without a multi-word token range for 'we]['re'
ERROR: Sentence GUM_vlog_wine-44 token 23 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_voyage_thailand-12 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_arrogant-6 token 8 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_arrogant-20 token 4 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_arrogant-23 token 19 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_arrogant-36 token 9 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_arrogant-37 token 12 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_arrogant-48 token 21 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_arrogant-58 token 3 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_ballet-6 token 19 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_ballet-24 token 14 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_ballet-24 token 28 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_ballet-27 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_ballet-33 token 37 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_ballet-34 token 16 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_ballet-37 token 15 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_basil-42 token 25 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_cupcakes-7 token 2 -- multi-word continuation without a multi-word token range for 'They]['re'
ERROR: Sentence GUM_whow_cupcakes-34 token 7 -- multi-word continuation without a multi-word token range for 'o][C'
ERROR: Sentence GUM_whow_cupcakes-34 token 11 -- multi-word continuation without a multi-word token range for 'o][F'
ERROR: Sentence GUM_whow_cupcakes-50 token 22 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_cupcakes-58 token 7 -- multi-word continuation without a multi-word token range for 'o][C'
ERROR: Sentence GUM_whow_cupcakes-58 token 11 -- multi-word continuation without a multi-word token range for 'o][F'
ERROR: Sentence GUM_whow_cupcakes-70 token 24 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_elevator-42 token 13 -- multi-word continuation without a multi-word token range for 'You]['re'
ERROR: Sentence GUM_whow_flirt-9 token 17 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_flirt-25 token 30 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_flirt-29 token 19 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_flirt-30 token 6 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_flirt-30 token 10 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_flirt-31 token 22 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_flirt-50 token 8 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_glowstick-4 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_glowstick-26 token 5 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_glowstick-27 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_glowstick-32 token 14 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_languages-40 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_procrastinating-2 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_quidditch-19 token 12 -- multi-word continuation without a multi-word token range for 'they]['re'
ERROR: Sentence GUM_whow_quidditch-65 token 3 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_quidditch-65 token 7 -- multi-word continuation without a multi-word token range for 'you]['re'
ERROR: Sentence GUM_whow_skittles-14 token 15 -- multi-word continuation without a multi-word token range for 'you]['re'

amir-zeldes commented 11 months ago

Thanks for catching these. I can fix most of them, but I'm not sure why the Chinese characters are being flagged here:

ERROR: Sentence GUM_textbook_artwork-4 token 16 -- multi-word continuation without a multi-word token range for '司空][图'
ERROR: Sentence GUM_textbook_artwork-5 token 17 -- multi-word continuation without a multi-word token range for '谿山][琴况'

The CoNLL-U says they are SpaceAfter=No, but that doesn't make them an English MWT, right?
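To make the distinction concrete, here is a minimal sketch of how the two annotations differ in CoNLL-U. A multi-word token is marked by a separate range line whose ID has the form `M-N`, whereas `SpaceAfter=No` is only a MISC value on a token. The sentences and the helper names `row` and `range_lines` are illustrative assumptions, not copied from GUM or from any validator.

```python
def row(token_id, form, misc="_"):
    """Build a 10-column CoNLL-U line with only ID, FORM and MISC filled in."""
    return "\t".join([token_id, form] + ["_"] * 7 + [misc])

# Illustrative sentences (hypothetical, not taken from the GUM data).
contraction = [
    row("1-2", "You're"),              # range line: one surface token covering words 1..2
    row("1", "You"),
    row("2", "'re"),
    row("3", "right", "SpaceAfter=No"),
    row("4", "."),
]
adjacent_hanzi = [
    row("1", "司空", "SpaceAfter=No"),  # no space in the text, but still two separate tokens
    row("2", "图"),
]

def range_lines(lines):
    """Return the IDs of multi-word token range lines (IDs of the form 'M-N')."""
    ids = [line.split("\t")[0] for line in lines]
    return [token_id for token_id in ids if "-" in token_id]

print(range_lines(contraction))
print(range_lines(adjacent_hanzi))
```

Only the contraction carries a range line; the adjacent Chinese tokens are simply written without a space between them.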

rhdunn commented 11 months ago

These were detected by simple heuristics (see https://github.com/UniversalDependencies/UD_English-PUD/issues/16#issuecomment-1741391475), so the Chinese character issues are a false positive in the logic, presumably because Chinese characters also have a letter Unicode general category (Lo).
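A rough sketch of how a check of this kind could produce that false positive. This is an assumption about the shape of the heuristic, not the actual logic from the linked comment; `is_letter` and `looks_like_continuation` are hypothetical names.

```python
import unicodedata

def is_letter(ch):
    """True if the character's Unicode general category is a letter (Lu, Ll, Lt, Lm, Lo)."""
    return unicodedata.category(ch).startswith("L")

def looks_like_continuation(prev_form, prev_space_after_no, form):
    """Hypothetical heuristic: the previous word has SpaceAfter=No and ends in a
    letter, and this word starts with a letter or an apostrophe."""
    if not prev_space_after_no:
        return False
    return is_letter(prev_form[-1]) and (form[0] == "'" or is_letter(form[0]))

# A genuine English contraction is flagged, as intended.
print(looks_like_continuation("You", True, "'re"))
# CJK ideographs are category Lo, so adjacent Chinese tokens are flagged too.
print(unicodedata.category("图"), looks_like_continuation("司空", True, "图"))
# Punctuation attached to a word is not flagged.
print(looks_like_continuation("right", True, "."))
```

Under this assumption, restricting the check to the scripts expected in an English treebank, or skipping tokens whose characters are all category Lo, would avoid flagging the two GUM_textbook_artwork sentences.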

amir-zeldes commented 11 months ago

OK, the legitimate errors should be fixed now