vcvpaiva opened 7 years ago
More generally, can we have the numbers as in https://github.com/own-pt/rte-sick/blob/master/expanded/conllu/all.conllu.root.txt?
Please find all unique roots in the stats folder.
@kkalouli I think it would be better to have the part of speech for the roots. That way we can compare the number of copulas with the number of roots that are nouns or adjectives, and we can see whether anything that is neither is considered a root.
In any case, I found these roots, which do not seem OK to me:
singing 34
front 11
brushing 11
dancing 9
folding 9
rock 8
silent 8
climbing 8
naked 7
drunk 7
diving 7
surround 7
parking 6
grazing 6
empty 6
biking 6
pacing 6
person 6
landing 6
For example, for "singing & noun" I found:
There is no clown singing and people are not dancing.
There is no clown singing.
A costumed performer is singing and people are dancing.
Here the dependencies are not working, as "singing" should have been a verb.
Please find the new list including the pos tags in the stats folder.
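For reference, a root list with POS tags could be produced along these lines. This is only a minimal sketch: it assumes tab-separated rows in the standard 10-column CoNLL-U layout (lemma in column 3, Penn-style tag in column 5, dependency relation in column 8), which may not match the actual files, so the column indices are parameters. The function name `count_roots` and the sample rows are hypothetical.

```python
from collections import Counter

# Assumed 0-based column positions in a tab-separated CoNLL-U row.
# These follow the standard 10-column layout and may need adjusting.
LEMMA, XPOS, DEPREL = 2, 4, 7


def count_roots(lines, lemma_col=LEMMA, pos_col=XPOS, deprel_col=DEPREL):
    """Count (lemma, POS) pairs for every token whose relation is 'root'."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip rows that are too short to hold the columns we need.
        if len(cols) <= max(lemma_col, pos_col, deprel_col):
            continue
        if cols[deprel_col] == "root":
            counts[(cols[lemma_col], cols[pos_col])] += 1
    return counts


# Hypothetical sample rows, written in the assumed layout.
sample = [
    "# text = A baby is laughing .",
    "1\tA\ta\tDET\tDT\tDefinite=Ind\t2\tdet\t_\t_",
    "2\tbaby\tbaby\tNOUN\tNN\tNumber=Sing\t4\tnsubj\t_\t_",
    "3\tis\tbe\tAUX\tVBZ\tTense=Pres\t4\taux\t_\t_",
    "4\tlaughing\tlaugh\tVERB\tVBG\t_\t0\troot\t_\t_",
    "5\t.\t.\tPUNCT\t.\t_\t4\tpunct\t_\t_",
    "",
]

for (lemma, pos), n in count_roots(sample).most_common():
    print(lemma, pos, n)
```

Grouping by (lemma, POS) rather than by lemma alone makes cases such as "singing" tagged as a noun show up as separate entries.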
In the example:
A baby is laughing .
1 A a DT Definite=Ind 2 det 2:det U
2 baby baby NN Number=Sing 4 nsubj 4:nsubj n09827683:0.3974235736313493,n09827519:0.06167156876081363,n09918554:0.02073727521829126,n09828216:0.009570097613152126,n09827363:0.0052531796421179805,n01322221:0.0032179779444162413,n00796767:0.002126327189859401|HumanBaby=
3 is be VBZ Person=3|Tense=Pres 4 aux 4:aux v02604760:0.374466717142241,v02616386:0.11868943406506324,v02655135:0.05115870391683428,v02603699:0.026794267881151218,v02749904:0.020302773581458825,v02664769:0.004231483591539528,v02620587:0.011527470971813582,v02445925:0.008782138965198173,v02697725:0.006057304179186997,v02268246:0.003126857039036189,v02614181:0.004562086622892762,v02744820:0.0025535259373098106,v02702508:9.444234663786115E-4|Entity+
4 laughing laugh VBG 0 root v00031820:1|Laughing=
5 . Definite=Ind . 4 punct 2:det U
The root matches the SUMO concept exactly, which is a good sign that the representation is good. In general, a SUMO concept marked with an equals sign is a good sign. How many roots do we have like this?
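One way to answer that question would be to tally the suffix of the SUMO tag on each root token. The sketch below is hedged in the same way as any quick script: it assumes tab-separated rows with the dependency relation in column 8 and the sense/SUMO annotation in the last (10th) column, formatted as in the example (`senses|Concept=` or `senses|Concept+`). The names `sumo_suffix` and `count_root_sumo` and the sample rows are hypothetical, and only the two suffixes seen in the example are distinguished.

```python
from collections import Counter

# Assumed 0-based column positions; adjust to the actual file layout.
DEPREL, MISC = 7, 9


def sumo_suffix(misc):
    """Classify an annotation such as 'v00031820:1|Laughing=' by its last character."""
    tag = misc.rsplit("|", 1)[-1]  # the part after the last '|'
    if tag.endswith("="):
        return "equals"
    if tag.endswith("+"):
        return "plus"
    return "other"


def count_root_sumo(lines, deprel_col=DEPREL, misc_col=MISC):
    """Count root tokens by the suffix of their SUMO tag."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) <= max(deprel_col, misc_col):
            continue
        if cols[deprel_col] == "root":
            counts[sumo_suffix(cols[misc_col])] += 1
    return counts


# Hypothetical rows in the assumed layout; sense lists are shortened.
rows = [
    "3\tis\tbe\tAUX\tVBZ\tTense=Pres\t4\taux\t4:aux\tv02604760:0.37|Entity+",
    "4\tlaughing\tlaugh\tVERB\tVBG\t_\t0\troot\t_\tv00031820:1|Laughing=",
]

print(count_root_sumo(rows))
```

Dividing the "equals" count by the total number of roots would give the share of sentences whose root maps exactly to a SUMO concept.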