Open vcvpaiva opened 8 years ago
It is not clear what the proposal for this issue is.
It is clear to me that the most important job we do when completing openwn-pt is to make English synsets non-empty in Portuguese. This is the real gain in information, so discovering when this happens would be a major advance. But we also need to be able to mark the synsets that we think do not exist in Portuguese (even if we change our mind later on), because this is the termination condition for the project. Within a year or two we need to be able to say: PWN has 117K synsets; OWN-PT has 75.6K synsets (this is a guess), of which 24,883 are Portuguese-only synsets.
I don't know how difficult this is, but since we got the gentilics accepted, it would be good to know how many synsets are currently empty and how many become non-empty when we vote. There are also lots of repetitions that differ only in capitalization, which we're trying to remove; maybe we can do a hackathon to get rid of the 5000 in this case.
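To make the "empty and becoming non-empty" count concrete, here is a rough sketch. It assumes we can take two snapshots (before and after a round of votes) as a mapping from synset id to the set of Portuguese words in it. The function name, the snapshot format, and the ids and words below are all made up for illustration and are not taken from the actual OWN-PT data or tooling.

```python
def newly_filled_synsets(before, after):
    """Return the sorted ids of synsets that had no Portuguese words in
    `before` but have at least one in `after`.

    Both arguments are hypothetical snapshots: dicts mapping a synset id
    to the set of Portuguese words in it. A synset missing from `before`
    is treated as empty.
    """
    return sorted(
        synset_id
        for synset_id, words in after.items()
        if words and not before.get(synset_id)
    )


# Made-up example data: three synsets, one of which gains a word.
before = {"s1": set(), "s2": {"casa"}, "s3": set()}
after = {"s1": {"carioca"}, "s2": {"casa"}, "s3": set()}

newly_filled = newly_filled_synsets(before, after)
still_empty = sorted(sid for sid, words in after.items() if not words)

print(len(newly_filled), "synsets became non-empty;", len(still_empty), "still empty")
```

The same two lists would also feed the termination count: synsets that are still empty after voting are the candidates to be marked as not existing in Portuguese.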