propbank / propbank-release

The official released annotations, both in .prop pointer format and as CoNLL files. Does not contain the source texts.
Creative Commons Attribution Share Alike 4.0 International

Missing data from OntoNotes 5.0 #9

Closed arademaker closed 2 years ago

arademaker commented 4 years ago

Comparing the OntoNotes 5.0 *.parse files with the *.gold_skel files in the data/ontonotes/ directory, I found 5,897 .parse files that have no corresponding .gold_skel file. All of the missing files are under data/ontonotes/wb/. The number of missing files per subdirectory is listed below:

   1 wb/sel/03
   1 wb/sel/04
   2 wb/sel/05
  28 wb/sel/09
   1 wb/sel/10
  20 wb/sel/11
   1 wb/sel/18
  39 wb/sel/22
  55 wb/sel/23
  37 wb/sel/24
  43 wb/sel/25
  62 wb/sel/26
  51 wb/sel/27
  44 wb/sel/28
  34 wb/sel/29
  40 wb/sel/30
  51 wb/sel/31
  48 wb/sel/32
  58 wb/sel/33
  52 wb/sel/34
  58 wb/sel/35
  71 wb/sel/36
  88 wb/sel/37
  81 wb/sel/38
  94 wb/sel/39
  97 wb/sel/40
  99 wb/sel/41
  79 wb/sel/42
  85 wb/sel/43
  97 wb/sel/44
  99 wb/sel/45
  76 wb/sel/46
  54 wb/sel/47
  78 wb/sel/48
  93 wb/sel/49
  66 wb/sel/50
  96 wb/sel/51
  96 wb/sel/52
  83 wb/sel/53
  42 wb/sel/54
  31 wb/sel/55
  93 wb/sel/56
  90 wb/sel/57
  93 wb/sel/58
  58 wb/sel/59
  82 wb/sel/60
  93 wb/sel/61
  94 wb/sel/62
  96 wb/sel/63
  89 wb/sel/64
  98 wb/sel/65
  98 wb/sel/66
  83 wb/sel/67
  79 wb/sel/68
  95 wb/sel/69
  94 wb/sel/70
  95 wb/sel/71
  90 wb/sel/72
  94 wb/sel/73
  97 wb/sel/74
  92 wb/sel/75
  82 wb/sel/76
  93 wb/sel/77
  92 wb/sel/78
  96 wb/sel/79
 100 wb/sel/80
  99 wb/sel/81
  88 wb/sel/82
  84 wb/sel/83
  77 wb/sel/84
  40 wb/sel/85
  78 wb/sel/86
 100 wb/sel/87
  96 wb/sel/88
  99 wb/sel/89
  56 wb/sel/90
  46 wb/sel/91
  57 wb/sel/92
  56 wb/sel/93
  84 wb/sel/94
  69 wb/sel/95
  59 wb/sel/96
  79 wb/sel/97
  33 wb/sel/98
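
For anyone who wants to reproduce a count like the one above, here is a minimal sketch in Python. It is not the script used to produce this list. It assumes that the OntoNotes tree and the propbank-release tree use the same relative layout below their respective roots (for example wb/sel/03/...), and that a .parse file and its .gold_skel counterpart differ only in their suffix. The function names and both root paths are hypothetical.

```python
from collections import Counter
from pathlib import Path, PurePosixPath


def relative_stems(root: Path, suffix: str) -> set:
    """Collect paths relative to root, with the suffix removed, for every file ending in suffix."""
    return {
        p.relative_to(root).as_posix()[: -len(suffix)]
        for p in root.rglob(f"*{suffix}")
        if p.is_file()
    }


def missing_by_directory(parse_root: Path, skel_root: Path) -> Counter:
    """Count .parse files without a matching .gold_skel file, grouped by subdirectory.

    Assumption: both trees share the same relative directory layout.
    """
    missing = relative_stems(parse_root, ".parse") - relative_stems(skel_root, ".gold_skel")
    return Counter(PurePosixPath(stem).parent.as_posix() for stem in missing)


def print_report(counts: Counter) -> None:
    """Print one line per subdirectory, sorted by path, followed by the total."""
    for directory in sorted(counts):
        print(f"{counts[directory]:4d} {directory}")
    print(f"{sum(counts.values()):4d} total")


# Example call (both paths are placeholders, adjust them to your local checkout):
# print_report(missing_by_directory(Path("/path/to/ontonotes/annotations"),
#                                   Path("/path/to/propbank-release/data/ontonotes")))
```

Comparing sets of suffix-stripped relative paths keeps the check independent of file contents, so it only reports files that are absent, not files that differ.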

Is there a reason these files are not included in the PropBank release?

arademaker commented 2 years ago

This is also related to #2 and #13.