hayesall closed this issue 3 years ago
I usually assume examples and facts should look like this:
example(one).
example(one,two).
Multiple places in the data violate this:
Blank lines:
https://github.com/srlearn/datasets/blob/084197b2d50f2d8f5674d29867a634ff9fccbe71/srlearn/uwcse/uwcse/fold2/train/train_facts.txt#L1-L4
https://github.com/srlearn/datasets/blob/084197b2d50f2d8f5674d29867a634ff9fccbe71/srlearn/uwcse/uwcse/fold3/train/train_facts.txt#L731-L734
https://github.com/srlearn/datasets/blob/084197b2d50f2d8f5674d29867a634ff9fccbe71/srlearn/uwcse/uwcse/fold3/train/train_facts.txt#L1181-L1183
Furthermore, these should probably be normalized to remove the spaces around commas and to fix other inconsistencies.
SRLBoost and BoostSRL derivatives allow quite a few additional symbols in the grammar (including `%` comments and `//-` comments).
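For illustration, here is a rough sketch of the kind of normalization I have in mind. It is not the script used for the actual fix. The function name `normalize_facts` is made up, and it assumes one literal per line, that `%` and `//` (which covers `//-`) start a comment running to the end of the line, and that no constant is a quoted string containing whitespace or comment symbols.

```python
import re
from typing import List

# Assumption: `%` and `//` begin a comment that runs to the end of the line.
_COMMENT = re.compile(r"(%|//).*$")
_WHITESPACE = re.compile(r"\s+")


def normalize_facts(text: str) -> List[str]:
    """Return one whitespace-free literal per non-blank, non-comment line."""
    cleaned = []
    for line in text.splitlines():
        line = _COMMENT.sub("", line)     # drop trailing or whole-line comments
        line = _WHITESPACE.sub("", line)  # drop spaces around commas and parentheses
        if line:                          # skip lines that are now empty
            cleaned.append(line)
    return cleaned


sample = "\n\nprofessor( person1 ).\n% a comment\nadvisedby(person2 , person1). //- trailing\n"
print(normalize_facts(sample))
# ['professor(person1).', 'advisedby(person2,person1).']
```

Stripping all whitespace is the bluntest option. If any dataset turns out to have quoted constants with spaces in them, this would need a real tokenizer instead of regular expressions.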
uwcse:
citeseer:
cora:
Fixed in #10