131250208 / TPlinker-joint-extraction


Prepare my own text dataset #84

Open AnwarsaeedDMU opened 1 year ago

AnwarsaeedDMU commented 1 year ago

@131250208 I find this work very interesting, and the results on NYT and WebNLG are very good. I want to apply it to my own text data to extract triples. I searched but could not find a proper way to prepare a dataset in the same format as NYT, so that I could use it with this model or with other deep learning models. The articles I am going to prepare look like this:

Cruise ship NORWEGIAN SUN hit an iceberg size of grand piano on Jun 25 off Hubbard Glacier, Gulf of Alaska, and is understood to suffer hull breach and probably, other damages. The ship called Juneau, maybe to disembark passengers, and left Juneau afterwards, bound for Victoria BC, ETA Jun 30. Small iceberg was hit in dense fog, almost no chance of spotting it visually (providing ship’s command kept lookout watch on f’castle), while all the electronic wizardry is as good as useless. A pair of human eyes, armed with binoculars, is still irreplaceable, at least under some circumstances.

Manually preparing thousands of articles is really hard.
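
To make the question concrete, here is a minimal sketch of what one annotated record for a sentence like the one above might look like, and how the character offsets could be filled in automatically once the triples are known. The field names (`relation_list`, `subj_char_span`, `obj_char_span`, `predicate`), the helper functions, and the relation labels `hit` and `located_in` are all illustrative assumptions, not the confirmed input format of this repository; they should be checked against the format the repository documents.

```python
import json


def find_span(text, mention):
    """Return [start, end) character offsets of the first occurrence of
    `mention` in `text`, or None if it does not occur."""
    start = text.find(mention)
    if start == -1:
        return None
    return [start, start + len(mention)]


def build_record(doc_id, text, triples):
    """Build one JSON-serialisable record from (subject, predicate, object)
    triples. The field names are assumed for illustration only."""
    relation_list = []
    for subj, pred, obj in triples:
        subj_span = find_span(text, subj)
        obj_span = find_span(text, obj)
        # Skip triples whose mentions cannot be located verbatim in the text.
        if subj_span is None or obj_span is None:
            continue
        relation_list.append({
            "subject": subj,
            "predicate": pred,
            "object": obj,
            "subj_char_span": subj_span,
            "obj_char_span": obj_span,
        })
    return {"id": doc_id, "text": text, "relation_list": relation_list}


text = ("Cruise ship NORWEGIAN SUN hit an iceberg size of grand piano "
        "on Jun 25 off Hubbard Glacier, Gulf of Alaska")

# Hypothetical hand-written triples; the relation labels are made up.
triples = [
    ("NORWEGIAN SUN", "hit", "iceberg"),
    ("Hubbard Glacier", "located_in", "Gulf of Alaska"),
]

record = build_record("doc-0", text, triples)
print(json.dumps(record, indent=2))
```

The sketch only automates the offset bookkeeping. The triples themselves still have to come from somewhere, for example manual annotation, and `find_span` takes the first verbatim match only, so repeated or paraphrased mentions would need extra handling.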