BERT - Dataset loader bidirectional around targets

umcu / negation-detection

Negation detection in Dutch clinical text.

GNU General Public License v3.0


BERT - Dataset loader bidirectional around targets #46

Open bramiozo opened 2 years ago

bramiozo commented 2 years ago

At the moment the TextDatasetFromDataFrame class collects tag/entity sequences from the start of the document until the block is full. This is not ideal, because the block is then not guaranteed to include context on both sides of a target term. We would rather have the block of text surround the target terms:

if there is one target term, center the block on that term
if there are N > 1 target terms, center the block on the term at index int(N/2)
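
The two rules above could be sketched roughly as follows. This is only an illustration, not the existing TextDatasetFromDataFrame code: the function name `centered_window`, its parameters, the zero-based indexing of target terms, and the fallback for documents without targets are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def centered_window(
    n_tokens: int, target_positions: Sequence[int], block_size: int
) -> Tuple[int, int]:
    """Return (start, end) token offsets of a block centered on a target term.

    Hypothetical helper: `target_positions` holds the token index of each
    target term in the document, `end` is exclusive.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not target_positions:
        # Assumption: with no targets, keep the current behaviour and
        # fill the block from the start of the document.
        return 0, min(block_size, n_tokens)

    ordered = sorted(target_positions)
    # One term: index 0. Several terms: the int(N/2)-th term (zero-based).
    center = ordered[len(ordered) // 2]

    # Put half of the block before the center term, the rest after it.
    start = center - block_size // 2
    # Shift the window back inside the document if it overshoots either end.
    start = max(0, min(start, n_tokens - block_size))
    end = min(start + block_size, n_tokens)
    return start, end


# Usage: slice the token and tag sequences with the same offsets.
tokens = [f"tok{i}" for i in range(20)]
start, end = centered_window(len(tokens), [2, 9, 17], block_size=6)
block = tokens[start:end]
```

Clamping the window rather than padding it keeps the block full whenever the document is long enough, at the cost of the target being off-center near the document boundaries. Whether that trade-off suits the BERT input pipeline here is an open design choice.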