dkpro / dkpro-core

Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core

feature-request: BratReader should not require mappings for all Brat labels #1417

alaindesilets closed this issue 4 years ago

alaindesilets commented 4 years ago

It would be very useful if BratReader were able to read .ann files that contain labels for which there is no known UIMA type mapping. These unknown labels would simply be mapped to a generic type called, say, GenericLabel. Instances of GenericLabel would have a getLabel() method that returns the Brat label.

I am prepared to do this if the dev team is OK with it.

reckart commented 4 years ago

The 'from' field in the mappings, which is matched against the brat type, is treated as a regular expression. So if you build a mapping like this, I think it should work:


'textTypeMapppings': [
  {
    'from': 'known_brat_label',
    'to': 'my.company.KnownUimaType'
  },
  {
    'from': '.*', // <= match any brat label not matched before
    'to': 'my.company.GenericUimaType', 
    'subCatFeature': 'label' // <= store brat label into this feature
  }
]
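
To illustrate why the catch-all entry works, here is a minimal standalone sketch of ordered, regex-based label resolution. It does not use DKPro Core at all: the class `MappingSketch` and the method `resolve` are hypothetical names, and it assumes that mappings are tried in the order listed, that the first match wins (as the "not matched before" comment suggests), and that the pattern has to match the whole label. The actual BratReader implementation may differ in these details.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the DKPro Core API.
class MappingSketch {
    // Ordered rules as {from (regex on the brat label), to (UIMA type name)},
    // using the placeholder names from the mapping above.
    private static final List<String[]> RULES = Arrays.asList(
            new String[] { "known_brat_label", "my.company.KnownUimaType" },
            new String[] { ".*", "my.company.GenericUimaType" });

    // Returns the 'to' value of the first rule whose 'from' pattern matches
    // the entire label, or null if no rule matches (assumed semantics).
    static String resolve(String bratLabel) {
        for (String[] rule : RULES) {
            if (Pattern.matches(rule[0], bratLabel)) {
                return rule[1];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Matched by the specific rule before the catch-all is reached.
        System.out.println(resolve("known_brat_label")); // my.company.KnownUimaType
        // Falls through to the '.*' rule.
        System.out.println(resolve("some_other_label")); // my.company.GenericUimaType
    }
}
```

Under these assumptions the order of the entries matters: if the '.*' rule came first, it would also capture known_brat_label and the specific mapping would never apply.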

alaindesilets commented 4 years ago

Thx Richard.

In the above, what mappings are you referring to? Can you provide a code sample that shows how to actually activate this kind of mapping?

Thx.

reckart commented 4 years ago

See https://github.com/dkpro/dkpro-core/blob/09065ee18779189c00a68034480e95449f27fb91/dkpro-core-io-brat-asl/src/test/java/org/dkpro/core/io/brat/BratReaderWriterTest.java#L181-L247

reckart commented 4 years ago

@alaindesilets Did it work for you?

reckart commented 4 years ago

I assume it worked and no further action is required.

alaindesilets commented 4 years ago

Yes, it worked. Thx.

