nlplab / brat

brat rapid annotation tool (brat) - for all your textual annotation needs
http://brat.nlplab.org

Configuring the Stanford CoreNLP server as an automatic annotation service #1216

Open nimble-software opened 7 years ago

nimble-software commented 7 years ago

Hi,

I am trying to get the Stanford CoreNLP server configured as an automatic annotation service, but I am getting the following error:

    Traceback (most recent call last):
      File "server/src/server.py", line 322, in serve
        return _safe_serve(params, client_ip, client_hostname, cookie_data)
      File "server/src/server.py", line 197, in _safe_serve
        json_dic = dispatch(http_args, client_ip, client_hostname)
      File "server/src/dispatch.py", line 308, in dispatch
        json_dic = action_function(*action_args)
      File "server/src/tag.py", line 149, in tag
        assert 'offsets' in ann_data, 'Tagger response lacks offsets'
    AssertionError: Tagger response lacks offsets

Brat Version: 1.3

Am I using the correct server? Or should I be using the NERServer?

Thanks in advance.

iramishtiaq commented 7 years ago

I have the same issue!

Franck-Dernoncourt commented 7 years ago

As a side note, https://github.com/Franck-Dernoncourt/NeuroNER can perform named-entity recognition on BRAT-formatted datasets.

Feynman27 commented 7 years ago

It looks like the JSON returned by the Stanford CoreNLP server does not have the structure that brat expects for ann_data in tag().

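This is a super hacky workaround, but I just commented out the lines in tag() that process the tagger response and replaced them with:

        # json_resp is the parsed CoreNLP response; its values are lists of sentences.
        # (itervalues() is Python 2, which brat 1.3 runs on.)
        for ann_data in json_resp.itervalues():
            for d in ann_data:
                # Each sentence carries a list of tokens with character offsets.
                for elem in d['tokens']:
                    start, end = elem['characterOffsetBegin'], elem['characterOffsetEnd']
                    _type = elem['ner']
                    text = elem['originalText']

                    _id = ann_obj.get_new_id('T')

                    # Note: every token is added, whatever its 'ner' label is.
                    tb = TextBoundAnnotationWithText(((start, end),), _id, _type, text)

                    mods.addition(tb)
                    ann_obj.add_annotation(tb)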

This seemed to work for me.
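
An alternative to patching tag() would be to convert the CoreNLP response into the shape brat asks for before it reaches tag(), for example in a small proxy between the two. The sketch below is only an illustration under assumptions: the function name corenlp_to_brat is hypothetical, the 'sentences' key is assumed from the usual CoreNLP JSON output, and the target keys 'type', 'offsets' and 'texts' are inferred from the assertion in the traceback rather than confirmed against the brat source. The sample data is made up. It also skips tokens labelled 'O', which the workaround above does not.

```python
def corenlp_to_brat(corenlp_resp, skip_label='O'):
    """Convert a CoreNLP-style JSON dict into a brat-tagger-style dict.

    Assumed input shape:  {'sentences': [{'tokens': [{...}, ...]}, ...]}
    Assumed output shape: {'T1': {'type': ..., 'offsets': [[start, end]],
                                  'texts': [text]}, ...}
    """
    brat_resp = {}
    next_id = 1
    for sentence in corenlp_resp.get('sentences', []):
        for token in sentence.get('tokens', []):
            label = token.get('ner', skip_label)
            # Tokens outside any entity carry the skip label; leave them out.
            if label == skip_label:
                continue
            brat_resp['T%d' % next_id] = {
                'type': label,
                'offsets': [[token['characterOffsetBegin'],
                             token['characterOffsetEnd']]],
                'texts': [token['originalText']],
            }
            next_id += 1
    return brat_resp


# Made-up sample in the assumed CoreNLP shape, for illustration only.
sample = {
    'sentences': [{
        'index': 0,
        'tokens': [
            {'originalText': 'Obama', 'characterOffsetBegin': 0,
             'characterOffsetEnd': 5, 'ner': 'PERSON'},
            {'originalText': 'spoke', 'characterOffsetBegin': 6,
             'characterOffsetEnd': 11, 'ner': 'O'},
        ],
    }]
}
converted = corenlp_to_brat(sample)
```

This produces one annotation per token, so a multi-token entity would come out as several adjacent spans; merging them is left out of the sketch.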