HumanSignal / label-studio-ml-backend

Configs and boilerplates for Label Studio's Machine Learning backend
Apache License 2.0
562 stars 259 forks

example->ner->ner.py always returns text='...' and start=0 #194

Open gitgoready opened 1 year ago

gitgoready commented 1 year ago

I use the ml-backend and the pretrained bert-base-chinese model to train a NER model. The model trains fine, but when I use it to predict, it always returns text='...' and start=0, even though the input string doesn't contain '...' at all. When I look into the code, the returned text is hard-coded to '...'.

What's wrong?

Any help is appreciated!

gitgoready commented 1 year ago

            for label, group in groupby(zip(preds, starts, scores), key=lambda i: re.sub('^(B-|I-)', '', i[0])):
                _, group_start, _ = list(group)[0]
                if len(result) > 0:
                    if group_start == 0:
                        result.pop(-1)
                    else:
                        result[-1]['value']['end'] = group_start - 1
                if label != 'O':
                    result.append({
                        'from_name': from_name,
                        'to_name': to_name,
                        'type': 'labels',
                        'value': {
                            'labels': [label],
                            'start': group_start,
                            'end': None, 
                            'text': '...'
                        }
                    })

KonstantinKorotaev commented 1 year ago

Hi @gitgoready, you can extract the data from your task directly:

def predict(self, tasks, **kwargs):

Inside predict, extract the data from each task and fill in the text from there instead of leaving the '...' placeholder.
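
A minimal sketch of that suggestion, not the backend's actual code: once the spans are built, slice the real substring out of the task text. The helper name fill_span_text and the 'text' key under task['data'] are assumptions (the key depends on your labeling config). The sketch also assumes that start and end are character offsets with an exclusive end, and that a span left with end=None runs to the end of the text.

```python
def fill_span_text(results, text):
    """Replace the '...' placeholder with the actual substring of each span.

    Hypothetical helper: assumes 'start'/'end' are character offsets into
    `text`, with 'end' exclusive.
    """
    for item in results:
        value = item['value']
        start = value['start']
        # The last span may still have end=None; assume it runs to the end of the text.
        end = value['end'] if value['end'] is not None else len(text)
        value['end'] = end
        value['text'] = text[start:end]
    return results


# Hypothetical task in the shape passed to predict(); the 'text' key is an
# assumption and must match the variable used in your labeling config.
task = {'data': {'text': 'Alice visited Paris'}}
results = [
    {'type': 'labels', 'value': {'labels': ['PER'], 'start': 0, 'end': 5, 'text': '...'}},
    {'type': 'labels', 'value': {'labels': ['LOC'], 'start': 14, 'end': None, 'text': '...'}},
]
filled = fill_span_text(results, task['data']['text'])
```

If your offsets are token-based rather than character-based, map them to character positions before slicing.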