PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0
12.17k stars 2.95k forks

[Question]: Recognition problems with UIE relation extraction #4403

Closed soultrans closed 1 year ago

soultrans commented 1 year ago

Please ask your question

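When I call the UIE model for relation extraction, why are there so many false recognitions, and why is the probability of those incorrect results so high?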

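Test sentence: 父亲李四与儿子李山在看电视,而且母亲韩梅正在厨房做饭,一家人其乐融融 (in English: "Father 李四 and son 李山 are watching TV, while mother 韩梅 is cooking in the kitchen; the whole family is happy together")

schema: [{"人物": ['父亲', '母亲', '丈夫', '妻子', '儿子', '女儿']}] (entity type 人物 = person; relations = father, mother, husband, wife, son, daughter)

In the recognition result below, 李四's mother (母亲) is recognized as 韩梅 with probability 0.9915262462628647, and 李山's wife (妻子) is recognized as 韩梅 with probability 0.9984809774814245.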
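
For anyone trying to reproduce this, the call can be sketched roughly as follows. The question does not include the actual code, so this is an assumption: it uses the `Taskflow("information_extraction", ...)` entry point from PaddleNLP with its default UIE checkpoint, and the helper name `run_uie` is made up for illustration.

```python
# Schema and test sentence copied from the question.
SCHEMA = [{"人物": ["父亲", "母亲", "丈夫", "妻子", "儿子", "女儿"]}]
TEXT = "父亲李四与儿子李山在看电视,而且母亲韩梅正在厨房做饭,一家人其乐融融"


def run_uie(text, schema=SCHEMA):
    """Hypothetical helper: run UIE relation extraction through Taskflow."""
    # Imported lazily so this file can be loaded without paddlenlp installed.
    from paddlenlp import Taskflow

    # Assumption: default UIE checkpoint; the question does not say
    # which model (or fine-tuned checkpoint) was actually used.
    ie = Taskflow("information_extraction", schema=schema)
    return ie(text)
```

Calling `run_uie(TEXT)` downloads the model on first use and returns a list with one dict per input text, in the same shape as the result in this post.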

Recognition result:

[
    {
        "人物": [
            {
                "end": 4,
                "probability": 0.9999326478743775,
                "relations": {
                    "儿子": [
                        {
                            "end": 9,
                            "probability": 0.9996751775001549,
                            "start": 7,
                            "text": "李山"
                        }
                    ],
                    "女儿": [
                        {
                            "end": 9,
                            "probability": 0.9998128501670855,
                            "start": 7,
                            "text": "李山"
                        }
                    ],
                    "妻子": [
                        {
                            "end": 20,
                            "probability": 0.9998102277745851,
                            "start": 18,
                            "text": "韩梅"
                        }
                    ],
                    "母亲": [
                        {
                            "end": 20,
                            "probability": 0.9915262462628647,
                            "start": 18,
                            "text": "韩梅"
                        }
                    ]
                },
                "start": 2,
                "text": "李四"
            },
            {
                "end": 20,
                "probability": 0.9999133962620022,
                "relations": {
                    "丈夫": [
                        {
                            "end": 4,
                            "probability": 0.9997426409088916,
                            "start": 2,
                            "text": "李四"
                        }
                    ],
                    "儿子": [
                        {
                            "end": 9,
                            "probability": 0.9997661244497458,
                            "start": 7,
                            "text": "李山"
                        }
                    ],
                    "女儿": [
                        {
                            "end": 9,
                            "probability": 0.9993592294091371,
                            "start": 7,
                            "text": "李山"
                        }
                    ],
                    "父亲": [
                        {
                            "end": 4,
                            "probability": 0.9997478872504928,
                            "start": 2,
                            "text": "李四"
                        }
                    ]
                },
                "start": 18,
                "text": "韩梅"
            },
            {
                "end": 9,
                "probability": 0.9999012970810917,
                "relations": {
                    "丈夫": [
                        {
                            "end": 4,
                            "probability": 0.3257814427944865,
                            "start": 2,
                            "text": "李四"
                        }
                    ],
                    "妻子": [
                        {
                            "end": 20,
                            "probability": 0.9984809774814245,
                            "start": 18,
                            "text": "韩梅"
                        }
                    ],
                    "母亲": [
                        {
                            "end": 20,
                            "probability": 0.9998192862494051,
                            "start": 18,
                            "text": "韩梅"
                        }
                    ],
                    "父亲": [
                        {
                            "end": 4,
                            "probability": 0.9997316756284427,
                            "start": 2,
                            "text": "李四"
                        }
                    ]
                },
                "start": 7,
                "text": "李山"
            }
        ]
    }
]
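
One way to see the problem at a glance is to look for subject/object pairs that receive more than one relation label, since in this schema a person cannot be, for example, both the son and the daughter of the same subject. The sketch below is only a post-processing aid, not part of PaddleNLP; `find_conflicts` is a hypothetical helper, and `RESULT` is just the 李四 entry from the output above with the offsets left out.

```python
from collections import defaultdict

# Excerpt of the result above (李四 entry only, start/end omitted).
RESULT = [{
    "人物": [{
        "text": "李四",
        "relations": {
            "儿子": [{"text": "李山", "probability": 0.9996751775001549}],
            "女儿": [{"text": "李山", "probability": 0.9998128501670855}],
            "妻子": [{"text": "韩梅", "probability": 0.9998102277745851}],
            "母亲": [{"text": "韩梅", "probability": 0.9915262462628647}],
        },
    }]
}]


def find_conflicts(result, entity_type="人物"):
    """Return (subject, object) pairs that were given more than one relation."""
    labels = defaultdict(list)
    for doc in result:
        for subject in doc.get(entity_type, []):
            for relation, objects in subject.get("relations", {}).items():
                for obj in objects:
                    pair = (subject["text"], obj["text"])
                    labels[pair].append((relation, obj["probability"]))
    # Keep only the pairs with two or more competing relation labels.
    return {pair: rels for pair, rels in labels.items() if len(rels) > 1}


for pair, rels in find_conflicts(RESULT).items():
    print(pair, rels)
```

On this excerpt both pairs, (李四, 李山) and (李四, 韩梅), are flagged, and every competing label has a probability above 0.99, so a simple probability threshold would not filter them out.
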
linjieccc commented 1 year ago

@soultrans Try setting negative_ratio to a larger value when converting the data with doccano.py.
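
To illustrate why a larger negative_ratio might help, here is a simplified sketch of the idea: for each subject, every "subject的relation" prompt with no gold answer is a candidate negative example, and the ratio caps how many of those are kept relative to the positives. This is an assumption-laden illustration, not the code in doccano.py; the function name, the plain-string `result_list`, and the hand-written gold labels for the test sentence are all made up for this example.

```python
import random

PREDICATES = ["父亲", "母亲", "丈夫", "妻子", "儿子", "女儿"]


def build_relation_examples(text, subjects, gold, negative_ratio, seed=0):
    """Split prompts into positives and sampled negatives (illustrative only).

    gold maps (subject, predicate) -> list of correct object strings.
    """
    positives, candidates = [], []
    for subject in subjects:
        for predicate in PREDICATES:
            answers = gold.get((subject, predicate), [])
            example = {
                "content": text,
                "prompt": f"{subject}的{predicate}",
                "result_list": answers,  # empty list marks a negative example
            }
            (positives if answers else candidates).append(example)
    # Keep at most negative_ratio negatives per positive example.
    limit = min(len(candidates), negative_ratio * len(positives))
    negatives = random.Random(seed).sample(candidates, limit)
    return positives, negatives


# Hand-written gold labels for the test sentence (an assumption, for illustration).
GOLD = {
    ("李四", "儿子"): ["李山"], ("李四", "妻子"): ["韩梅"],
    ("韩梅", "儿子"): ["李山"], ("韩梅", "丈夫"): ["李四"],
    ("李山", "父亲"): ["李四"], ("李山", "母亲"): ["韩梅"],
}
TEXT = "父亲李四与儿子李山在看电视,而且母亲韩梅正在厨房做饭,一家人其乐融融"

pos, neg = build_relation_examples(TEXT, ["李四", "韩梅", "李山"], GOLD, negative_ratio=1)
print(len(pos), len(neg))
```

With a higher ratio, more prompts such as "李四的母亲" or "李山的妻子" end up in the training data with an empty answer, which is presumably what teaches the fine-tuned model not to return 韩梅 for them.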