PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

Question about training on English data in Section 4.3 (model fine-tuning) #3345

Closed TSAIJK closed 1 year ago

TSAIJK commented 2 years ago

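Problem description

In Section 4.3 (model fine-tuning), the model option selects the model to fine-tune. The available choices are uie-base, uie-medium, uie-mini, uie-micro and uie-nano, and the default is uie-base. If I need to train on English data, can I still use uie-base? (According to Section 3.7, base and base-en are different, but the code in finetune.py has not been updated yet and does not support uie-base-en.) Thanks for your answer!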

linjieccc commented 2 years ago

Hello, this feature has not been merged into the develop branch yet. For now you can refer to PR #3227 and modify your code accordingly.

TSAIJK commented 2 years ago

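A follow-up question: I am doing custom training on top of uie-base-en with the changes from PR #3227 applied. How should I handle the error below? Addendum: my initial assessment is that the problem is in doccano.py, because fine-tuning on a dataset split with the released version of doccano.py did not raise an error, which suggests that the data format is correct and compatible with the downstream code.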

(paddleuie_py38) [cjk@server02 02TryDevelopEnglish]$ python doccano.py --doccano_file ./data/doccano_ext.json --save_dir ./data --splits 0.8 0.2 0 --task_type ext --schema_lang en
/home/cjk/anaconda3/envs/paddleuie_py38/lib/python3.8/site-packages/paddlenlp/transformers/image_utils.py:213: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  resample=Image.BILINEAR,
/home/cjk/anaconda3/envs/paddleuie_py38/lib/python3.8/site-packages/paddlenlp/transformers/image_utils.py:379: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  resample=Image.NEAREST,
/home/cjk/anaconda3/envs/paddleuie_py38/lib/python3.8/site-packages/paddlenlp/transformers/ernie_vil/feature_extraction.py:65: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  resample=Image.BICUBIC,
/home/cjk/anaconda3/envs/paddleuie_py38/lib/python3.8/site-packages/paddlenlp/transformers/clip/feature_extraction.py:64: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  resample=Image.BICUBIC,
[2022-09-23 10:29:58,203] [ INFO] - Converting doccano data...
100%|██████████| 4/4 [00:00<00:00, 15238.16it/s]
[2022-09-23 10:29:58,204] [ INFO] - Adding negative samples for first stage prompt...
100%|██████████| 4/4 [00:00<00:00, 126144.48it/s]
[2022-09-23 10:29:58,205] [ INFO] - Adding negative samples for second stage prompt...
  0%| | 0/4 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "doccano.py", line 179, in <module>
    do_convert()
  File "doccano.py", line 121, in do_convert
    train_examples = _create_ext_examples(raw_examples[:p1],
  File "doccano.py", line 60, in _create_ext_examples
    entities, relations, aspects = convert_ext_examples(
  File "/home/cjk/CodeStore/test202207/02TryDevelopEnglish/utils.py", line 696, in convert_ext_examples
    redundants2 = [
  File "/home/cjk/CodeStore/test202207/02TryDevelopEnglish/utils.py", line 697, in <listcomp>
    predicate_list[i][random.randrange(
  File "/home/cjk/anaconda3/envs/paddleuie_py38/lib/python3.8/random.py", line 216, in randrange
    raise ValueError("empty range for randrange()")
ValueError: empty range for randrange()

linjieccc commented 2 years ago

@TSAIJK A check for whether the entity and relation categories are empty needs to be added here. You can update your code by following this commit: https://github.com/linjieccc/PaddleNLP/commit/6b83bd72f5a62d25298995e148b6c95688e1885c
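The traceback earlier in this thread ends in random.randrange raising ValueError, which is what happens when the sequence being indexed is empty. Below is a minimal sketch of the kind of guard described in this reply. The helper name sample_negatives is hypothetical and exists only for illustration; the actual change in the linked commit may be structured differently.

```python
import random


def sample_negatives(candidates, k, rng=random):
    """Pick k random items from candidates, or return [] when there are none.

    Hypothetical helper for illustration only; this is not the code in utils.py.
    """
    if not candidates:
        # rng.randrange(0) raises ValueError (the error seen in the traceback),
        # so skip sampling when no entity or relation categories are available.
        return []
    return [candidates[rng.randrange(len(candidates))] for _ in range(k)]


print(sample_negatives([], 3))
print(len(sample_negatives(["Aspect", "Opinion"], 3)))
```

With the guard in place, a dataset that has entity labels but no relation labels simply produces no negative samples for that stage instead of aborting the conversion.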

PanZheng-2021 commented 2 years ago

When fine-tuning with uie-base-en, precision, recall and F1 are all very low, roughly 0.1 to 0.2. [image] [image]

The training dataset is as follows:

{"content": "Breakfast was also was a perfect simple execution with fresh fruit", "result_list": [{"text": "perfect ", "start": 25, "end": 33}], "prompt": "Opinoin"}
{"content": "Attentive staff and decent rooms.", "result_list": [{"text": "Attentive", "start": 0, "end": 9}], "prompt": "Opinoin"}
{"content": "The lobby was huge, lots of seating, but not cozy.", "result_list": [{"text": "lobby", "start": 4, "end": 9}], "prompt": "Aspect"}
{"content": "New shower with marble tile and high shower head for tall people", "result_list": [{"text": "New ", "start": 0, "end": 4}], "prompt": "Opinoin"}
{"content": "Location is good", "result_list": [{"text": "good", "start": 12, "end": 16}], "prompt": "Opinoin"}
{"content": "Breakfast was very good with enough healthier options", "result_list": [{"text": "very good", "start": 14, "end": 23}], "prompt": "Opinoin"}
{"content": "A small cafe is available for breakfast next door", "result_list": [{"text": "cafe ", "start": 8, "end": 13}], "prompt": "Aspect"}
{"content": "Great convenient location with easy access to Chinatown and Fisherman’s Wharf.", "result_list": [{"text": "Great convenient", "start": 0, "end": 16}], "prompt": "Opinoin"}
{"content": "This hotel is perfectly located for Fisherman's Wharf", "result_list": [{"text": "perfectly located", "start": 14, "end": 31}], "prompt": "Opinoin"}
{"content": "The neighborhood was bad, lots of crime every where.", "result_list": [{"text": "neighborhood ", "start": 4, "end": 17}], "prompt": "Aspect"}
{"content": "Location is good", "result_list": [{"text": "Location ", "start": 0, "end": 9}], "prompt": "Aspect"}
{"content": "The location is great, its a very convenient area right in the middle of everything.", "result_list": [{"text": "location ", "start": 4, "end": 13}], "prompt": "Aspect"}
{"content": "They said all the bed is soft the only way is to add a hard bord under the mattress.", "result_list": [{"text": "bed ", "start": 18, "end": 22}], "prompt": "Aspect"}
{"content": "The hotel is very central,", "result_list": [{"text": "hotel ", "start": 4, "end": 10}], "prompt": "Aspect"}
{"content": "There were many nice restaurants", "result_list": [{"text": "restaurants", "start": 21, "end": 32}], "prompt": "Aspect"}
{"content": "The rooms are VERY small", "result_list": [{"text": "rooms ", "start": 4, "end": 10}], "prompt": "Aspect"}
{"content": "Great convenient location with easy access to Chinatown and Fisherman’s Wharf.", "result_list": [{"text": "location ", "start": 17, "end": 26}], "prompt": "Aspect"}
{"content": "staff very friendly", "result_list": [{"text": "staff ", "start": 0, "end": 6}], "prompt": "Aspect"}
{"content": "Rooms were comfortable", "result_list": [{"text": "Rooms ", "start": 0, "end": 6}], "prompt": "Aspect"}
{"content": "beds were great", "result_list": [{"text": "beds ", "start": 0, "end": 5}], "prompt": "Aspect"}
{"content": "serves good Italian food and a decent breakfast is reasonable prices.", "result_list": [{"text": "serves good Italian food and a decent breakfast", "start": 0, "end": 47}], "prompt": "Aspect"}
{"content": "They said all the bed is soft the only way is to add a hard bord under the mattress.", "result_list": [{"text": "soft ", "start": 25, "end": 30}], "prompt": "Opinoin"}
{"content": "liked the eclectic appointed lobby with its charming decor.", "result_list": [{"text": "eclectic appointed", "start": 10, "end": 28}], "prompt": "Opinoin"}
{"content": "beds were great", "result_list": [{"text": "great", "start": 10, "end": 15}], "prompt": "Opinoin"}
{"content": "Location was excellent", "result_list": [{"text": "Location ", "start": 0, "end": 9}], "prompt": "Aspect"}
{"content": "I love this hotel,", "result_list": [{"text": "I", "start": 0, "end": 1}], "prompt": "Aspect"}
{"content": "I stayed there", "result_list": [{"text": "stayed there", "start": 2, "end": 14}], "prompt": "Opinoin"}
{"content": "The hotel is well located for all the nearby attractions", "result_list": [{"text": "hotel ", "start": 4, "end": 10}], "prompt": "Aspect"}
{"content": "Property was very nice, rooms were amazing, staff was friendly.", "result_list": [{"text": "very nice", "start": 13, "end": 22}], "prompt": "Opinoin"}
{"content": "Check in was easy", "result_list": [{"text": "Check in", "start": 0, "end": 8}], "prompt": "Aspect"}
{"content": "Check in was easy", "result_list": [{"text": "easy", "start": 13, "end": 17}], "prompt": "Opinoin"}
{"content": "I got a great rate through hotels.", "result_list": [{"text": "I", "start": 0, "end": 1}], "prompt": "Aspect"}
{"content": "The location is perfectly situated for sightseeing, easily to be found and not too far from the airport.", "result_list": [{"text": "perfectly situated", "start": 16, "end": 34}], "prompt": "Opinoin"}
{"content": "the breakfast is just a choice", "result_list": [{"text": "a choice", "start": 22, "end": 30}], "prompt": "Opinoin"}
{"content": "staff very friendly", "result_list": [{"text": "very friendly", "start": 6, "end": 19}], "prompt": "Opinoin"}
{"content": "Rooms were comfortable", "result_list": [{"text": "comfortable", "start": 11, "end": 22}], "prompt": "Opinoin"}
{"content": "The hotel close to the transport services.", "result_list": [{"text": "hotel", "start": 4, "end": 9}], "prompt": "Aspect"}
{"content": "Breakfast was also was a perfect simple execution with fresh fruit", "result_list": [{"text": "Breakfast ", "start": 0, "end": 10}], "prompt": "Aspect"}
{"content": "The rooms are VERY small", "result_list": [{"text": "VERY small", "start": 14, "end": 24}], "prompt": "Opinoin"}
{"content": "Hallway is narrow and very low ceiling", "result_list": [{"text": "Hallway ", "start": 0, "end": 8}], "prompt": "Aspect"}
{"content": "New shower with marble tile and high shower head for tall people", "result_list": [{"text": "shower ", "start": 4, "end": 11}], "prompt": "Aspect"}
{"content": "The room was very good, plenty of room, did not have that cramped in feeling.", "result_list": [{"text": "very good", "start": 13, "end": 22}], "prompt": "Opinoin"}
{"content": "The room was very good, plenty of room, did not have that cramped in feeling.", "result_list": [{"text": "room ", "start": 4, "end": 9}], "prompt": "Aspect"}
{"content": "There were many nice restaurants", "result_list": [{"text": "nice", "start": 16, "end": 20}], "prompt": "Opinoin"}
{"content": "Although this hotel is conveniently located being right at the midst of Chinatown, the room is terribly small.", "result_list": [{"text": "conveniently located", "start": 23, "end": 43}], "prompt": "Opinoin"}
{"content": "The staff was so rigid on following policies that they had really poor", "result_list": [{"text": "so rigid", "start": 14, "end": 22}], "prompt": "Opinoin"}
{"content": "liked the eclectic appointed lobby with its charming decor.", "result_list": [{"text": "lobby ", "start": 29, "end": 35}], "prompt": "Aspect"}
{"content": "The staff was so rigid on following policies that they had really poor", "result_list": [{"text": "staff ", "start": 4, "end": 10}], "prompt": "Aspect"}
{"content": "Attentive staff and decent rooms.", "result_list": [{"text": "staff ", "start": 10, "end": 16}], "prompt": "Aspect"}
{"content": "The location is great, its a very convenient area right in the middle of everything.", "result_list": [{"text": "great", "start": 16, "end": 21}], "prompt": "Opinoin"}
{"content": "Breakfast was very good with enough healthier options", "result_list": [{"text": "Breakfast ", "start": 0, "end": 10}], "prompt": "Aspect"}
{"content": "Large televisions with great channel selection.", "result_list": [{"text": "great channel selection", "start": 23, "end": 46}], "prompt": "Opinoin"}
{"content": "Restaurant is attached to hotel", "result_list": [{"text": "Restaurant ", "start": 0, "end": 11}], "prompt": "Aspect"}
{"content": "Rooms are furnished nicely", "result_list": [{"text": "furnished nicely", "start": 10, "end": 26}], "prompt": "Opinoin"}
{"content": "Hallway is narrow and very low ceiling", "result_list": [{"text": "narrow and very low ceiling", "start": 11, "end": 38}], "prompt": "Opinoin"}
{"content": "hotel staff were friendly and helpful", "result_list": [{"text": "friendly ", "start": 17, "end": 26}], "prompt": "Opinoin"}
{"content": "The hotel is very central,", "result_list": [{"text": "very central", "start": 13, "end": 25}], "prompt": "Opinoin"}
{"content": "Rooms are furnished nicely", "result_list": [{"text": "Rooms ", "start": 0, "end": 6}], "prompt": "Aspect"}
{"content": "a nice option for breakfast", "result_list": [{"text": "breakfast", "start": 18, "end": 27}], "prompt": "Aspect"}
{"content": "The lobby was huge, lots of seating, but not cozy.", "result_list": [{"text": "huge", "start": 14, "end": 18}], "prompt": "Opinoin"}
{"content": "The neighborhood was bad, lots of crime every where.", "result_list": [{"text": "bad", "start": 21, "end": 24}], "prompt": "Opinoin"}
{"content": "This hotel is perfectly located for Fisherman's Wharf", "result_list": [{"text": "hotel ", "start": 5, "end": 11}], "prompt": "Aspect"}
{"content": "I stayed there", "result_list": [{"text": "I", "start": 0, "end": 1}], "prompt": "Aspect"}
{"content": "Although this hotel is conveniently located being right at the midst of Chinatown, the room is terribly small.", "result_list": [{"text": "hotel ", "start": 14, "end": 20}], "prompt": "Aspect"}
{"content": "the breakfast is just a choice", "result_list": [{"text": "breakfast ", "start": 4, "end": 14}], "prompt": "Aspect"}
{"content": "The hotel is well located for all the nearby attractions", "result_list": [{"text": "well located", "start": 13, "end": 25}], "prompt": "Opinoin"}
{"content": "I got a great rate through hotels.", "result_list": [{"text": "great", "start": 8, "end": 13}], "prompt": "Opinoin"}
{"content": "Location was excellent", "result_list": [{"text": "excellent", "start": 13, "end": 22}], "prompt": "Opinoin"}
{"content": "serves good Italian food and a decent breakfast is reasonable prices.", "result_list": [{"text": "reasonable prices", "start": 51, "end": 68}], "prompt": "Opinoin"}
{"content": "it's easy to access the city from there.", "result_list": [{"text": "it", "start": 0, "end": 2}], "prompt": "Aspect"}
{"content": "Property was very nice, rooms were amazing, staff was friendly.", "result_list": [{"text": "Property ", "start": 0, "end": 9}], "prompt": "Aspect"}
{"content": "a nice option for breakfast", "result_list": [{"text": "nice option", "start": 2, "end": 13}], "prompt": "Opinoin"}
{"content": "The hotel close to the transport services.", "result_list": [{"text": "close", "start": 10, "end": 15}], "prompt": "Opinoin"}
{"content": "The location is perfectly situated for sightseeing, easily to be found and not too far from the airport.", "result_list": [{"text": "location ", "start": 4, "end": 13}], "prompt": "Aspect"}
{"content": "Large televisions with great channel selection.", "result_list": [{"text": "televisions ", "start": 6, "end": 18}], "prompt": "Aspect"}
{"content": "hotel staff were friendly and helpful", "result_list": [{"text": "staff ", "start": 6, "end": 12}], "prompt": "Aspect"}
{"content": "A small cafe is available for breakfast next door", "result_list": [{"text": "small ", "start": 2, "end": 8}], "prompt": "Opinoin"}
{"content": "I love this hotel,", "result_list": [{"text": "love this hotel", "start": 2, "end": 17}], "prompt": "Opinoin"}
{"content": "Restaurant is attached to hotel", "result_list": [{"text": "attached", "start": 14, "end": 22}], "prompt": "Opinoin"}
{"content": "it's easy to access the city from there.", "result_list": [{"text": "easy ", "start": 5, "end": 10}], "prompt": "Opinoin"}
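One inexpensive thing to rule out when scores are this low is annotation noise. The sketch below checks each record of the format shown above for two things: that content[start:end] really equals the labeled text, and that the labeled span has no leading or trailing whitespace (several spans in this dataset, such as "Location ", end in a space). The function name check_record is an assumption made for this example, and the thread does not establish whether such whitespace affects the reported scores.

```python
import json


def check_record(line):
    """Return a list of human-readable problems found in one JSON record."""
    record = json.loads(line)
    content = record["content"]
    problems = []
    for span in record["result_list"]:
        text, start, end = span["text"], span["start"], span["end"]
        # The character offsets should reproduce the labeled text exactly.
        if content[start:end] != text:
            problems.append(f"offset mismatch: {content[start:end]!r} != {text!r}")
        # Flag spans that carry whitespace at either edge.
        if text != text.strip():
            problems.append(f"span has leading/trailing whitespace: {text!r}")
    return problems


# One record copied from the dataset above.
sample = (
    '{"content": "Location is good", '
    '"result_list": [{"text": "Location ", "start": 0, "end": 9}], '
    '"prompt": "Aspect"}'
)
for problem in check_record(sample):
    print(problem)
```

Running the same check over every line of the training file gives a quick count of how many spans would need trimming before re-exporting the annotations.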

TSAIJK commented 2 years ago

Quoting PanZheng-2021: When fine-tuning with uie-base-en, precision, recall and F1 are all very low, roughly 0.1 to 0.2.


My best F1 is 0.34483. My current plan is to increase the number of annotations and run the experiment again.
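For readers comparing the numbers in this thread, here is a minimal sketch of how span-level precision, recall and F1 are commonly computed, assuming exact-match scoring on (start, end) offsets. That scoring rule is an assumption of this example and is not necessarily what the evaluation script used in this thread does.

```python
def span_prf(gold, pred):
    """Exact-match precision, recall and F1 over sets of (start, end) spans."""
    gold, pred = set(gold), set(pred)
    correct = len(gold & pred)  # spans whose boundaries match exactly
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Two gold spans; one prediction is off by a single character at the end.
gold = [(0, 9), (12, 16)]
pred = [(0, 8), (12, 16)]
print(span_prf(gold, pred))
```

Under exact matching, a prediction that differs from the annotation by one boundary character counts as both a false positive and a false negative, so small boundary disagreements lower all three scores at once.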

TSAIJK commented 2 years ago

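Quoting linjieccc: @TSAIJK A check for whether the entity and relation categories are empty needs to be added here. You can update your code by following this commit: linjieccc@6b83bd7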

Thanks. After applying the change, everything up to and including fine-tuning runs without errors, but the following error appears in the evaluate stage. It seems to be related to the pretrained model?
[2022-09-30 11:27:35,175] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load './checkpoint/model_best'.
Traceback (most recent call last):
  File "evaluate.py", line 117, in <module>
    do_eval()
  File "evaluate.py", line 62, in do_eval
    model = UIEM.from_pretrained(args.model_path)
  File "/home/cjk/anaconda3/envs/paddleuie_py38/lib/python3.8/site-packages/paddlenlp/transformers/model_utils.py", line 316, in from_pretrained
    assert arg.pop(
AssertionError: pretrained base model should be ErnieMModel

LemonNoel commented 2 years ago

Which model class did you use for fine-tuning? It looks like the structure of the model being loaded does not match.

TSAIJK commented 2 years ago

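Quoting LemonNoel: Which model class did you use for fine-tuning? It looks like the structure of the model being loaded does not match.

I used uie-base-en. After your reminder I checked and found a problem, which was probably a misunderstanding on my part. When fine-tuning with the changes from #3227 applied, I focused on the model structure and noticed lines 49-54: the model I ran is uie-base-en, for which multilingual is false, whereas the modified readme.md uses multilingual by default. The mismatch between the two is probably what caused the error. I am now running an experiment to verify this.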
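The mismatch suspected here is that fine-tuning ran with one multilingual setting and evaluation with another. One generic way to catch this early is to store the setting next to the checkpoint and compare it before loading. The sketch below uses only the standard library; the file name run_config.json and both helper functions are hypothetical and are not part of PaddleNLP or of the scripts discussed in this thread.

```python
import json
import os
import tempfile

CONFIG_NAME = "run_config.json"  # hypothetical file name, not part of PaddleNLP


def save_run_config(checkpoint_dir, model_name, multilingual):
    """Record the settings used for fine-tuning next to the checkpoint."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    path = os.path.join(checkpoint_dir, CONFIG_NAME)
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"model": model_name, "multilingual": multilingual}, f)
    return path


def check_eval_flag(checkpoint_dir, multilingual):
    """Raise early if evaluation uses a different flag than fine-tuning did."""
    with open(os.path.join(checkpoint_dir, CONFIG_NAME), encoding="utf-8") as f:
        config = json.load(f)
    if config["multilingual"] != multilingual:
        raise ValueError(
            f"checkpoint was fine-tuned with multilingual={config['multilingual']}, "
            f"but evaluation was started with multilingual={multilingual}"
        )
    return config


with tempfile.TemporaryDirectory() as ckpt:
    save_run_config(ckpt, "uie-base-en", multilingual=False)
    print(check_eval_flag(ckpt, multilingual=False))  # same flag: passes
    try:
        check_eval_flag(ckpt, multilingual=True)  # different flag: rejected
    except ValueError as err:
        print("mismatch detected:", err)
```

A check like this turns a late assertion inside the model loader into an explicit message that names the two conflicting settings.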

jack-gits commented 2 years ago

Quoting PanZheng-2021: When fine-tuning with uie-base-en, precision, recall and F1 are all very low, roughly 0.1 to 0.2. The training dataset is as follows: (the same records as in PanZheng-2021's comment above)
"end": 26}], "prompt": "Opinoin"} {"content": "Hallway is narrow and very low ceiling", "result_list": [{"text": "narrow and very low ceiling", "start": 11, "end": 38}], "prompt": "Opinoin"} {"content": "hotel staff were friendly and helpful", "result_list": [{"text": "friendly ", "start": 17, "end": 26}], "prompt": "Opinoin"} {"content": "The hotel is very central,", "result_list": [{"text": "very central", "start": 13, "end": 25}], "prompt": "Opinoin"} {"content": "Rooms are furnished nicely", "result_list": [{"text": "Rooms ", "start": 0, "end": 6}], "prompt": "Aspect"} {"content": "a nice option for breakfast", "result_list": [{"text": "breakfast", "start": 18, "end": 27}], "prompt": "Aspect"} {"content": "The lobby was huge, lots of seating, but not cozy.", "result_list": [{"text": "huge", "start": 14, "end": 18}], "prompt": "Opinoin"} {"content": "The neighborhood was bad, lots of crime every where.", "result_list": [{"text": "bad", "start": 21, "end": 24}], "prompt": "Opinoin"} {"content": "This hotel is perfectly located for Fisherman's Wharf", "result_list": [{"text": "hotel ", "start": 5, "end": 11}], "prompt": "Aspect"} {"content": "I stayed there", "result_list": [{"text": "I", "start": 0, "end": 1}], "prompt": "Aspect"} {"content": "Although this hotel is conveniently located being right at the midst of Chinatown, the room is terribly small.", "result_list": [{"text": "hotel ", "start": 14, "end": 20}], "prompt": "Aspect"} {"content": "the breakfast is just a choice", "result_list": [{"text": "breakfast ", "start": 4, "end": 14}], "prompt": "Aspect"} {"content": "The hotel is well located for all the nearby attractions", "result_list": [{"text": "well located", "start": 13, "end": 25}], "prompt": "Opinoin"} {"content": "I got a great rate through hotels.", "result_list": [{"text": "great", "start": 8, "end": 13}], "prompt": "Opinoin"} {"content": "Location was excellent", "result_list": [{"text": "excellent", "start": 13, "end": 22}], "prompt": 
"Opinoin"} {"content": "serves good Italian food and a decent breakfast is reasonable prices.", "result_list": [{"text": "reasonable prices", "start": 51, "end": 68}], "prompt": "Opinoin"} {"content": "it's easy to access the city from there.", "result_list": [{"text": "it", "start": 0, "end": 2}], "prompt": "Aspect"} {"content": "Property was very nice, rooms were amazing, staff was friendly.", "result_list": [{"text": "Property ", "start": 0, "end": 9}], "prompt": "Aspect"} {"content": "a nice option for breakfast", "result_list": [{"text": "nice option", "start": 2, "end": 13}], "prompt": "Opinoin"} {"content": "The hotel close to the transport services.", "result_list": [{"text": "close", "start": 10, "end": 15}], "prompt": "Opinoin"} {"content": "The location is perfectly situated for sightseeing, easily to be found and not too far from the airport.", "result_list": [{"text": "location ", "start": 4, "end": 13}], "prompt": "Aspect"} {"content": "Large televisions with great channel selection.", "result_list": [{"text": "televisions ", "start": 6, "end": 18}], "prompt": "Aspect"} {"content": "hotel staff were friendly and helpful", "result_list": [{"text": "staff ", "start": 6, "end": 12}], "prompt": "Aspect"} {"content": "A small cafe is available for breakfast next door", "result_list": [{"text": "small ", "start": 2, "end": 8}], "prompt": "Opinoin"} {"content": "I love this hotel,", "result_list": [{"text": "love this hotel", "start": 2, "end": 17}], "prompt": "Opinoin"} {"content": "Restaurant is attached to hotel", "result_list": [{"text": "attached", "start": 14, "end": 22}], "prompt": "Opinoin"} {"content": "it's easy to access the city from there.", "result_list": [{"text": "easy ", "start": 5, "end": 10}], "prompt": "Opinoin"}
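
Records in this format can be sanity-checked before fine-tuning. The following is a minimal sketch, not part of PaddleNLP: the helper name `check_record` is hypothetical. It assumes each record is one JSON object with `content`, `result_list`, and `prompt` keys, as in the data above. It flags spans whose offsets do not reproduce the annotated text, and spans that carry leading or trailing whitespace (for example `"hotel "`). Whether such whitespace contributes to the low scores is an assumption worth testing, not an established cause.

```python
import json


def check_record(line):
    """Return a list of problems found in one UIE-style training record."""
    record = json.loads(line)
    content = record["content"]
    problems = []
    for span in record["result_list"]:
        text, start, end = span["text"], span["start"], span["end"]
        # The offsets should slice the annotated text out of the content.
        if content[start:end] != text:
            problems.append(
                f"offset mismatch: {text!r} vs {content[start:end]!r}"
            )
        # Spans padded with whitespace may be worth trimming and re-exporting.
        if text != text.strip():
            problems.append(f"whitespace in span: {text!r}")
    return problems


sample = (
    '{"content": "The hotel is very central,", '
    '"result_list": [{"text": "hotel ", "start": 4, "end": 10}], '
    '"prompt": "Aspect"}'
)
print(check_record(sample))
```

Running the helper over every line of the training file gives a quick count of how many annotations would need cleaning before another training run.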

My best F1 is 0.34483. My current plan is to increase the number of annotated samples and run the experiment again.

How much training data did you use?

PanZheng-2021 commented 2 years ago


About 50 manually annotated samples.

TSAIJK commented 2 years ago

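Hello, I'd like to add a follow-up question. For training on English data, is there an official sample test set, or a rough indication of the model's expected performance? I'd like to compare and see roughly where my own results stand. Thanks ≧〔゜゜〕≦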

coolinstar commented 2 years ago

I adjusted the code following this thread and ran: python finetune.py --train_path ./data/train.txt --dev_path ./data/dev.txt --save_dir ./checkpoint --learning_rate 1e-5 --batch_size 16 --max_seq_len 512 --num_epochs 20 --model uie-m-base --seed 1000 --logging_steps 10 --valid_steps 100 --device cpu

It fails with the following error:

[2022-11-02 16:08:44,537] [ INFO] - We are using <class 'paddlenlp.transformers.ernie_m.tokenizer.ErnieMTokenizer'> to load 'uie-m-base'.
Traceback (most recent call last):
  File "C:\Users\study_zone\paddle\finetune.py", line 177, in <module>
    do_train()
  File "C:\Users\study_zone\paddle\finetune.py", line 100, in do_train
    start_prob, end_prob = model(input_ids, pos_ids)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "C:\Users\workspace\vs_workspace\study_zone\paddle\model.py", line 55, in forward
    sequence_output, _ = self.encoder(input_ids=input_ids,
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddlenlp\transformers\ernie_m\modeling.py", line 312, in forward
    encoder_outputs = self.encoder(
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddlenlp\transformers\model_outputs.py", line 164, in _transformer_encoder_fwd
    layer_outputs = mod(
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddlenlp\transformers\model_outputs.py", line 70, in _transformer_encoder_layer_fwd
    attn_outputs = self.self_attn(src, src, src, src_mask, cache)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\nn\layer\transformer.py", line 400, in forward
    q, k, v = self._prepare_qkv(query, key, value, cache)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\nn\layer\transformer.py", line 227, in _prepare_qkv
    q = tensor.reshape(x=q, shape=[0, 0, self.num_heads, self.head_dim])
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\tensor\manipulation.py", line 2139, in reshape
    return paddle.fluid.layers.reshape(x=x, shape=shape, name=name)
  File "C:\Users\anaconda3\envs\padpy39\lib\site-packages\paddle\fluid\layers\nn.py", line 6373, in reshape
    out, _ = _C_ops.reshape2(x, None, 'shape', shape)
ValueError: (InvalidArgument) The 'shape' in ReshapeOp is invalid. The input tensor X'size must be equal to the capacity of 'shape'. But received X's shape = [16, 768], X's size = 12288, 'shape' is [0, 0, 12, 64], the capacity of 'shape' is 9437184.
  [Hint: Expected capacity == in_size, but received capacity:9437184 != in_size:12288.] (at C:\home\workspace\Paddle_release\paddle/fluid/operators/reshape_op.cc:204)
  [operator < reshape2 > error]

Is this caused by how finetune.py runs? Is there a suggested fix? Thanks.
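
The numbers in the error message can be reproduced by hand, which helps narrow down where the shapes go wrong. The sketch below is plain Python and does not call Paddle; `reshape_capacity` is a hypothetical helper that mimics only the documented rule that a `0` in the target shape copies the input dimension at the same index (it does not handle `-1`). The 3-D shape `[16, 512, 768]` is an illustrative assumption taken from `--batch_size 16` and `--max_seq_len 512`, not a value from the log.

```python
from math import prod


def reshape_capacity(input_shape, target_shape):
    """Element count of target_shape after resolving 0 entries.

    A 0 at index i is replaced by input_shape[i]; -1 is not supported here.
    """
    resolved = [
        input_shape[i] if dim == 0 else dim
        for i, dim in enumerate(target_shape)
    ]
    return prod(resolved)


target = [0, 0, 12, 64]  # [batch, seq_len, num_heads, head_dim]

# Shape reported in the error: a 2-D tensor reached self-attention.
got = [16, 768]
print(prod(got), reshape_capacity(got, target))

# Illustrative 3-D input [batch, seq_len, hidden]: sizes match.
expected = [16, 512, 768]
print(prod(expected), reshape_capacity(expected, target))
```

With the 2-D input, the second `0` copies the hidden size 768 instead of a sequence length, which yields the capacity 9437184 quoted in the error. This suggests that the tensor passed into the encoder is missing its sequence dimension, so the arguments given to `model(...)` in finetune.py are a reasonable place to start checking.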

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] commented 1 year ago

This issue was closed because it has been inactive for 14 days since being marked as stale.