Tseng-Yi-Chung commented 3 years ago

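Hi, I have a question. I couldn't find any fine-tuning code in your program; I only see data being passed into the model, and I don't understand how fine-tuning happens in that process. Could you explain? Thank you.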

p208p2002 commented 3 years ago

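Hello, the repo indeed does not include fine-tuning code. The main reason is that huggingface already provides an example, so I didn't reinvent the wheel: https://github.com/huggingface/transformers/tree/master/examples/question-answering You can simply arrange your dataset into SQuAD format and then fine-tune the model.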
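As a rough illustration of "arranging your dataset into SQuAD format", the sketch below builds one record with the fields used by the flat SQuAD v1.1 layout (`id`, `title`, `context`, `question`, `answers`). The helper name `make_squad_example` and the id `demo-0001` are hypothetical, and how the records must be wrapped in a file (for example under a top-level `data` key) depends on the version of the example script, so treat this as a starting point rather than the required layout.

```python
import json


def make_squad_example(example_id, context, question, answer_text, title=""):
    """Build one SQuAD v1.1-style record (hypothetical helper, not from the repo)."""
    # SQuAD stores the answer position as a character offset into the context,
    # not as a token index.
    answer_start = context.find(answer_text)
    if answer_start < 0:
        raise ValueError("answer_text must appear verbatim in context")
    return {
        "id": example_id,
        "title": title,
        "context": context,
        "question": question,
        "answers": {"text": [answer_text], "answer_start": [answer_start]},
    }


example = make_squad_example(
    "demo-0001",
    "大同國小有三個職員，王大明是校長，張小美是秘書，陳小玉是總務長",
    "誰是校長",
    "王大明",
)
print(json.dumps(example, ensure_ascii=False, indent=2))
```

The key constraint is that the answer text must occur verbatim in the context, because the start and end token labels used in training are derived from that character offset.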

Also, if you only need a simple production deployment, I recommend taking a look at pipelines; huggingface packages everything nicely now: https://github.com/huggingface/transformers/blob/master/notebooks/03-pipelines.ipynb

If anything is unclear or you would like to discuss this in more depth, feel free to email me.

Philip

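On Thursday, December 17, 2020 at 4:55 PM, Tseng-Yi-Chung notifications@github.com wrote:

Hi, I have a question. I couldn't find any fine-tuning code in your program; I only see data being passed into the model, and I don't understand how fine-tuning happens in that process. Could you explain? Thank you.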


p208p2002 commented 3 years ago

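If you want to build fine-tuning from scratch, you can refer to another repo of mine. That repo actually does sentence classification, but it can still help you understand how fine-tuning is done: https://github.com/p208p2002/taipei-QA-BERT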

Wiwi30795 commented 2 years ago

Hello, sorry to bother you with a question. The huggingface QA guide here: https://huggingface.co/docs/transformers/tasks/question_answering uses DistilBERT ( https://huggingface.co/distilbert-base-uncased ). Would it work to replace it with bert-base-chinese?

p208p2002 commented 2 years ago

A1. You can refer to this: https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering

A2. It should be fine; the two models have the same architecture and special tokens.

Wiwi30795 commented 2 years ago

OK! Thank you for your reply~

XiaoZhong77 commented 2 years ago

Sorry, I'd like to ask: when I run pip install -r requirements.txt, I get ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'. How can I fix this?

p208p2002 commented 2 years ago

@XiaoZhong77 Sorry, the file was misnamed. I have just fixed it; pull or clone again and it should work.

XiaoZhong77 commented 2 years ago

Thank you for your answer~~

XiaoZhong77 commented 2 years ago

Sorry, one more question: when I run it, the result is AttributeError: 'str' object has no attribute 'detach'. How can I fix this?

p208p2002 commented 2 years ago

If possible, please attach the minimal code that reproduces the error, along with:

python version
transformers version (package)
torch version (package)
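One way to gather the three versions requested above in a single step is sketched below. It reads installed package metadata from the standard library instead of importing the packages, so it also runs when one of them is missing; the helper name `collect_versions` is hypothetical.

```python
import platform
from importlib import metadata


def collect_versions(packages=("transformers", "torch")):
    """Return the Python version plus the installed version of each package."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report the gap instead of raising, so the output is still usable.
            versions[name] = "not installed"
    return versions


for name, version in collect_versions().items():
    print(f"{name}: {version}")
```

Pasting this output into a bug report avoids mixing up the version of one tool (for example pip) with the version of the package in question.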

XiaoZhong77 commented 2 years ago

    # init BertQA
    bertQA = BertQA(model=model, tokenizer=tokenizer, device=device)
    context = "大同國小有三個職員，王大明是校長，張小美是秘書，陳小玉是總務長"

    question = "誰是校長"
    answer_results = bertQA.ask(context, question)
    # score:2.17795 start_index:11(1.07034) end_index:13(1.10761) answer:王大明

    question = "陳小玉的工作是什麼"
    answer_results = bertQA.ask(context, question)
    # score:2.07151 start_index:29(1.84568) end_index:31(0.22583) answer:總務長

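I took this snippet that you posted on the repo front page and put it into example.py, but I'm not sure whether that is the correct way to use it.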

python version 3.10.8 64-bit
transformer version 22.2.2
torch version 1.12.1+cpu

p208p2002 commented 2 years ago

This project has not been maintained for a long time, so please use transformers==2.5.1. I just tested it and it works.

I also just made a small update to the scoring method. It originally used start_logit+end_logit directly, but using start_prob*end_prob is the better approach.
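To make the difference between the two scores concrete, here is a minimal sketch in plain Python. It assumes the probabilities come from a softmax over the start logits and over the end logits separately; the logit values are made up for illustration and the function names are hypothetical, not taken from the repo.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def span_scores(start_logits, end_logits, start, end):
    """Return (logit-sum score, probability-product score) for one span."""
    logit_sum = start_logits[start] + end_logits[end]
    prob_product = softmax(start_logits)[start] * softmax(end_logits)[end]
    return logit_sum, prob_product


# Hypothetical logits for a 5-token input.
start_logits = [0.1, 2.0, -1.0, 0.3, -0.5]
end_logits = [-0.2, 0.4, 1.8, 0.0, -1.1]

logit_sum, prob_product = span_scores(start_logits, end_logits, start=1, end=2)
print(f"logit sum: {logit_sum:.5f}")
print(f"prob product: {prob_product:.5f}")
```

Within a single input the two scores rank spans identically, since the product of softmax values is a monotone function of the logit sum. One likely reason the product is preferable is that it always lies in [0, 1], so scores stay comparable across different questions and contexts.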

p208p2002 commented 2 years ago

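Also, transformers now offers pipeline, which is a better way to use a QA model; I strongly recommend taking a look: https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.QuestionAnsweringPipeline This project is mainly a minimal demonstration for understanding how the answer span gets picked. If you have any questions, feel free to bring them up for discussion.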
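For readers curious about how an answer span gets picked, the sketch below shows one common approach: score every valid (start, end) pair and keep the best one. It is not the repo's actual code. The probabilities and tokens are made up, and `pick_best_span` and `max_answer_len` are hypothetical names.

```python
def pick_best_span(start_probs, end_probs, max_answer_len=10):
    """Return (start, end, score) maximizing start_prob * end_prob.

    Only spans with start <= end and at most max_answer_len tokens are considered.
    """
    best = (0, 0, 0.0)
    for i, start_p in enumerate(start_probs):
        last = min(i + max_answer_len, len(end_probs))
        for j in range(i, last):
            score = start_p * end_probs[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Hypothetical per-token probabilities for a 10-token input.
tokens = ["誰", "是", "校", "長", "王", "大", "明", "是", "校", "長"]
start_probs = [0.02, 0.02, 0.02, 0.02, 0.80, 0.04, 0.02, 0.02, 0.02, 0.02]
end_probs = [0.02, 0.02, 0.02, 0.02, 0.03, 0.05, 0.78, 0.02, 0.02, 0.02]

start, end, score = pick_best_span(start_probs, end_probs)
answer = "".join(tokens[start:end + 1])
print(start, end, answer)
```

The `start <= end` constraint matters: taking the argmax of the start and end distributions independently can yield an end position before the start, which is not a valid span.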

XiaoZhong77 commented 2 years ago

OK, I'll try again. Thanks again for your reply~

p208p2002 / bert-question-answer

A question about fine-tuning #1
