Closed RKorzeniowski closed 3 years ago
Yah if you want to make a PR go for it.
The project is built on nbdev, so the process for developing and submitting PRs is the same as for libraries like fastai. See https://docs.fast.ai/dev-setup.
In particular, make sure you run nbdev_install_git_hooks right after you git clone the library. If you want to add some tests, that would be great too. Check out the nbdev docs for how to do that and how to work on any project based on it: https://nbdev.fast.ai/.
Thanks and lmk if you have any questions.
-wg
On Sun, Nov 8, 2020 at 12:59 AM RKorzeniowski notifications@github.com wrote:
Hi, very cool lib. Just wanted to say that the pre_process_squad function (https://github.com/ohmeow/blurr/blob/master/blurr/data/question_answering.py) is not working correctly when following the docs (https://ohmeow.github.io/blurr/modeling-question-answering/). There are two problems when huggingface datasets (the updated nlp package) is used like this: nlp.load_dataset('squad_v2') (https://huggingface.co/docs/datasets/package_reference/loading_methods.html).
- column names differ, to be exact "answers" and "answer_text".
- answers are given in dict(list(str)) format, but the tokenization that sets the start and end token targets works as if they were dict(str). This ends up setting all targets to (0,0).
I had to fix that for my use case, so if you want I can make a PR with fixes. Let me know if there is anything I should do first, like running tests.
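To illustrate the second problem, here is a minimal sketch of flattening a squad_v2-style row, where "answers" is a dict of lists, into a single answer string with character offsets. The helper name flatten_squad_answers and the output keys ans_start_char and ans_end_char are hypothetical and not part of blurr; the sample row is made up and only mimics the dataset's "answers" layout ("text" and "answer_start" lists). Mapping the character offsets to token targets is left out.

```python
def flatten_squad_answers(row):
    """Hypothetical helper (not part of blurr): pick the first answer
    from a squad_v2-style row whose "answers" field is a dict of lists."""
    texts = row["answers"]["text"]
    starts = row["answers"]["answer_start"]

    if len(texts) == 0:
        # squad_v2 contains unanswerable questions: both lists are empty.
        # Using -1 as the "no answer" marker is an assumption of this sketch.
        return {**row, "answer_text": "", "ans_start_char": -1, "ans_end_char": -1}

    text, start = texts[0], starts[0]
    return {
        **row,
        "answer_text": text,
        "ans_start_char": start,
        "ans_end_char": start + len(text),  # exclusive end offset
    }


# Made-up rows that mimic the dict(list(...)) answer layout.
context = "The tower is 324 metres tall."
answerable = {
    "context": context,
    "question": "How tall is the tower?",
    "answers": {"text": ["324 metres"], "answer_start": [context.index("324 metres")]},
}
unanswerable = {
    "context": context,
    "question": "Who designed the tower?",
    "answers": {"text": [], "answer_start": []},
}

flat = flatten_squad_answers(answerable)
empty = flatten_squad_answers(unanswerable)

# The offsets slice the answer back out of the context.
print(flat["context"][flat["ans_start_char"]:flat["ans_end_char"]])
```

Indexing the dict directly as if it held a plain string is what would make a start/end lookup fail silently, so taking the first list element (and handling the empty case) before computing targets is the key step.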
I think this is fixed now so I'm closing it out. If you're still seeing issues, feel free to reopen.
Hi, very cool lib. Just wanted to say that the pre_process_squad function is not working correctly when following the docs. There are two problems when the nlp package is used like this: nlp.load_dataset('squad_v2'). I had to fix that for my use case, so if you want I can make a PR with fixes. Let me know if there is anything I should do first, like running tests.