bigscience-workshop / biomedical

Tools for curating biomedical training data for large-scale language modeling

unittests bug report - mlee #198

Closed sunnnymskang closed 2 years ago

sunnnymskang commented 2 years ago

Describe the bug

Running the unit tests on mlee raises an import error for utils.py. This might be related to the relative location of utils, which is not directly visible from the tests folder. Please take a look.

Steps to reproduce the bug

Expected results

All packages and scripts written are correctly imported

Actual results

Traceback (most recent call last):
  File "/Users/skang/repo/bigscience/biomedical/tests/test_bigbio.py", line 150, in setUp
    self._SUPPORTED_TASKS = importlib.import_module(module)._SUPPORTED_TASKS
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/skang/repo/bigscience/biomedical/examples/mlee.py", line 26, in <module>
    import utils
ModuleNotFoundError: No module named 'utils'
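For anyone trying to reproduce this outside the repo, here is a minimal sketch of the likely mechanism. It is not the actual test code and not necessarily the fix that was applied: it assumes the test imports the dataloader as a package module while the dataloader does a bare top-level import of a sibling helper. The names demo_examples_pkg, demo_loader and brat_utils_demo are hypothetical stand-ins for examples, mlee and utils.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "demo_examples_pkg"      # hypothetical stand-in for the examples folder
HELPER = "brat_utils_demo"     # hypothetical stand-in for utils.py
LOADER = f"{PKG}.demo_loader"  # hypothetical stand-in for examples/mlee.py


def build_demo_repo(root: Path) -> Path:
    """Create a tiny package whose loader imports a sibling helper by bare name."""
    pkg_dir = root / PKG
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / f"{HELPER}.py").write_text("HELPER_NAME = 'brat-parser'\n")
    (pkg_dir / "demo_loader.py").write_text(
        f"import {HELPER}\n_SUPPORTED_TASKS = ['NER']\n"
    )
    return pkg_dir


def try_import(module_name: str) -> str:
    """Import module_name from a clean state and report success or failure."""
    for name in (module_name, PKG, HELPER):
        sys.modules.pop(name, None)
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return "ok"
    except ModuleNotFoundError as err:
        return f"failed: {err}"


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pkg_dir = build_demo_repo(root)
        added = [str(root)]
        sys.path.insert(0, str(root))
        # Only the repo root is importable: the bare helper import cannot resolve.
        before = try_import(LOADER)
        # Making the helper's own folder importable lets the bare import resolve.
        sys.path.insert(0, str(pkg_dir))
        added.append(str(pkg_dir))
        after = try_import(LOADER)
        # Clean up the interpreter state touched by the demo.
        for entry in added:
            sys.path.remove(entry)
        for name in (LOADER, PKG, HELPER):
            sys.modules.pop(name, None)
        return before, after


before, after = demo()
print(before)
print(after)
```

The first import fails with a ModuleNotFoundError for the helper and the second succeeds. Changing the loader to a package-qualified import would be another way to get the same result without touching sys.path.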

sunnnymskang commented 2 years ago

@sg-wbi - for some reason the assignee function didn't work above, so I'm tagging you here since you were one of the authors of the unit tests

galtay commented 2 years ago

Yeah, it looks like mlee is the only example that imports utils. @leonweber, I remember you saying during the meeting that the formats might be so different that utils would be hard to make general. I'm not sure whether we were going to centralize helper scripts in utils or just have people implement them directly in their dataloader scripts.

leonweber commented 2 years ago

The issue with the unit tests was fixed with #204. I'd argue that we leave utils in place, because a lot of data sets are distributed in BRAT format, and for those the parsers in utils will be helpful. Maybe we'll come across further useful helpers during the hackathon and can put them into utils.
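To illustrate the kind of helper meant here, below is a rough sketch of parsing text-bound annotations (the T lines) from a BRAT standoff .ann file. It is only an illustration under stated assumptions, not the code in utils: the names TextBound, parse_text_bound and parse_ann are hypothetical, and events, relations, attributes and notes are ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TextBound:
    """One text-bound annotation (hypothetical container, not the repo's type)."""
    id: str
    type: str
    offsets: List[Tuple[int, int]]
    text: str


def parse_text_bound(line: str) -> TextBound:
    """Parse a line of the form 'T1<TAB>Type start end[;start end...]<TAB>text'."""
    ann_id, type_and_offsets, text = line.rstrip("\n").split("\t", 2)
    ann_type, offsets_str = type_and_offsets.split(" ", 1)
    offsets = []
    # Discontinuous spans are written as several 'start end' fragments joined by ';'.
    for fragment in offsets_str.split(";"):
        start, end = fragment.split()
        offsets.append((int(start), int(end)))
    return TextBound(ann_id, ann_type, offsets, text)


def parse_ann(content: str) -> List[TextBound]:
    """Collect the text-bound annotations and skip every other annotation kind."""
    return [
        parse_text_bound(line)
        for line in content.splitlines()
        if line.startswith("T")
    ]


sample = (
    "T1\tProtein 0 5\tBRCA1\n"
    "T2\tGene_expression 6 16\texpression\n"
    "E1\tGene_expression:T2 Theme:T1\n"
    "T3\tLocation 20 25;36 43\tNorth America\n"
)
annotations = parse_ann(sample)
for ann in annotations:
    print(ann.id, ann.type, ann.offsets, ann.text)
```

Since many corpora share this standoff layout, a parser like this can be written once and reused across dataloaders, which is the argument for keeping it in a shared module.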

hakunanatasha commented 2 years ago

Deprecated with #204