Closed: leosouliotis closed this issue 3 years ago.
Hi @leosouliotis, looks like a circular import issue.
I see from the error message that line 8 of `lrtc_lib/data_access/data_access_factory.py` contains an import that was not in our code, and it may be the direct cause of the circular import. Do you need this import there (`from lrtc_lib.data_access.processors.process_csv_data import CsvProcessor`)?
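For reference, here is a minimal, generic reproduction of that failure mode (the module names `mod_a`/`mod_b` are hypothetical, unrelated to the repo): module a imports b at load time, and b imports a name back from a before a has finished initializing.

```python
import os
import sys
import tempfile

# Two throwaway modules that import each other at load time.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "mod_a.py"), "w") as f:
    f.write("import mod_b\nVALUE_A = 1\n")
with open(os.path.join(tmp, "mod_b.py"), "w") as f:
    # This line runs while mod_a is only partially initialized,
    # so VALUE_A does not exist yet.
    f.write("from mod_a import VALUE_A\n")

sys.path.insert(0, tmp)
caught = None
try:
    import mod_a  # triggers mod_b, which circles back into mod_a
except ImportError as err:
    caught = err
print(caught)  # "cannot import name 'VALUE_A' ..." on recent Pythons
```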
Thanks for the quick response @arielge! I deleted this entry from `lrtc_lib/data_access/data_access_factory.py` (it was left over from my previous experimentation, silly me), but then got this error:
(kpvv542) (lrtc_env) [kpvv542@seskscpn080 low-resource-text-classification-framework]$ python -m lrtc_lib.data.load_dataset
Traceback (most recent call last):
File "/opt/scp/software/Miniconda3/4.7.12.1/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/opt/scp/software/Miniconda3/4.7.12.1/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/data/load_dataset.py", line 8, in <module>
from lrtc_lib.data_access import single_dataset_loader
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/data_access/single_dataset_loader.py", line 13, in <module>
import lrtc_lib.data_access.data_access_factory as data_access_factory
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/data_access/data_access_factory.py", line 8, in <module>
from lrtc_lib.data_access.processors.data_processor_api import DataProcessorAPI
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/data_access/processors/data_processor_api.py", line 10, in <module>
import lrtc_lib.orchestrator.orchestrator_api as orchestrator_api
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/orchestrator/orchestrator_api.py", line 23, in <module>
from lrtc_lib.training_set_selector import training_set_selector_factory
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/training_set_selector/training_set_selector_factory.py", line 6, in <module>
from lrtc_lib.training_set_selector.train_and_dev_sets_selectors import TrainAndDevSetsSelectorAllLabeled, TrainAndDevSetsSelectorAllLabeledPlusUnlabeledAsWeakNegative
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/training_set_selector/train_and_dev_sets_selectors.py", line 14, in <module>
data_access = data_access_factory.get_data_access()
AttributeError: module 'lrtc_lib.data_access.data_access_factory' has no attribute 'get_data_access'
It seems to be the same error, just surfacing in a different way?
Indeed, this looks like more of the same. Can you try restoring `lrtc_lib/data_access/data_access_factory.py` to the committed version? You should see only two imports there (`DataAccessApi` and `DataAccessInMemory`).
Thanks for the suggestion! Now I get a different error:
(kpvv542) (lrtc_env) [kpvv542@seskscpn080 low-resource-text-classification-framework]$ python -m lrtc_lib.data.load_dataset
2021-06-30 14:05:47.231191: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2021-06-30 14:05:47.231392: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2021-06-30 14:05:47.231406: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Traceback (most recent call last):
File "/opt/scp/software/Miniconda3/4.7.12.1/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/opt/scp/software/Miniconda3/4.7.12.1/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/data/load_dataset.py", line 31, in <module>
load(dataset=dataset_name)
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/data/load_dataset.py", line 21, in load
single_dataset_loader.load_dataset(dataset_name, force_new)
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/data_access/single_dataset_loader.py", line 38, in load_dataset
data_processor: DataProcessorAPI = processor_factory.get_data_processor(dataset_name)
File "/home/kpvv542/Projects/low-resource-text-classification-framework/lrtc_lib/data_access/processors/data_processor_factory.py", line 25, in get_data_processor
return TrialTroveProcessor(dataset_part=dataset_part)
TypeError: Can't instantiate abstract class TrialTroveProcessor with abstract methods _get_dev_file_path, _get_test_file_path, _get_train_file_path
But let me explain further: I have created a processor in the following way:
    from lrtc_lib.data_access.processors.dataset_part import DatasetPart
    from lrtc_lib.data_access.processors.data_processor_api import DataProcessorAPI

    class TrialTroveProcessor(DataProcessorAPI):
        def __init__(self, dataset_part: DatasetPart, label_col: str = 'target'):
            super().__init__(dataset_name='trialtrove', dataset_part=dataset_part)
and to `data_processor_factory.py` I added the following:

    from lrtc_lib.data_access.processors.process_trialtrove import TrialTroveProcessor

    if dataset_source == 'trialtrove':
        return TrialTroveProcessor(dataset_part=dataset_part)
I see. Your `TrialTroveProcessor` inherits from `DataProcessorAPI`, which has some abstract methods (see https://docs.python.org/3/library/abc.html). This means that if you inherit from it, you must override these methods, namely `_get_train_file_path`, `_get_dev_file_path`, and `_get_test_file_path`.
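To illustrate the mechanism with a generic sketch (hypothetical class names, unrelated to the repo): Python refuses to instantiate any subclass of an `ABC` that leaves an abstract method unimplemented, which is exactly the `TypeError` in your traceback.

```python
from abc import ABC, abstractmethod

class Base(ABC):
    @abstractmethod
    def _get_train_file_path(self) -> str:
        ...

class Incomplete(Base):
    # Does not override the abstract method.
    pass

class Complete(Base):
    # Overrides it, so the class becomes instantiable.
    def _get_train_file_path(self) -> str:
        return "train.csv"

try:
    Incomplete()
except TypeError as err:
    print(err)  # same kind of TypeError as in the traceback above

processor = Complete()  # fine: every abstract method is implemented
```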
Thanks for your time and effort @arielge!
Inheriting from `CsvProcessor` rather than `DataProcessorAPI` solved the issue! Maybe it is worth documenting the extra steps needed if someone wants to implement the full `DataProcessorAPI`?
Feel free to close this issue, thanks again!
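For anyone who lands here later, the shape of the working fix was roughly the following. Both classes below are simplified stand-ins (the real `CsvProcessor` lives in `lrtc_lib/data_access/processors/process_csv_data.py`, and its constructor signature and path layout here are my assumptions), but they show why switching the parent class resolves the `TypeError`: `CsvProcessor` already provides concrete implementations of the three path methods.

```python
import os
from abc import ABC, abstractmethod

# Simplified stand-in for DataProcessorAPI: three abstract path methods.
class DataProcessorAPI(ABC):
    @abstractmethod
    def _get_train_file_path(self) -> str: ...
    @abstractmethod
    def _get_dev_file_path(self) -> str: ...
    @abstractmethod
    def _get_test_file_path(self) -> str: ...

# Simplified stand-in for CsvProcessor: concrete implementations of all
# three methods, so subclasses have nothing left to override.
# (Constructor arguments and file layout are assumptions for illustration.)
class CsvProcessor(DataProcessorAPI):
    def __init__(self, dataset_name: str, dataset_part: str):
        self.dataset_name = dataset_name
        self.dataset_part = dataset_part

    def _get_train_file_path(self) -> str:
        return os.path.join("data", self.dataset_name, "train.csv")

    def _get_dev_file_path(self) -> str:
        return os.path.join("data", self.dataset_name, "dev.csv")

    def _get_test_file_path(self) -> str:
        return os.path.join("data", self.dataset_name, "test.csv")

class TrialTroveProcessor(CsvProcessor):
    def __init__(self, dataset_part: str, label_col: str = "target"):
        super().__init__(dataset_name="trialtrove", dataset_part=dataset_part)
        self.label_col = label_col

# Instantiation now succeeds: no abstract methods remain unimplemented.
processor = TrialTroveProcessor(dataset_part="train")
```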
Hello,
I am trying to implement your AL strategy on a custom dataset. I followed all the steps (with the minimal setup for the CSV processor), and when I try to run the `load_dataset` script I get the following:

Any suggestions? I don't think there is anything wrong with the dataset itself.