LUH-DBS / Matelda

Apache License 2.0

Error during column grouping #3

Closed: MarcSpeckmann closed this issue 2 years ago

MarcSpeckmann commented 2 years ago

Traceback (most recent call last):
  File "/Users/marc/Documents/Projects/DBS_HiWi/ED-Scale/end-to-end-eds.py", line 107, in <module>
    run_experiments(sandbox_dir, output_dir, exp_name, True, True, True, number_of_labels, True)
  File "/Users/marc/Documents/Projects/DBS_HiWi/ED-Scale/end-to-end-eds.py", line 52, in run_experiments
    number_of_column_clusters = cols_grouping.col_folding(table_grouping_output, sandbox_path, labels_path,
  File "/Users/marc/Documents/Projects/DBS_HiWi/ED-Scale/cols_grouping.py", line 200, in col_folding
    col_labels_df, number_of_clusters = cluster_cols_auto(col_features, auto_clustering_enabled)
  File "/Users/marc/Documents/Projects/DBS_HiWi/ED-Scale/cols_grouping.py", line 157, in cluster_cols_auto
    clustering_results = DBSCAN(eps=0.5, min_samples=2).fit(reduced_features)
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/sklearn/cluster/_dbscan.py", line 406, in fit
    neighborhoods = neighbors_model.radius_neighbors(X, return_distance=False)
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/sklearn/neighbors/_base.py", line 1097, in radius_neighbors
    results = PairwiseDistancesRadiusNeighborhood.compute(
  File "sklearn/metrics/_pairwise_distances_reduction.pyx", line 1346, in sklearn.metrics._pairwise_distances_reduction.PairwiseDistancesRadiusNeighborhood.compute
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/sklearn/utils/fixes.py", line 151, in threadpool_limits
    return threadpoolctl.threadpool_limits(limits=limits, user_api=user_api)
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/threadpoolctl.py", line 171, in __init__
    self._original_info = self._set_threadpool_limits()
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/threadpoolctl.py", line 268, in _set_threadpool_limits
    modules = _ThreadpoolInfo(prefixes=self._prefixes,
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/threadpoolctl.py", line 340, in __init__
    self._load_modules()
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/threadpoolctl.py", line 371, in _load_modules
    self._find_modules_with_dyld()
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/threadpoolctl.py", line 428, in _find_modules_with_dyld
    self._make_module_from_path(filepath)
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/threadpoolctl.py", line 515, in _make_module_from_path
    module = module_class(filepath, prefix, user_api, internal_api)
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/threadpoolctl.py", line 606, in __init__
    self.version = self.get_version()
  File "/usr/local/Caskroom/miniconda/base/envs/DBS_Hiwi/lib/python3.10/site-packages/threadpoolctl.py", line 646, in get_version
    config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
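
The last frames of the traceback are all inside threadpoolctl, so the failure can probably be checked without running the whole pipeline. Below is a minimal sketch, assuming threadpoolctl is importable in the same environment. `probe_threadpoolctl` is a hypothetical helper name, and importing numpy first is only an assumption about how a BLAS library gets loaded.

```python
import importlib


def probe_threadpoolctl():
    """Return (ok, detail): can threadpoolctl inspect the loaded native libraries?"""
    try:
        threadpoolctl = importlib.import_module("threadpoolctl")
    except ImportError as exc:
        return False, f"threadpoolctl not importable: {exc}"

    # Assumption: importing numpy loads a BLAS library for threadpoolctl to find.
    try:
        importlib.import_module("numpy")
    except ImportError:
        pass

    try:
        # threadpool_info() walks the loaded libraries and reads their versions,
        # which is the step that fails in the traceback above.
        info = threadpoolctl.threadpool_info()
    except AttributeError as exc:
        return False, f"threadpoolctl failed while reading library info: {exc}"
    return True, f"{len(info)} thread-pool libraries detected"


if __name__ == "__main__":
    ok, detail = probe_threadpoolctl()
    print("OK" if ok else "FAILED", "-", detail)
```

If this probe fails with the same AttributeError, the problem is likely in the installed threadpoolctl and native library combination rather than in cols_grouping.py.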

FatemehAhmadi94 commented 2 years ago

@XxHalbfettxX Is this fixed?

MarcSpeckmann commented 2 years ago

The problem still occurs for me, with both sequential and parallel execution. We may be using different package versions, and the versions in my environment may not be compatible with each other. It might help to pin the package versions in a requirements.txt or a conda environment file.
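
To compare environments, the installed versions could be collected with something like the sketch below. The helper names `report_versions` and `as_requirements` are hypothetical. The package list is an assumption: scikit-learn and threadpoolctl appear in the traceback, while numpy and scipy are only guessed as relevant dependencies.

```python
from importlib import metadata


def report_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def as_requirements(versions):
    """Render the installed packages as pinned requirements.txt lines."""
    return [
        f"{name}=={ver}"
        for name, ver in sorted(versions.items())
        if ver is not None
    ]


if __name__ == "__main__":
    # Assumed package list; adjust to whatever the project actually imports.
    found = report_versions(["scikit-learn", "threadpoolctl", "numpy", "scipy"])
    print("\n".join(as_requirements(found)))
```

Running this in both environments and diffing the output would show whether the versions actually differ.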

FatemehAhmadi94 commented 2 years ago

@XxHalbfettxX

The problem still occurs for me, with both sequential and parallel execution. We may be using different package versions, and the versions in my environment may not be compatible with each other. It might help to pin the package versions in a requirements.txt or a conda environment file.

I put the Conda environment in the directory. You can try that.