Open JenkaiMiao opened 1 year ago
I had the same error while running the record_linkage_example
on Python 3.10.6.
The error appears when responding to the second question in the active learning phase.
The traceback is:
Traceback (most recent call last):
  File "/mnt/74225F01225EC82E/Archivos/FIUBA/Trabajo Profesional/dedupe_examples_[exclude]/record_linkage_example/record_linkage_example.py", line 140, in <module>
    dedupe.console_label(linker)
  File "/mnt/74225F01225EC82E/Archivos/FIUBA/Trabajo Profesional/dedupe_examples_[exclude]/record_linkage_example/venv/lib/python3.10/site-packages/dedupe/convenience.py", line 150, in console_label
    unlabeled = deduper.uncertain_pairs()
  File "/mnt/74225F01225EC82E/Archivos/FIUBA/Trabajo Profesional/dedupe_examples_[exclude]/record_linkage_example/venv/lib/python3.10/site-packages/dedupe/api.py", line 1168, in uncertain_pairs
    return [self.active_learner.pop()]
  File "/mnt/74225F01225EC82E/Archivos/FIUBA/Trabajo Profesional/dedupe_examples_[exclude]/record_linkage_example/venv/lib/python3.10/site-packages/dedupe/labeler.py", line 331, in pop
    probs = numpy.concatenate(prob_l, axis=1)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 4998 and the array at index 1 has size 4996
Below are the error messages.
Environment: Python 3.7
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_5240/2454607948.py in
     35 # print('starting active labeling...')
     36
---> 37 dedupe.convenience.console_label(gazetteer)
     38
     39 gazetteer.train()

~/anaconda3/envs/conda_python37_env/lib/python3.7/site-packages/dedupe/convenience.py in console_label(deduper)
    148         try:
    149             if not unlabeled:
--> 150                 unlabeled = deduper.uncertain_pairs()
    151
    152             record_pair = unlabeled.pop()

~/anaconda3/envs/conda_python37_env/lib/python3.7/site-packages/dedupe/api.py in uncertain_pairs(self)
   1166             self.active_learner is not None
   1167         ), "Please initialize with the prepare_training method"
-> 1168         return [self.active_learner.pop()]
   1169
   1170     def mark_pairs(self, labeled_pairs: TrainingData) -> None:

~/anaconda3/envs/conda_python37_env/lib/python3.7/site-packages/dedupe/labeler.py in pop(self)
    329
    330         prob_l = [learner.candidate_scores() for learner in self._learners]
--> 331         probs = numpy.concatenate(prob_l, axis=1)
    332
    333         # where do the classifers disagree?

<__array_function__ internals> in concatenate(*args, **kwargs)

ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 4998 and the array at index 1 has size 4996
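
Both tracebacks end in the same NumPy check inside `numpy.concatenate(prob_l, axis=1)`. As a rough illustration outside dedupe, the sketch below triggers the same kind of `ValueError` with synthetic arrays. It assumes each entry of `prob_l` is a 2-D column of scores; the arrays and the helper `try_concatenate` are hypothetical and not part of dedupe. It only shows what the message means (the arrays being joined side by side have different row counts), not why the learners end up with different candidate counts.

```python
import numpy

# Synthetic stand-ins for two per-learner score columns (assumed shape (n, 1)).
# The row counts mirror the sizes reported in the tracebacks above.
scores_a = numpy.zeros((4998, 1))
scores_b = numpy.zeros((4996, 1))


def try_concatenate(arrays):
    """Join arrays column-wise; return the error text if their row counts differ."""
    try:
        return numpy.concatenate(arrays, axis=1)
    except ValueError as exc:
        return str(exc)


# Mismatched row counts: concatenating along axis=1 fails.
result = try_concatenate([scores_a, scores_b])
print(result)

# Matching row counts: the same call succeeds and yields one column per array.
stacked = try_concatenate([scores_a, scores_a])
print(stacked.shape)
```

If this reading is right, checking the shape of each array returned by `learner.candidate_scores()` just before the failing line should show which learner has fewer rows.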