Dec 14 22:04:22: WARNING:root:Could not process gazette: 2700000/2023-12-14/ab3183d921f7bb119a6a253e6dc59dc6fb07a367.pdf. Cause: 'Couldn\'t find info for "albarrasaomiguelal"'
Dec 14 22:04:22: ERROR:root:'Couldn\'t find info for "albarrasaomiguelal"'
Dec 14 22:04:22: Traceback (most recent call last):
Dec 14 22:04:22: File "/mnt/code/tasks/gazette_text_extraction.py", line 32, in extract_text_from_gazettes
Dec 14 22:04:22: document_ids = try_process_gazette_file(
Dec 14 22:04:22: File "/mnt/code/tasks/gazette_text_extraction.py", line 69, in try_process_gazette_file
Dec 14 22:04:22: territory_segments = segmenter.get_gazette_segments(gazette)
Dec 14 22:04:22: File "/mnt/code/segmentation/segmenters/al_associacao_municipios.py", line 24, in get_gazette_segments
Dec 14 22:04:22: gazette_segments = [
Dec 14 22:04:22: File "/mnt/code/segmentation/segmenters/al_associacao_municipios.py", line 25, in <listcomp>
Dec 14 22:04:22: self.build_segment(territory_slug, segment_text, gazette).__dict__
Dec 14 22:04:22: File "/mnt/code/segmentation/segmenters/al_associacao_municipios.py", line 65, in build_segment
Dec 14 22:04:22: territory_data = get_territory_data(territory_slug, self.territories)
Dec 14 22:04:22: File "/mnt/code/tasks/utils/territories.py", line 28, in get_territory_data
Dec 14 22:04:22: raise KeyError(f"Couldn't find info for \"{territory_slug}\"")
Dec 14 22:04:22: KeyError: 'Couldn\'t find info for "albarrasaomiguelal"'
The error logs are shown above.
It would probably be enough to change the segmenter's _normalize_territory_name() so that it covers this case.
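As a rough illustration of that suggestion, here is a minimal sketch of a name normalizer with a table of special cases. It is not the project's actual _normalize_territory_name(), whose body is not shown in this report. The NAME_CORRECTIONS table and the helper names are hypothetical. The mapping itself is an assumption: the failing slug "albarrasaomiguelal" suggests the gazette wrote the name without "de", and the territories data may instead list the municipality as Barra de São Miguel.

```python
import unicodedata

# Hypothetical table of special cases: name as it appears in the gazette
# (accent-free, upper case) mapped to the name assumed to be in the
# territories data. The single entry below is an assumption based on the
# failing slug in the logs.
NAME_CORRECTIONS = {
    "BARRA SAO MIGUEL": "BARRA DE SAO MIGUEL",
}


def _strip_accents(text: str) -> str:
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_territory_name(name: str) -> str:
    # Remove accents, upper-case, and collapse runs of whitespace.
    key = " ".join(_strip_accents(name).upper().split())
    # Apply a known correction if there is one, otherwise keep the name.
    return NAME_CORRECTIONS.get(key, key)
```

With a lookup table like this, each newly found mismatch becomes a one-line entry rather than another conditional branch, but the right place for the fix depends on how the segmenter builds the slug from the name.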