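The LLM sometimes returns output that is not valid JSON.
We should handle these cases gracefully; an example of such an error is shown below. Possible reasons for the error in this case: a) double quotes (") in the input break the quoting in the output; b) the output has the wrong JSON structure. In the failing output below, the second entry of "translated_text" is a bare string followed by a comma, with no ": value" part, which matches the reported "Expecting ':' delimiter" on line 4.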
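To illustrate reason a), here is a minimal sketch of how an unescaped double quote inside a string value produces invalid JSON, and how proper escaping avoids it. The sample sentence and variable names are made up for illustration; this is not the code path used in translation.py.

```python
import json

# Hypothetical translation that contains ASCII double quotes.
translation = 'Theme evening "Home, Identity and Migration"'

# Building JSON by plain string concatenation leaves the inner quotes
# unescaped, so the string value ends early and parsing fails.
broken = '{"translated_text": "' + translation + '"}'
try:
    json.loads(broken)
    broken_is_valid = True
except json.JSONDecodeError:
    broken_is_valid = False

# json.dumps escapes the inner quotes as \" so the value round-trips.
escaped = json.dumps({"translated_text": translation})
round_trip = json.loads(escaped)["translated_text"]
```

An LLM that writes JSON token by token is effectively in the first situation: nothing forces it to escape quotes it copies from the input.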
Attempt 1: force JSON mode (works with GPT-4 and GPT-3.5; does not currently work with Anthropic).
Attempt 2: ignore the error and continue with a different DB row.
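Attempt 2 could look roughly like the sketch below. It assumes rows arrive as (id, text) pairs and that the translation step is a plain callable; `translate_rows`, `fake_translate` and the `error_types` parameter are hypothetical names, not part of db.py. In the real pipeline one would presumably pass `langchain_core.exceptions.OutputParserException` (the type shown in the traceback) via `error_types`.

```python
import json


def translate_rows(rows, translate, error_types=(json.JSONDecodeError,)):
    """Translate each (row_id, text) pair, skipping rows whose output fails to parse.

    Returns the successful translations plus a list of skipped rows,
    so failures are recorded instead of aborting the whole run.
    """
    translated = {}
    skipped = []
    for row_id, text in rows:
        try:
            translated[row_id] = translate(text)
        except error_types as exc:
            # Keep the reason so the row can be inspected or retried later.
            skipped.append((row_id, str(exc)))
    return translated, skipped


def fake_translate(text):
    """Stand-in for the LLM call: returns malformed JSON for the input 'bad'."""
    if text == "bad":
        raw = '{"translated_text": {"key without value",}}'
    else:
        raw = '{"translated_text": "ok"}'
    return json.loads(raw)["translated_text"]


rows = [(1, "good"), (2, "bad"), (3, "good")]
translated, skipped = translate_rows(rows, fake_translate)
```

Returning the skipped rows rather than silently dropping them keeps it possible to retry them later, for example with JSON mode enabled.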
Example of the error:
text='Radiodialoge FRF – 04 Themenabend „Heimat, Identität und Migration“ Die interkulturelle Redaktion im Freie Radio Freistadt berichtet über STIMMEN DER VIELFALT 2009, eine Veranstaltung zum Thema „Heimat, Identität und Migration“ im Rahmen des Projektes Radiodialoge 2009 in Freistadt. Hören Sie in dieser Sendung eine Aufzeichnung des Vortrages von Moussa Al Hassen von der Plattform Islam. Er sprach zum Thema „Heimat, Identität und Migration“ aus … Weiterlesen Radiodialoge FRF – 04 Themenabend „Heimat, Identität und Migration“\n Die interkulturelle Redaktion im Freie Radio Freistadt berichtet über STIMMEN DER VIELFALT 2009, eine Veranstaltung zum Thema „Heimat, Identität und Migration“ im Rahmen des Projektes Radiodialoge 2009 in Freistadt.\nHören Sie in dieser Sendung eine Aufzeichnung des Vortrages von Moussa Al Hassen von der Plattform Islam. Er sprach zum Thema „Heimat, Identität und Migration“ aus der Sicht der Muslime.\nIm 2. Teil der Sendung bringen wir Beiträge von einer Podiumsdiskussion die von den Kinderfreunden Mühlviertel, am 22. Oktober in der Musikhauptschule Freistadt veranstaltet wurde.\nDie Wortbeiträge kommen von Elif Yilmaz (freie Mitarbeiterin der Integrationsstelle OÖ. und am Institut Interkulturelle Pädagogik/VHS OÖ.), Ulrike Steininger (Direktorin der VS 1 und Vizebürgermeisterin von Freistadt) und Maga. Andrea Wahl, Geschäftsführung Kinderfreunde Mühlviertel).\nSendungsgestaltung:\nAlmut Zillner, Harald Freudenthaler, Albert Heidlmeir\n—————————————————————————————-\nZur Veranstaltung STIMMEN DER VIELFALT 2009\nAm Freitag 30. Oktober veranstaltete das Freie Radio Freistadt in Kooperation mit der Local-Bühne Freistadt, Integrationsbüro der Volkshilfe Freistadt und dem Verein Cafe Mulatschag einen Themenabend zu „Heimat, Identität und Migration“. 
Ein Beitrag für ein besseres Miteinander zwischen inländischen und zugewanderten Mitbürgerinnen und Mitbürgern in Freistadt.\nVORTRÄGE\nGeboten wurden Expertenvorträge von Moussa Al Hassan (Plattform Islam) und Burkhard Landwehr (Berater und Mediator bei kulturbedingten Werte- und Normenkollisionen).\nTHEATER\nIm Rahmen der Veranstaltung wurde auch das Theaterstück da.Heim.AT.los des Wiener Theatervereins Cocon http://www.cocon-kultur.com an 2 Tagen aufgeführt. Wir haben schon darüber berichtet – eine Sendung mit AkteurInnen von cocon können Sie unter folgendem Link nachhören: https://cba.media/14617\nAuch das halbdokumentarische Theaterstück kreist um die Themen Heimat, Identität und Migration. Begriffe, die stets für politische Auseinandersetzungen und heftige Emotionen sorgen. Neun KünstlerInnen aus den Bundesländern Österreichs brachten individuelle und gesellschaftliche Vorstellungen von „Heimat“ auf die Bühne.\nSCHWERPUNKTPROGRAMM IM FREIEN RADIO FREISTADT\nAnläßlich von Stimmen der Vielfalt 2009 sendete das Freie Radio Freistadt täglich drei Stunden Schwerpunktprogramm zum Thema interkultureller Dialog.\n '
50%|██████████████████████████████████████████████████████████████████████ | 5/10 [00:40<00:40, 8.01s/it]
Traceback (most recent call last):
File "/home/aaron/git/work/CBA/cba_llm/venv/lib/python3.10/site-packages/langchain_core/output_parsers/json.py", line 212, in parse_result
return parse_json_markdown(text)
File "/home/aaron/git/work/CBA/cba_llm/venv/lib/python3.10/site-packages/langchain_core/output_parsers/json.py", line 157, in parse_json_markdown
parsed = parser(json_str)
File "/home/aaron/git/work/CBA/cba_llm/venv/lib/python3.10/site-packages/langchain_core/output_parsers/json.py", line 125, in parse_partial_json
return json.loads(s, strict=strict)
File "/usr/lib/python3.10/json/__init__.py", line 359, in loads
return cls(**kw).decode(s)
File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.10/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ':' delimiter: line 4 column 895 (char 1075)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/aaron/git/work/CBA/cba_llm/app/db.py", line 291, in <module>
translated_text = translate(src_text = text, dst_language=dst_language, _src_language=src_language)
File "/home/aaron/git/work/CBA/cba_llm/app/translation.py", line 64, in translate
result = chain.invoke(data)
File "/home/aaron/git/work/CBA/cba_llm/venv/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2446, in invoke
input = step.invoke(
File "/home/aaron/git/work/CBA/cba_llm/venv/lib/python3.10/site-packages/langchain_core/output_parsers/base.py", line 169, in invoke
return self._call_with_config(
File "/home/aaron/git/work/CBA/cba_llm/venv/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 1625, in _call_with_config
context.run(
File "/home/aaron/git/work/CBA/cba_llm/venv/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 347, in call_func_with_variable_args
return func(input, **kwargs) # type: ignore[call-arg]
File "/home/aaron/git/work/CBA/cba_llm/venv/lib/python3.10/site-packages/langchain_core/output_parsers/base.py", line 170, in <lambda>
lambda inner_input: self.parse_result(
File "/home/aaron/git/work/CBA/cba_llm/venv/lib/python3.10/site-packages/langchain_core/output_parsers/json.py", line 215, in parse_result
raise OutputParserException(msg, llm_output=text) from e
langchain_core.exceptions.OutputParserException: Invalid json output: {
"translated_text": {
"Radiodialogues FRF - 04 Theme Evening 'Home, Identity, and Migration'": "Radiodialogues FRF - 04 Theme Evening 'Home, Identity, and Migration'",
"The intercultural editorial team at Freie Radio Freistadt reports on VOICES OF DIVERSITY 2009, an event on 'Home, Identity, and Migration' as part of the Radiodialogues 2009 project in Freistadt. In this broadcast, listen to a recording of the lecture by Moussa Al Hassen from the Platform Islam. He spoke on the topic of 'Home, Identity, and Migration' from the perspective of Muslims. In the second part of the broadcast, we present contributions from a panel discussion organized by the Children's Friends Mühlviertel on October 22 at the Music Secondary School in Freistadt. The contributions come from Elif Yilmaz (freelancer at the Integration Office Upper Austria and at the Institute for Intercultural Education/VHS Upper Austria), Ulrike Steininger (Principal of VS 1 and Deputy Mayor of Freistadt), and Maga. Andrea Wahl, Managing Director of Children's Friends Mühlviertel.",
"Broadcast Design": "Broadcast Design",
"Almut Zillner, Harald Freudenthaler, Albert Heidlmeir": "Almut Zillner, Harald Freudenthaler, Albert Heidlmeir",
"About the event VOICES OF DIVERSITY 2009": "About the event VOICES OF DIVERSITY 2009",
"On Friday, October 30, Freie Radio Freistadt, in cooperation with Local-Bühne Freistadt, Integration Office of Volkshilfe Freistadt, and the association Cafe Mulatschag, organized an evening on 'Home, Identity, and Migration'. A contribution for better coexistence between local and immigrant fellow citizens in Freistadt.": "On Friday, October 30, Freie Radio Freistadt, in cooperation with Local-Bühne Freistadt, Integration Office of Volkshilfe Freistadt, and the association Cafe Mulatschag, organized an evening on 'Home, Identity, and Migration'. A contribution for better coexistence between local and immigrant fellow citizens in Freistadt.",
"LECTURES": "LECTURES",
"Expert lectures were given by Moussa Al Hassan (Platform Islam) and Burkhard Landwehr (Consultant and Mediator in culture-related value and norm collisions).": "Expert lectures were given by Moussa Al Hassan (Platform Islam) and Burkhard Landwehr (Consultant and Mediator in culture-related value and norm collisions).",
"THEATER": "THEATER",
"As part of the event, the play da.Heim.AT.los by the Vienna Theater Association Cocon http://www.cocon-kultur.com was also performed for 2 days. We have already reported on it - you can listen to a broadcast with actors from cocon at the following link: https://cba.media/14617": "As part of the event, the play da.Heim.AT.los by the Vienna Theater Association Cocon http://www.cocon-kultur.com was also performed for 2 days. We have already reported on it - you can listen to a broadcast with actors from cocon at the following link: https://cba.media/14617",
"The semi-documentary play also revolves around the themes of Home, Identity, and Migration. Terms that always lead to political debates and strong emotions. Nine artists from the federal states of Austria brought individual and societal ideas of 'Home' to the stage.": "The semi-documentary play also revolves around the themes of Home, Identity, and Migration. Terms that always lead to political debates and strong emotions. Nine artists from the federal states of Austria brought individual and societal ideas of 'Home' to the stage.",
"FOCUS PROGRAM IN FREIE RADIO FREISTADT": "FOCUS PROGRAM IN FREIE RADIO FREISTADT",
"On the occasion of Voices of Diversity 2009, Freie Radio Freistadt broadcasted three hours of focus program daily on the topic of intercultural dialogue.": "On the occasion of Voices of Diversity 2009, Freie Radio Freistadt broadcasted three hours of focus program daily on the topic of intercultural dialogue."
},
"src_language": "de"
}