microsoft / generative-ai-for-beginners

18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
MIT License
62.1k stars 31.76k forks

chapter08: improvements for the transcript_enrich*.py scripts #599

Open bmerkle opened 2 hours ago

bmerkle commented 2 hours ago

Describe the bug

Before tackling #591, I would like to fix a few bugs in the transcript_enrich*.py scripts. This issue covers all five scripts listed below, each of which has some minor defects. Once they are fixed and run cleanly, I propose that the port to openai 1.x in #591 goes ahead.

transcript_enrich_speaker.py

transcript_enrich_bucket.py

transcript_enrich_summaries.py

transcript_enrich_embeddings.py

transcript_enrich_lite.py

To Reproduce

Steps to reproduce the behavior:

transcript_enrich_speaker.py

python transcript_enrich_speaker.py -f %TRANSCRIPT_FOLDER%

(venv) C:\work\microsoft\generative-ai-for-beginners\08-building-search-applications\scripts>python transcript_enrich_speaker.py -f %TRANSCRIPT_FOLDER%
Exception in thread Thread-2 (process_queue):
Traceback (most recent call last):
  File "C:\work\microsoft\generative-ai-for-beginners\venv\Lib\site-packages\tenacity\__init__.py", line 478, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "C:\work\microsoft\generative-ai-for-beginners\08-building-search-applications\scripts\transcript_enrich_speaker.py", line 138, in get_speaker_info
    arguments = json.loads(result.get("function_call").get("arguments"))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Program Files\Python311\Lib\threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "C:\Program Files\Python311\Lib\threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "C:\work\microsoft\generative-ai-for-beginners\08-building-search-applications\scripts\transcript_enrich_speaker.py", line 200, in process_queue
    function_name, arguments = get_speaker_info(base_text)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\work\microsoft\generative-ai-for-beginners\venv\Lib\site-packages\tenacity\__init__.py", line 336, in wrapped_f
    return copy(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\work\microsoft\generative-ai-for-beginners\venv\Lib\site-packages\tenacity\__init__.py", line 475, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\work\microsoft\generative-ai-for-beginners\venv\Lib\site-packages\tenacity\__init__.py", line 376, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "C:\work\microsoft\generative-ai-for-beginners\venv\Lib\site-packages\tenacity\__init__.py", line 419, in exc_check
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x126c59b2dd0 state=finished raised JSONDecodeError>]
Exception in thread Thread-4 (process_queue):
Traceback (most recent call last):
  File "C:\work\microsoft\generative-ai-for-beginners\venv\Lib\site-packages\tenacity\__init__.py", line 478, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^

The PR will fix these errors.
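For context, the first traceback shows `json.loads` failing with "Expecting value: line 1 column 1 (char 0)" on the `arguments` of the function call, which is what Python reports when the string is empty or is not JSON at all. As a rough sketch only (not necessarily what the PR does), the parsing step could be guarded as below. `parse_function_call` is a hypothetical helper name, and the shape of `result` (a dict with an optional `function_call` entry holding `name` and `arguments`) is an assumption inferred from line 138 in the traceback.

```python
import json
from typing import Any, Dict, Optional, Tuple


def parse_function_call(
    result: Dict[str, Any],
) -> Optional[Tuple[Optional[str], Dict[str, Any]]]:
    """Return (function_name, arguments), or None if the reply is unusable.

    Hypothetical helper: the layout of `result` is assumed, not taken
    from the actual script.
    """
    function_call = result.get("function_call")
    if not function_call:
        # The model answered without calling a function.
        return None

    raw_arguments = function_call.get("arguments")
    if not raw_arguments:
        # Empty or missing arguments would raise "Expecting value ... (char 0)".
        return None

    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError:
        # The arguments string was not valid JSON.
        return None

    if not isinstance(arguments, dict):
        # Valid JSON, but not the object a function call is expected to carry.
        return None

    return function_call.get("name"), arguments
```

Returning `None` instead of raising lets the caller skip or log the segment, so a single malformed reply does not exhaust the retries and end the worker thread.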

transcript_enrich_bucket.py

transcript_enrich_summaries.py

transcript_enrich_embeddings.py

transcript_enrich_lite.py

Expected behavior

A clear and concise description of what you expected to happen.

github-actions[bot] commented 2 hours ago

👋 Thanks for contributing @bmerkle! We will review the issue and get back to you soon.