After running test.py with all the proper settings, I hit the error below. I can't fix it, even though I have already added UTF-8-related code to test.py.
Traceback (most recent call last):
File "C:\Users\zhang\GPT-Projects\FreeSideProjects\gptpdf\test\test.py", line 38, in <module>
test_use_api_key()
File "C:\Users\zhang\GPT-Projects\FreeSideProjects\gptpdf\test\test.py", line 22, in test_use_api_key
content, image_paths = parse_pdf(pdf_path, output_dir=output_dir, api_key=api_key, base_url=base_url, model='gpt-4o', gpt_worker=6)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\zhang\GPT-Projects\FreeSideProjects\gptpdf\venv\Lib\site-packages\gptpdf\parse.py", line 294, in parse_pdf
content = _gpt_parse_images(image_infos, output_dir=output_dir, api_key=api_key, base_url=base_url, model=model, verbose=verbose, gpt_worker=gpt_worker)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\zhang\GPT-Projects\FreeSideProjects\gptpdf\venv\Lib\site-packages\gptpdf\parse.py", line 261, in _gpt_parse_images
f.write('\n\n'.join(contents))
UnicodeEncodeError: 'gbk' codec can't encode character '\u2020' in position 346: illegal multibyte sequence
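
The traceback points at the `f.write(...)` call inside the installed `gptpdf/parse.py`, not at test.py, which may be why UTF-8 code added to test.py has no effect. A minimal sketch of the likely cause, assuming the file there is opened without an explicit `encoding` (I have not confirmed this in the source), so Python falls back to the Windows locale codec (`gbk`). The file name `output.md` and the sample text are made up for illustration:

```python
import os
import tempfile

# U+2020 (dagger) is the character named in the error message.
text = "footnote marker: \u2020"

# Reproduce the failure without any file I/O: gbk has no mapping for U+2020.
try:
    text.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError:
    gbk_ok = False

# Passing encoding explicitly makes the write independent of the system locale.
path = os.path.join(tempfile.mkdtemp(), "output.md")  # hypothetical output file
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read it back with the same encoding to confirm the round trip.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```

If that assumption holds, a workaround that avoids editing site-packages may be to run Python in UTF-8 mode, either by setting the environment variable `PYTHONUTF8=1` or by running `python -X utf8 test.py`, so that `open()` defaults to UTF-8.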