texworld / betterbib

:green_book: Command-line tools for bibliographies.

Bug? UnicodeDecodeError: 'gbk' codec can't decode byte 0x81 in position 1072: illegal multibyte sequence #256

Closed ghweili closed 7 months ago

ghweili commented 2 years ago

I got the following error:

There was an error when parsing D:\temps\PoPCites.txt
Traceback (most recent call last):
  File "d:\python\anaconda3\lib\site-packages\pybtex\database\input\__init__.py", line 55, in parse_file
    self.parse_stream(f)
  File "d:\python\anaconda3\lib\site-packages\pybtex\database\input\bibtex.py", line 411, in parse_stream
    text = stream.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0x81 in position 1072: illegal multibyte sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\python\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\python\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\python\anaconda3\Scripts\betterbib.exe\__main__.py", line 7, in <module>
    sys.exit(main())
  File "<string>", line 76, in main
  File "<string>", line 21, in run
  File "<string>", line 597, in bibtex_parser
  File "<string>", line 593, in bibtex_parser
  File "d:\python\anaconda3\lib\site-packages\pybtex\database\input\__init__.py", line 57, in parse_file
    raise PybtexError(six.text_type(e), filename=self.filename)
pybtex.exceptions.PybtexError: 'gbk' codec can't decode byte 0x81 in position 1072: illegal multibyte sequence

I tried multiple files, and all of them hit this error. Is there any way to handle this?

nschloe commented 2 years ago

This needs a minimal example file.

ghweili commented 2 years ago

Below is an example file, created by "Publish or Perish" from a Google Scholar search. I wanted to use that software to export items from Google Scholar in batch, but some of the fields come out truncated, so I'm trying to use betterbib to update these references.

PoPCites2.txt

I also tried other .txt files, and they produced the same error.

Link to the software: https://harzing.com/resources/publish-or-perish

nschloe commented 1 year ago

This is really a Pybtex error, and not much can be done at the betterbib level, so don't expect a fix too soon. Let's keep it open, though.
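For anyone hitting the same traceback: the failure happens at `stream.read()`, where the file is decoded with `gbk`, which is presumably the Windows locale default on the reporter's machine. The sketch below is not betterbib or Pybtex code. It assumes the input file is actually UTF-8 (the attached file was not checked), and `read_text_utf8` and the sample entry are made up for illustration. It only shows why a locale-default read can fail and how passing an explicit encoding avoids it.

```python
import tempfile
from pathlib import Path


def read_text_utf8(path):
    """Hypothetical helper: read a file as UTF-8, ignoring the locale default.

    "utf-8-sig" behaves like "utf-8" but also drops a leading byte-order
    mark, which some Windows tools write.
    """
    return Path(path).read_text(encoding="utf-8-sig")


# Made-up BibTeX entry. "\u3001" (ideographic comma) is E3 80 81 in UTF-8,
# so a 0x81 byte ends up directly in front of an ASCII space.
sample = "@article{key,\n  title = {A\u3001 B},\n}\n"

with tempfile.TemporaryDirectory() as tmp:
    bib = Path(tmp) / "sample.bib"
    bib.write_bytes(sample.encode("utf-8"))

    # Decoding the UTF-8 bytes as GBK, as a GBK-locale default would do.
    try:
        bib.read_text(encoding="gbk")
        gbk_failed = False
    except UnicodeDecodeError:
        gbk_failed = True

    # Decoding with an explicit UTF-8 encoding round-trips the text.
    text = read_text_utf8(bib)

print("gbk decode failed:", gbk_failed)
print("utf-8 round trip ok:", text == sample)
```

On Python 3.7 or later, UTF-8 mode (`PYTHONUTF8=1` in the environment, or `python -X utf8`) changes the default encoding used by `open()`. Whether that is enough to work around this particular error is untested here.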

nschloe commented 7 months ago

This is fixed now. Thanks again for the report!