urschrei / pyzotero

Pyzotero: a Python client for the Zotero API
https://pyzotero.readthedocs.org

.txt files in attachment are dumped in an unreadable format #166

Closed beastraban closed 11 months ago

beastraban commented 11 months ago

I have an attachment item that is a .txt file (in Hebrew). Viewing it in Zotero works fine.

However, it fails when I either read the file using: with open(zot.file(key)) as f: text=f.read()

or dump it into a local directory with: zot.dump(key)

In the first case I get the following: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xab in position 10: invalid start byte

When opening the .txt file in notebook, the text is obviously garbled (it LOOKS like UTF-8 text), and when opening the .txt file in Mozilla Firefox and trying to switch the encoding, it can't.

pyzotero version 1.5.16
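
For readers hitting a similar UnicodeDecodeError, here is a minimal sketch of decoding raw bytes with an explicit list of candidate encodings instead of relying on the default. It is not a diagnosis of this issue: the helper name decode_with_fallback and the choice of cp1255 (a common Windows encoding for Hebrew) are assumptions for illustration, and the sample bytes are made up rather than taken from the attachment.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1255")):
    """Return (text, encoding) for the first candidate encoding that decodes raw.

    `raw` is a bytes object, e.g. the content of a downloaded attachment.
    The candidate list is an assumption; adjust it to the encodings you expect.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical sample: Hebrew text saved with a legacy Windows code page.
sample = "שלום".encode("cp1255")
text, used = decode_with_fallback(sample)
print(used, text)
```

If none of the candidates succeed, the bytes may not be plain text at all (for example, they could be compressed), in which case no choice of text encoding will help.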

urschrei commented 11 months ago

Can you put a copy of the file somewhere so I can try to reproduce this?

beastraban commented 11 months ago

https://www.dropbox.com/scl/fi/2xb7wlf0lekekt8i784ug/1.txt?rlkey=0o5stj7n218g9m6ow32k0ymob&dl=0 That's the original file

https://www.dropbox.com/scl/fi/oob4lkgivs5vsi228kq2j/.txt?rlkey=257njnsf3f948wozy1dzk17xv&dl=0 That's the resulting file after zot.dump()

https://www.dropbox.com/scl/fi/2xb7wlf0lekekt8i784ug/1.txt?rlkey=0o5stj7n218g9m6ow32k0ymob&dl=0 That's the zotero item itself (exported)

Thanks :)

urschrei commented 11 months ago

Fixed in v1.5.17, on PyPI.

beastraban commented 11 months ago

Thanks. However, for some reason the "dump" and "file" calls now can't find files. They yield the following output:

>>> zot.dump('SM7TPZLG')

Traceback (most recent call last):

  File ~\anaconda3\lib\site-packages\pyzotero\zotero.py:433 in _retrieve_data
    self.request.raise_for_status()

  File ~\anaconda3\lib\site-packages\requests\models.py:1021 in raise_for_status
    raise HTTPError(http_error_msg, response=self)

HTTPError: 404 Client Error: Not Found for url: https://api.zotero.org/%5Cusers%5C2999351%5Citems%5CSM7TPZLG

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  Cell In[12], line 1
    zot.dump('SM7TPZLG')

  File ~\anaconda3\lib\site-packages\pyzotero\zotero.py:741 in dump
    file = self.file(itemkey)

  File ~\anaconda3\lib\site-packages\pyzotero\zotero.py:237 in wrapped_f
    item = self._retrieve_data(fixed_path)

  File ~\anaconda3\lib\site-packages\pyzotero\zotero.py:435 in _retrieve_data
    error_handler(self, self.request, exc)

  File ~\anaconda3\lib\site-packages\pyzotero\zotero.py:1667 in error_handler
    raise error_codes.get(req.status_code)(err_msg(req)) from exc

ResourceNotFound: 
Code: 404
URL: https://api.zotero.org/%5Cusers%5C2999351%5Citems%5CSM7TPZLG
Method: GET
Response: <h1>Not Found</h1>
<p>The page you requested could not be found.</p>

In the previous version, 1.5.16, it does find them. The 'item' call works fine.

Thanks
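
A note on the URL in the traceback: %5C is the percent-encoded form of a backslash, so the requested path is \users\2999351\items\SM7TPZLG rather than /users/2999351/items/SM7TPZLG. The sketch below shows how joining URL segments with a Windows-style separator produces exactly that encoded path. It only illustrates the symptom; the helper join_segments is hypothetical and this is not a claim about how pyzotero builds its URLs internally.

```python
from urllib.parse import quote


def join_segments(segments, sep):
    """Join path segments with the given separator, with a leading separator."""
    return sep + sep.join(segments)


segments = ["users", "2999351", "items", "SM7TPZLG"]

# A backslash is the path separator on Windows (os.sep == "\\" there).
windows_style = quote(join_segments(segments, "\\"))
# URLs always use forward slashes, whatever the operating system.
url_style = quote(join_segments(segments, "/"))

print(windows_style)  # backslashes are percent-encoded as %5C
print(url_style)      # forward slashes are left untouched by quote()
```

URL paths are safest when built with plain "/" joins (or posixpath.join) rather than anything that depends on the operating system's path separator.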