Szwendacz99 / BookStack-Python-exporter

Customizable script for exporting notes from BookStack through API. Export Pages, Chapters, Books, attachments and images.
MIT License

403 Forbidden #3

Closed brynmoorhouse closed 1 year ago

brynmoorhouse commented 1 year ago

Hi, can you help me with this? I'm unable to make the call via this (awesome looking) tool, but I can successfully make the same request via cURL on the same machine.

INFO :: Getting info about Shelves and their Books
DEBUG :: Making http request: https://<hidden for security>/api/shelves?count=50&offset=0 with headers {'Content-Type': 'application/json; charset=utf-8', 'Authorization': 'Token 6Je<rest of id hidden>:<start of secret hidden>3h'}
Traceback (most recent call last):
  File "/home/bryn/bookstackexporter/export.py", line 294, in <module>
    for shelf_data in api_get_listing('shelves'):
  File "/home/bryn/bookstackexporter/export.py", line 235, in api_get_listing
    api_get_bytes(path, count=count, offset=len(result)))
  File "/home/bryn/bookstackexporter/export.py", line 207, in api_get_bytes
    with urlopen(request) as response:
  File "/usr/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

And this is what I get when I run it via cURL:

curl --location --request GET 'https://<hidden for security>/api/shelves?count=50&offset=0' \
  --header 'Authorization: Token 6Je<rest of id hidden>:<start of secret hidden>3h'

{"data":[{"id":1,"name":"IT Docs","slug":"it-docs","description":"","created_at":"2022-08-26T23:31:26.000000Z","updated_at":"2022-08-27T23:38:58.000000Z","created_by":3,"updated_by":3,"owned_by":3},{"id":4,"name":"Other Companies","slug":"other-companies","description":"","created_at":"2022-08-27T22:45:37.000000Z","updated_at":"2022-08-27T23:39:39.000000Z","created_by":4,"updated_by":4,"owned_by":4},{"id":6,"name":"K8","slug":"k8","description":"","created_at":"2022-08-30T14:29:08.000000Z","updated_at":"2022-09-22T14:26:41.000000Z","created_by":4,"updated_by":4,"owned_by":4},{"id":7,"name":"Software Guides","slug":"software-guides","description":"For Users","created_at":"2022-08-30T14:42:14.000000Z","updated_at":"2022-10-21T15:41:45.000000Z","created_by":4,"updated_by":4,"owned_by":4},{"id":8,"name":"Accounts","slug":"accounts","description":"","created_at":"2022-09-29T16:47:12.000000Z","updated_at":"2023-03-06T23:04:11.000000Z","created_by":4,"updated_by":10,"owned_by":4}],"total":5}

Bit baffled?

Thanks,

Szwendacz99 commented 1 year ago

Currently I cannot reproduce the issue with Python 3.11.2 or with 3.9.16. I see that you probably modified the code to get more info in the debug output; were there any other changes? I might need some more information:

Also double-check the token and domain; typos are good at hiding.

brynmoorhouse commented 1 year ago

Hi, I only modified the code once I started having problems - and literally just to ensure that the headers were coming through properly. I've sussed it out, but only once I'd written all the below, so I'm just going to leave it.

It looked like the request was never reaching the server, based on the lack of any access/error log entry, but that raises the question: where on earth is it getting a 403 from? It then occurred to me that the site is using a Cloudflare tunnel. I logged in and the dashboard is showing a shed load of threats at the times I've tried to use the exporter, with Threat Type "Bad Browser". I stuck an IP allow rule in, and it went through straight away.
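For anyone hitting the same thing, here is a rough sketch of how one might check whether a 403 came from Cloudflare rather than from BookStack itself. It assumes that responses generated at the Cloudflare edge carry a `Server: cloudflare` header and/or a `CF-RAY` header; the helper name `looks_like_cloudflare_block` is made up for illustration and is not part of the exporter.

```python
from urllib.error import HTTPError


def looks_like_cloudflare_block(error: HTTPError) -> bool:
    """Guess whether a 403 was produced by Cloudflare (hypothetical helper).

    Assumption: Cloudflare-generated responses include a
    'Server: cloudflare' header and/or a 'CF-RAY' header.
    """
    if error.code != 403:
        return False
    headers = error.headers
    if headers is None:
        return False
    # Header lookups on the response headers are case-insensitive.
    server = (headers.get("Server") or "").lower()
    return server == "cloudflare" or headers.get("CF-RAY") is not None
```

It could be called from an `except HTTPError as error:` block around the `urlopen(request)` call to print a hint instead of a bare traceback. It is only a heuristic: a proxied 403 from the origin server may carry the same headers.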

So, the issue can be closed, but it might be worth putting a note in the README that this can happen when the site is behind Cloudflare (or finding a way to stop Cloudflare thinking it's a bad browser).

Thanks for the quick response by the way - really appreciate that.

Szwendacz99 commented 1 year ago

Oh, that is valuable information, since Cloudflare is widely used. It seems that it filters HTTP requests by User-Agent and, since this exporter uses the default Python User-Agent header, to Cloudflare it looks just like a typical hacker-made script for evil purposes. I should probably add a parameter to set a custom User-Agent for requests so users can easily avoid such problems (hopefully this will be enough). A notice in the README about that is also a good idea.
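A minimal sketch of what such a parameter could look like, assuming the exporter keeps building its requests with `urllib.request` as the traceback suggests. The function name `build_request`, the `user_agent` parameter, and the default value `BookStack-exporter/0.1` are placeholders for illustration, not the actual implementation. Whether a different User-Agent is enough to get past a given Cloudflare rule is not guaranteed.

```python
from urllib.request import Request


def build_request(url: str, token: str,
                  user_agent: str = "BookStack-exporter/0.1") -> Request:
    """Build an API request with an explicit User-Agent (hypothetical helper).

    Without an explicit header, urllib sends 'Python-urllib/<version>',
    which some filters treat as a bot signature.
    """
    headers = {
        "Content-Type": "application/json; charset=utf-8",
        # token is expected in the '<token id>:<token secret>' form
        "Authorization": f"Token {token}",
        "User-Agent": user_agent,
    }
    return Request(url, headers=headers)
```

The resulting object can be passed to `urlopen(...)` exactly as before; the value could come from a command-line option so users can change it without editing the script.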