balta2ar / manuscript-dl

Collection of scripts to download digitized manuscripts from various online libraries

The BL downloader gets stuck in a redirect loop #6

Open allengarvin opened 2 years ago

allengarvin commented 2 years ago

I believe the loop is caused by a missing session header, most likely a cookie. The downloader is repeatedly redirected to http://www.bl.uk/manuscripts/SetupViewerHandler.ashx?[ms identifier], and that page sets an ASP.NET_SessionId cookie. I suspect the cookie is required, but I didn't investigate further.

I was lazy and bypassed this by copying the request headers from a browser session in the Chrome developer tools and adding them to the script's request, after which it worked fine.
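For anyone who wants to try the same workaround, here is a rough sketch of the idea using only the Python standard library. It is not the script's actual code: `COPIED_HEADERS` and `build_opener_with_headers` are hypothetical names, the header values are placeholders to be replaced with whatever the developer tools show, and I have not confirmed which headers the site actually needs. The cookie jar is there on the assumption that the ASP.NET_SessionId cookie has to survive the redirect.

```python
import http.cookiejar
import urllib.request

# Placeholder values (assumption): paste the real ones copied from the
# browser's developer tools for a working viewer session.
COPIED_HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "Cookie": "ASP.NET_SessionId=PASTE_VALUE_HERE",
}


def build_opener_with_headers(headers):
    """Return (opener, jar): an opener that sends `headers` on every
    request and stores cookies set by responses in `jar`."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replaces the default "Python-urllib" User-agent header list.
    opener.addheaders = list(headers.items())
    return opener, jar


opener, jar = build_opener_with_headers(COPIED_HEADERS)
# Hypothetical usage (not executed here):
# response = opener.open(url_of_the_viewer_or_tile)
```

If the cookie theory is right, requesting the SetupViewerHandler.ashx page once through the same opener, without a hand-copied Cookie header, might be enough for the jar to pick up the session cookie by itself, but that is untested.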

Nice script. I had been pulling them via a shell script and using pnmtile to put them back together.