lovasoa / dezoomify

Dezoomify is a web application to download zoomable images from museum websites, image galleries, and map viewers. Many different zoomable image technologies are supported.
https://dezoomify.ophir.dev
GNU General Public License v2.0

[new site support] https://www.nids.mod.go.jp/military_history_search/ #780

Closed: AmiralCrapaud closed 9 months ago

AmiralCrapaud commented 9 months ago

Replaces the other thread (sorry, hadn't made that one properly)

Site name and description

What is the name and the URL of the site you would like dezoomify to support ?

Library & Historical Records Search System of the Japanese Military Archives https://www.nids.mod.go.jp/military_history_search/

Why is this site of particular interest ? The more popular the site, the more chance you have to see dezoomify supporting it someday.

This is the virtual reading room of the website of the historical archives fund of the Japanese Ministry of Defense. The items of interest are from a copyright-free series of official histories pertaining to Japanese war activities during WW2. These documents are very much sought after by a lot of people in our field, as they were never published in English and are very hard to find in print. It's pretty niche, but I can tell you people will be grateful.

Example URLs

https://www.nids.mod.go.jp/military_history_search/SoshoAppendixView?no=049&f=049_344.jpg
https://www.nids.mod.go.jp/military_history_search/SoshoAppendixView?no=049&f=049_346.jpg
https://www.nids.mod.go.jp/military_history_search/SoshoAppendixView?no=049&f=049_347.jpg

Other examples from another volume of the series:
https://www.nids.mod.go.jp/military_history_search/SoshoAppendixView?no=010&f=010_359.jpg
https://www.nids.mod.go.jp/military_history_search/SoshoAppendixView?no=010&f=010_360.jpg
https://www.nids.mod.go.jp/military_history_search/SoshoAppendixView?no=010&f=010_361.jpg

Current error message

Describe the error dezoomify currently gives you when you try to dezoomify images from this site.

Error: Unable to find a proper dezoomer for: https://www.nids.mod.go.jp/military_history_search/SoshoAppendixView?no=049&f=049_345.jpg

(https://dezoomify.ophir.dev/dezoomers/automatic.js:30)

Doesn't work with -rs either, but it could be human error on my part. Any clue? Is that due to an extra layer of protection, or is it just me being very bad at it (which is absolutely most often the case)?

Cheers and big thanks in advance! Merci beaucoup ^^

Benomrans commented 9 months ago

Hello. This site does not need dezoomify, as it uses full images with OpenSeadragon, not tiled images. Image URLs can be found in the Network section of the browser console and saved from within that section (because a CORS policy is applied); then add .jpg to the name of the downloaded file. For example, the page URL https://www.nids.mod.go.jp/military_history_search/SoshoAppendixView?no=010&f=010_361.jpg has the image URL: https://www.nids.mod.go.jp/military_history_search/GetImage?sa=010/appendix/010_361.jpg

(I'm using Firefox)
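The mapping from page URL to image URL in the example above can be sketched in Python. This is a minimal sketch, assuming the pattern seen in that single example (`GetImage?sa=<no>/appendix/<f>`) holds for other pages of the site; the function name `image_url_from_viewer_url` is made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def image_url_from_viewer_url(viewer_url: str) -> str:
    """Guess the full-image URL for a SoshoAppendixView page URL.

    Assumption: the image lives at GetImage?sa=<no>/appendix/<f>,
    as inferred from the one example given in this thread.
    """
    parts = urlsplit(viewer_url)
    query = parse_qs(parts.query)
    volume = query["no"][0]    # e.g. "010"
    filename = query["f"][0]   # e.g. "010_361.jpg"
    # Replace the last path segment (SoshoAppendixView) with GetImage.
    base_path = parts.path.rsplit("/", 1)[0]
    return urlunsplit((
        parts.scheme,
        parts.netloc,
        f"{base_path}/GetImage",
        f"sa={volume}/appendix/{filename}",
        "",
    ))


if __name__ == "__main__":
    page = ("https://www.nids.mod.go.jp/military_history_search/"
            "SoshoAppendixView?no=010&f=010_361.jpg")
    print(image_url_from_viewer_url(page))
```

If a page does not follow this pattern, the Network section of the browser console remains the reliable way to find the real image URL.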

AmiralCrapaud commented 9 months ago

Thank you kindly for your time, @Benomrans. Unfortunately, on my end, on two machines both using Firefox, I can only get this kind of result: [image]

Another angle, in case I've missed something: [image]

Ultimately, each time I try to save the response, even when I renew the query, only about 1/3 of the image downloads, and I end up with a 1 MB file out of a 3 MB original (to give an order of magnitude). I tried several ways of doing that, but I am probably doing it wrong? If I hover over the source, it clearly only shows what I ultimately get.

Sorry to bother you again, but if you see the answer to that issue on the screen, a big thanks in advance!

Benomrans commented 9 months ago

@AmiralCrapaud Well, in this case we need an image downloader that allows setting a Referer header. The DownThemAll add-on for Firefox can do that, as in this picture:

[image: 10-361]

(each image URL with its page URL as referer).
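For anyone who prefers a script to a browser add-on, the same idea (request the image URL while sending its page URL as the Referer) can be sketched with Python's standard library. This is only a sketch: the helper names `build_request` and `download_image` are made up for illustration, and it assumes the Referer header is the only thing the server checks, which this thread suggests but does not confirm.

```python
import shutil
import urllib.request


def build_request(image_url: str, page_url: str) -> urllib.request.Request:
    """Build a GET request for the image, with the viewer page as Referer."""
    return urllib.request.Request(image_url, headers={"Referer": page_url})


def download_image(image_url: str, page_url: str, dest_path: str) -> None:
    """Stream the image to dest_path.

    Assumption: sending the page URL as Referer is enough for the server
    to return the complete file.
    """
    request = build_request(image_url, page_url)
    with urllib.request.urlopen(request, timeout=60) as response, \
            open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
```

With the example from this thread, the call would be `download_image(".../GetImage?sa=010/appendix/010_361.jpg", ".../SoshoAppendixView?no=010&f=010_361.jpg", "010_361.jpg")`, using the full URLs quoted earlier.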

AmiralCrapaud commented 9 months ago

I was honestly losing hope because my DownThemAll extension just did not manage to do it the first time around. Only realized then that it had refused to update since 2019 and that I had to reinstall it manually to its latest version - and now it works! Big thanks @Benomrans !!