gyng / save-in

WebExtension for saving media, links, or selections into user-defined directories
MIT License

Headers not being sent when downloading (Broken Files for Pixiv.net) #46

Closed: Skyleaf closed this issue 6 years ago

Skyleaf commented 6 years ago

Sometime recently, Save In stopped properly saving images from pixiv.net. I know it happened sometime after December 4th (that's the newest normal file), but I'm not exactly sure when. The next time I saved anything from there was on December 15th. It acts as if the save succeeded, but regardless of the original file's size, the result is a 162-byte file.

This doesn't seem to apply to all images on Pixiv. For instance, the thumbnails for the previous and next images at the top of the page save just fine. It also doesn't seem to be a problem on any other website. I can save the images with Save Image As just fine, so I'm not really sure what is going on. Assuming it's not a problem local to me, you can test it out by trying to save the main image at https://www.pixiv.net/member_illust.php?mode=medium&illust_id=43773211.

My current rename and route settings are:

// Remove :large from Twitter images
filename: (.*)(:|_)large
sourceurl: pbs.twimg.com
capture: filename
into: ^Current Sort/[:pagedomain:] :$1:

// Matches images with no dot extension in filename
filename: ^[^.]+[^.]{0,5}$
mediatype: image
into: ^Current Sort/[:pagedomain:] :filename:.jpg

// Add domain to filename
sourceurl: .*
into: ^Current Sort/[:pagedomain:] :filename:

gyng commented 6 years ago

It seems to be fine on the first save (even for the large images), but gets a 403 Forbidden on subsequent ones.

I don't know why, but it isn't using the cached image on subsequent saves. When it fires off a new request, the headers are all missing (Pixiv seems to require Referer), so the download receives a 403 Forbidden.

The Referer header cannot be set through downloads.download because it is a restricted header.
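
For context, here is a rough sketch of why the header gets rejected. As far as I understand, the `headers` option of `downloads.download` is limited to headers that XMLHttpRequest allows, and `Referer` is on the Fetch standard's forbidden request-header list. The helpers `isForbiddenHeaderName` and `partitionHeaders` below are hypothetical names for illustration only. They are not part of Save In or the WebExtensions API, and the name list is my reading of the standard rather than something the browser exposes.

```javascript
// Hypothetical helpers for illustration; not part of Save In or any browser API.
// Lowercased names based on the Fetch standard's forbidden request-header list.
const FORBIDDEN_HEADER_NAMES = new Set([
  "accept-charset", "accept-encoding", "access-control-request-headers",
  "access-control-request-method", "connection", "content-length",
  "cookie", "cookie2", "date", "dnt", "expect", "host", "keep-alive",
  "origin", "referer", "te", "trailer", "transfer-encoding", "upgrade", "via",
]);

// Returns true when script code is not allowed to set this request header.
function isForbiddenHeaderName(name) {
  const lower = name.toLowerCase();
  return (
    FORBIDDEN_HEADER_NAMES.has(lower) ||
    lower.startsWith("proxy-") ||
    lower.startsWith("sec-")
  );
}

// Splits requested headers into those a download request could carry
// and those that would be refused.
function partitionHeaders(headers) {
  const allowed = [];
  const rejected = [];
  for (const header of headers) {
    (isForbiddenHeaderName(header.name) ? rejected : allowed).push(header);
  }
  return { allowed, rejected };
}
```

With this check, a `Referer` entry always ends up in `rejected`, so it could never be sent along with the download request.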

There's a solution at https://stackoverflow.com/questions/20579112/send-referrer-header-with-chrome-downloads-api but it requires adding a content script. I will look into this when I eventually get the time to add a content script.
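
A minimal sketch of the content-script idea, assuming the approach is to fetch the image from inside the page and hand the result over for saving. This is not Save In's actual code and not necessarily what the linked answer does. `fetchAsObjectUrl` and its injectable `fetchImpl` and `createObjectURL` parameters are hypothetical names. A request made from the page context is expected to carry the page's Referer and cookies, which is what Pixiv appears to check.

```javascript
// Hypothetical content-script helper; an assumption-laden sketch, not Save In's code.
// fetchImpl and createObjectURL are parameters only so the function can be
// exercised outside a browser; in a content script the defaults would be used.
async function fetchAsObjectUrl(
  url,
  fetchImpl = fetch,
  createObjectURL = (blob) => URL.createObjectURL(blob)
) {
  // Include cookies so the request resembles the page's own image request.
  const response = await fetchImpl(url, { credentials: "include" });

  // Surface failures such as 403 instead of saving the error body as a file.
  if (!response.ok) {
    throw new Error(`Fetch failed with HTTP ${response.status}`);
  }

  // Turn the downloaded bytes into a URL that a later save step can consume.
  const blob = await response.blob();
  return createObjectURL(blob);
}
```

How the object URL is then saved (for example by passing it on to the background script) is left out, since that part is not described in this thread.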

Unfortunately, there is no easy way around this with what the WebExtensions API provides. A workaround right now is to refresh the page before saving, but it's definitely not ideal.

Skyleaf commented 6 years ago

Sorry for the extremely late reply. Pneumonia kinda took me out for a bit. Anyways! That's quite unfortunate that it's not an easy fix, especially since I use Pixiv so much.

I did try your workaround just now and it doesn't seem to work for me. I tried it in a variety of ways with no luck. Regular refresh, refresh (no cache), turned off saving based on routing rules exclusively, even tried to save an image on first load... and I did it on a different image for each test, with no luck. Consistently getting those empty files. Not sure if maybe I have different browser settings than you or not. I can download the images through other means, but they aren't as fast or easy as this extension. Regardless, I'll just have to be patient it seems. I'm just glad there is some kind of fix, even if it'll take time to implement.

Currently on Firefox 58 (64-bit), Windows 10 Home, and used these images for my testing:

https://www.pixiv.net/member_illust.php?mode=medium&illust_id=43256548
https://www.pixiv.net/member_illust.php?mode=medium&illust_id=43143531
https://www.pixiv.net/member_illust.php?mode=medium&illust_id=42261008
https://www.pixiv.net/member_illust.php?mode=medium&illust_id=41303946

gyng commented 6 years ago

There's an "Enable fetching via content script" option in 2.4.0 that seems to work with Pixiv. Mind trying it out and checking whether it works?

Skyleaf commented 6 years ago

Tried it out on about half a dozen images and it seems to be working great so far. Thanks!