Kagami / tistore

:camera: Tistory photo grabber

Recent site changes cause program to not work #6

Closed Silenciuse closed 5 years ago

Silenciuse commented 5 years ago

Hey, they changed their URLs again, this time to something completely different.

Here are some examples. Same deal as last time, where "?original" can be added to the end: https://k.kakaocdn.net/dn/bcF4AQ/btqtUkQPhad/g6Sl50hXCukzakdKrm3K4k/img.jpg https://boxgame.tistory.com/2475

Apparently they can have download buttons now too, although this is the only blog I've seen with it enabled: https://farbutnear0305.tistory.com/64

Also every file now downloads with the same name, which is really annoying.

Thanks for creating and maintaining this program.

Kagami commented 5 years ago

Thanks, I will look into it.

Kagami commented 5 years ago

Thanks for the second link, it shows how to make it return a Content-Disposition header:

curl -I 'https://k.kakaocdn.net/dn/uhnRH/btqtUjkd9re/P18KTYzrsiiS7qimPPui2K/img.jpg?attach=1&nm=HappyIreneDay201.jpg'

Funny that it will return anything you pass in the nm parameter. Should work fine with the current architecture of tistore, though.

Also seems like old blogs haven't migrated to the new link scheme, so we need to support both options.
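Supporting both schemes mostly comes down to telling them apart by host. Here is a minimal Python sketch of that idea; it is not tistore's actual code. The function name `link_scheme` is hypothetical, and the two hostnames are taken only from the example URLs in this thread, so other CDN hosts may exist.

```python
from urllib.parse import urlsplit

def link_scheme(url):
    """Classify an image URL as 'old', 'new', or 'unknown' by its host.

    Assumption: hosts are inferred from the examples in this thread
    (daumcdn.net for old links, kakaocdn.net for new ones).
    """
    host = urlsplit(url).hostname or ""
    if host == "kakaocdn.net" or host.endswith(".kakaocdn.net"):
        return "new"
    if host == "daumcdn.net" or host.endswith(".daumcdn.net"):
        return "old"
    return "unknown"
```

A caller could then branch on the result and apply the matching download logic for each scheme.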

Kagami commented 5 years ago

Seems like the kakaocdn servers violate the RFC and return Content-Disposition in the wrong format, so aria2c doesn't accept it.

Compare:

$ curl -sI 'https://t1.daumcdn.net/cfile/tistory/99F063505B1ABCFE22?original' | grep -i disposition
content-disposition: inline; filename="DSC_8301.jpg"; filename*=UTF-8''DSC_8301.jpg
$ curl -sI 'https://k.kakaocdn.net/dn/uhnRH/btqtUjkd9re/P18KTYzrsiiS7qimPPui2K/img.jpg?attach=1&nm=HappyIreneDay2019.jpg' | grep -i disposition
Content-Disposition: attachment; filename="HappyIreneDay2019.jpg"; filename*="UTF-8''HappyIreneDay2019.jpg"

Note the extra quotes around the UTF-8 filename* value in the second case.
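The difference between the two headers can be checked mechanically. The sketch below only tests whether the filename* value starts with a double quote; it is not a full Content-Disposition parser, and the function name `has_quoted_ext_filename` is made up for illustration.

```python
import re

def has_quoted_ext_filename(header_value):
    """Return True if the filename* parameter's value starts with a double quote.

    The extended form is expected to look like UTF-8''name.jpg with no
    surrounding quotes, so a leading quote marks the malformed variant.
    """
    match = re.search(r'filename\*\s*=\s*(.)', header_value, re.IGNORECASE)
    return bool(match) and match.group(1) == '"'
```

Applied to the two headers above, it should flag only the kakaocdn response.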

I could make a patched version of the aria2c binary, but that feels very hackish. Or maybe I should write to the Kakao devs and ask them to fix the header…

Kagami commented 5 years ago

Created the issue here: https://devtalk.kakao.com/t/wront-format-of-content-disposition-header-on-kakaocdn-net/74282

Kagami commented 5 years ago

Wow, I can't believe it, but they actually gave a pretty useful workaround, and that quickly. Instead of

https://k.kakaocdn.net/dn/uhnRH/btqtUjkd9re/P18KTYzrsiiS7qimPPui2K/img.jpg?attach=1&nm=HappyIreneDay2019.jpg

you can use

https://k.kakaocdn.net/dn/uhnRH/btqtUjkd9re/P18KTYzrsiiS7qimPPui2K/HappyIreneDay2019.jpg?knm=img.jpg

which works perfectly fine with aria2c.

I suppose it's not documented anywhere, but it should work perfectly fine for our needs. We just need to rewrite the URL a bit. Hope it won't have issues with escaping/Unicode.
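The rewrite itself could look roughly like the Python sketch below. It is an illustration of the URL transformation, not the code that ships in tistore. The function name `rewrite_kakaocdn_url` is hypothetical, and percent-encoding the filename with `quote` is an assumption: whether the CDN handles escaped or Unicode names correctly is exactly the open question here.

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit

def rewrite_kakaocdn_url(url, filename):
    """Move the desired filename into the path and the original name into knm.

    .../img.jpg?attach=1&nm=NAME  ->  .../NAME?knm=img.jpg
    """
    parts = urlsplit(url)
    # Split the path into its directory part and the last segment (e.g. img.jpg).
    directory, _, original_name = parts.path.rpartition("/")
    # Assumption: percent-encode the new last segment; server behavior is untested.
    new_path = directory + "/" + quote(filename, safe="")
    query = urlencode({"knm": original_name})
    # Any existing query string (attach, nm) and fragment are dropped.
    return urlunsplit((parts.scheme, parts.netloc, new_path, query, ""))
```

For the example above, calling it with the `img.jpg?attach=1&nm=...` URL and `HappyIreneDay2019.jpg` yields the `HappyIreneDay2019.jpg?knm=img.jpg` form.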

Kagami commented 5 years ago

Please try the 0.5.2 release. It works with your examples, but escaping/Unicode handling might not be 100% correct, so I need some feedback.