mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[Request] skeb.jp #1031

Closed indrakaw closed 1 year ago

biggestsonicfan commented 2 years ago

I realize skeb.jp is supported for the most part, but last night I ran into a user who has a portfolio but no works, and I could not download from the portfolio. They have since (between last night and now) added one work. https://skeb.jp/@hataraku73

EDIT: Also for your consideration Google seems to crawl non-watermarked images somehow...

nisehime commented 2 years ago

EDIT: Also for your consideration Google seems to crawl non-watermarked images somehow...

That's interesting. @mikf, you can compare these two URLs: From Google: link From the Skeb page: link

Basically, the one from Google has different HTTP parameters, but if you try to modify them yourself you get a 403 error, so it doesn't seem to help much. Whether a request is accepted seems to be tied to the s= parameter (MD5?), which is somehow generated for that exact set of parameters. I don't know, but what if the algorithm for it is in the page script lol?

Anyway, the sample-less link from Google is actually in the page's source code, so Google has no special privileges. I guess it might be considered as a download option; however, there are 2 issues:

  1. The sample-less image has smaller resolution.
  2. The sample-less image does not always exist in the page's source.
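
For anyone who wants to repeat the comparison, here is a minimal sketch that lists which query parameters differ between two image URLs. The two URLs, the host `img.example.com`, and the parameter names other than `s` and `q` are made up for illustration; the real links are not reproduced in this thread.

```python
from urllib.parse import urlsplit, parse_qs


def diff_query_params(url_a: str, url_b: str) -> dict:
    """Return {name: (values_in_a, values_in_b)} for parameters that differ."""
    params_a = parse_qs(urlsplit(url_a).query, keep_blank_values=True)
    params_b = parse_qs(urlsplit(url_b).query, keep_blank_values=True)
    diff = {}
    for name in sorted(params_a.keys() | params_b.keys()):
        a, b = params_a.get(name), params_b.get(name)
        if a != b:
            diff[name] = (a, b)
    return diff


# Hypothetical example URLs (not the real Skeb links).
google_url = (
    "https://img.example.com/uploads/abc.jpg"
    "?q=45&w=800&s=0123456789abcdef0123456789abcdef"
)
page_url = (
    "https://img.example.com/uploads/abc.jpg"
    "?q=90&w=1600&txt=SAMPLE&s=fedcba9876543210fedcba9876543210"
)

for name, (a, b) in diff_query_params(google_url, page_url).items():
    print(f"{name}: {a} -> {b}")
```

A parameter that is missing on one side shows up as `None`, which makes it easy to spot a watermark-style parameter that only one of the URLs carries.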

mikf commented 2 years ago

seems to be tied to the s= parameter (md5?)

The hex-digest length fits, but they are most likely applying some hard-to-guess transformation to the URL or query string before MD5-ing it.

  1. The sample-less image has smaller resolution.
  2. The sample-less image does not always exist in the page's source.

and 3. the sample-less images got generated with a terrible JPEG compression value: &q=45
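
To illustrate why changing a parameter by hand leads to a 403, here is a rough sketch of one common way signed image URLs can work: the server hashes a secret together with the path and query string and compares the result to `s=`. This is an assumed scheme for illustration only, not Skeb's confirmed algorithm; `SECRET`, `sign`, `build_url`, and `verify` are hypothetical names.

```python
import hashlib
import hmac
from urllib.parse import urlsplit, parse_qsl, urlencode

# Assumption: a secret known only to the server. This is the
# "hard-to-guess transformation" in this sketch.
SECRET = "hypothetical-server-secret"


def sign(path: str, params: dict) -> str:
    """Assumed scheme: MD5 over secret + path + '?' + sorted query (without s)."""
    query = urlencode(sorted(params.items()))
    return hashlib.md5(f"{SECRET}{path}?{query}".encode()).hexdigest()


def build_url(path: str, params: dict) -> str:
    """Append the signature as the s= parameter."""
    signed = dict(params, s=sign(path, params))
    return f"{path}?{urlencode(signed)}"


def verify(url: str) -> int:
    """Return the HTTP status a server using this scheme might send."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    given = params.pop("s", "")
    expected = sign(parts.path, params)
    return 200 if hmac.compare_digest(given, expected) else 403


url = build_url("/uploads/abc.jpg", {"q": "45", "w": "800"})
tampered = url.replace("q=45", "q=90")  # change quality, keep the old s=

print(verify(url))       # untouched URL
print(verify(tampered))  # signature no longer matches the parameters
print(len(sign("/uploads/abc.jpg", {"q": "45"})))  # MD5 hex digest length
```

Under this assumption, a client cannot produce a valid `s=` for new parameters without the secret, so only URLs the site itself generated (such as the one already present in the page source) would be accepted.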

nisehime commented 2 years ago

  3. the sample-less images got generated with a terrible JPEG compression value: &q=45

I wouldn't really call it terrible tbh. Some people still might prefer it over the text covering half of the image.

NeroTheYoung commented 1 year ago

EDIT: Also for your consideration Google seems to crawl non-watermarked images somehow...

That's interesting. @mikf, you can compare these two URLs: From Google: link From the Skeb page: link

Basically, the one from Google has different HTTP parameters, but if you try to modify them yourself you get a 403 error, so it doesn't seem to help much. Whether a request is accepted seems to be tied to the s= parameter (MD5?), which is somehow generated for that exact set of parameters. I don't know, but what if the algorithm for it is in the page script lol?

Anyway, the sample-less link from Google is actually in the page's source code, so Google has no special privileges. I guess it might be considered as a download option; however, there are 2 issues:

  1. The sample-less image has smaller resolution.
  2. The sample-less image does not always exist in the page's source.

How did you get the sample-less URL from Google?

biggestsonicfan commented 1 year ago

@NeroTheYoung I did a reverse image search on Google, but the Google URL doesn't appear to be valid anymore.

biggestsonicfan commented 1 year ago

Just an FYI: https://twitter.com/skeb_jp/status/1606106596764852225

🚨Attention🚨 Using extensions that scrape or increase access to Skeb is a violation of our terms and policies.

rautamiekka commented 1 year ago

Just an FYI: https://twitter.com/skeb_jp/status/1606106596764852225

🚨Attention🚨 Using extensions that scrape or increase access to Skeb is a violation of our terms and policies.

What else is new? I mean, you have to expect that scrapers/downloaders aren't allowed, so this is neither unexpected nor new.

biggestsonicfan commented 1 year ago

I understand, but this is the first official statement I've seen of it.