mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[Site Request] Itaku.ee #1842

Closed MarqFJA87 closed 2 years ago

MarqFJA87 commented 3 years ago

Example URL (NSFW): https://itaku.ee/profile/tail-blazer/gallery

barbedknot commented 2 years ago

https://itaku.ee/profile/sqoon https://itaku.ee/profile/purpsi

I've been really bad about opening itaku profiles, because I know gallery-dl can't rip them yet. But here are two I would've ripped if itaku was supported.

Itaku support never. I remember waiting for pillowfort the same way. Feels bad, man.

barbedknot commented 2 years ago

I don't know when, but I just noticed itaku support got added to postybirb.

https://www.postybirb.com/

PostyBirb lets artists streamline posting their art to multiple gallery sites. This is the only "bump" I can make on the topic that can add new information, unless it's praising/criticizing itaku as a platform. But I don't know much about it anyway, cause with gallery-dl not supporting it, I can't rip from it. So that's tuff.

But if nothing else, even though they make viewing nsfw without an account as annoying as possible (doesn't even show up as a thumbnail until you click "Maturity" to open a dropdown, you have to check "Questionable", "NSFW", and "Extreme" individually to have all thumbnails show up, the "thumbnails" you see after doing so don't even preview the image, the actual screen real estate used for browsing galleries is only a third of your screen's width and just over half its height, and your "thumbnail" settings are lost on page refresh)... at least you can actually view nsfw without an account. The same can't be said for some other sites. And itaku appears to host images in their original quality. The "Show Uncompressed" button seems to convey that pretty confidently. I've seen a 7mb image in Angstrom's gallery, for example.

From reading their rules, I can semi-confidently say they even host loli/shota/cub. Their rules page professes conformity to "Finnish law". They specifically isolated "realistic human characters" when clarifying the art they prohibit, and they even added in bold: "REMINDER: Real life pornography or suggestive content in any form is prohibited!". This is far removed from platforms like discord conflating drawings to "children" in their guidelines, explicitly disallowing hosting it, even explicitly disallowing linking it, and going so far to use the word "cub" when speaking of art they disallow. Considering the contrast, I think it's clear itaku hosts it, but I don't have anything to cite to support that, since I haven't even ripped from the site any. I can't immediately find how to search either, since maybe that feature is locked behind being logged-in. Not that a drawing conceivably falling through the cracks is a confident source. But, the point is, this trait in art gallery sites is getting rarer to find in current year. Especially being able to view it without an account. So.

Basically, itaku support never. Feels bad, man.

God-damnit-all commented 2 years ago

It seems that a lot of artists are migrating to itaku.ee after pillowfort.social's development disappointed them. Annoyingly, the site uses lazy-loading and doesn't have an API. Figuring out how to scrape from it isn't going to be easy.

0fbcb238c0 commented 2 years ago

Itaku uses Django REST framework, maybe that can be scraped? I guess you would just need to iterate through that? I'll try to do it myself but don't really have experience programming.

Example URL: https://itaku.ee/api/galleries/images/?cursor=cj0xJnA9MjAyMS0wOC0yNSswNyUzQTAxJTNBNDYuMTg4NjY5JTJCMDAlM0EwMA%3D%3D&date_range=&ordering=-date_added&owner=1213&page=2&page_size=300&visibility=PUBLIC&visibility=PROFILE_ONLY

The listed sizes (sm, lg, xl) are not for the original file, the URL end must be replaced by FILENAME.EXTENSION, without sm, lg or xl. URL/FILENAME/xl.jpg -> URL/FILENAME.EXTENSION
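The rewrite described above could be sketched roughly as follows. This is only an illustration: `original_url` is a made-up helper name, and the extension has to be supplied by the caller because the sized URL does not reveal it.

```python
from urllib.parse import urlsplit, urlunsplit

# Size names mentioned in the comment above.
SIZE_NAMES = {"sm", "lg", "xl"}


def original_url(sized_url, extension):
    """Turn URL/FILENAME/xl.jpg into URL/FILENAME.EXTENSION.

    Hypothetical helper; the real extension is assumed to be known
    from elsewhere (for example from the image's metadata).
    """
    parts = urlsplit(sized_url)
    base, _, last = parts.path.rpartition("/")
    size_name = last.rsplit(".", 1)[0]
    if size_name not in SIZE_NAMES:
        # Not a sized URL, leave it untouched.
        return sized_url
    new_path = f"{base}.{extension.lstrip('.')}"
    return urlunsplit(parts._replace(path=new_path))
```

URLs whose last path segment is not one of the three size names are returned unchanged, so the helper is safe to call on a URL that has already been rewritten.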

There's also https://itaku.ee/api/. To access the data for an image/video post on itaku, one simply has to add "/api/galleries/" to the link, like so: https://itaku.ee/api/galleries/images/postnumbers/

From there the files, thumbnails etc. can be downloaded.

To get posts by a certain profile: https://itaku.ee/api/galleries/images/?owner=4997

Since not all posts are shown at once and the results are split into pages, you have to iterate through them. Note that itaku differentiates between SFW and NSFW profiles, so a single user can have multiple IDs.
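A minimal sketch of that iteration, assuming the responses follow Django REST framework's stock pagination layout (a `results` list plus a `next` URL that is null on the last page). Nothing in this thread confirms those key names for itaku, and `iter_results` and `fetch_json` are names made up for this example.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download a URL and decode its JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iter_results(first_url, fetch=fetch_json):
    """Yield every entry from a paginated listing.

    Assumes each page is a JSON object with "results" (list of entries)
    and "next" (URL of the following page, or null when done).
    """
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("results", [])
        url = page.get("next")
```

It would be called with a listing URL such as `https://itaku.ee/api/galleries/images/?owner=4997`. Passing the fetch function as a parameter keeps the loop easy to try out against canned pages without touching the network.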

mikf commented 2 years ago

Basic support is done (https://github.com/mikf/gallery-dl/commit/fa902cd54d76dc5a25a695729d71efa5b3721cfa).

It only supports gallery URLs (https://itaku.ee/profile/piku/gallery) and image URLs (https://itaku.ee/images/100471) at the moment. Also, metadata is a mess, especially tags.

Let me know what else you need, and please post a video example so that that can be supported as well. I haven't been able to find any by going through several artist galleries.

> The listed sizes (sm, lg, xl) are not for the original file, the URL end must be replaced by FILENAME.EXTENSION, without sm, lg or xl. URL/FILENAME/xl.jpg -> URL/FILENAME.EXTENSION

I've noticed. The code does an extra request to /api/galleries/images/ID to compensate for that. Its result also has a lot more metadata entries than what /api/galleries/images/?owner=4997 provides.
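The extra request can be pictured like this. It is a rough sketch and not the code from the commit: `with_full_metadata` is a made-up name, `fetch` stands for any function that returns decoded JSON for a URL, and the assumption that listing entries carry their ID under an `id` key is mine.

```python
API_IMAGES = "https://itaku.ee/api/galleries/images"


def with_full_metadata(entries, fetch):
    """Merge each listing entry with the result of its detail request.

    `entries` are items from the owner listing; `fetch` is a callable
    that takes a URL and returns the decoded JSON object.
    Assumes every entry has an "id" key (not confirmed in this thread).
    """
    for entry in entries:
        detail = fetch(f"{API_IMAGES}/{entry['id']}")
        # Detail values win when both objects share a key.
        yield {**entry, **detail}
```

The cost is one additional request per image, in exchange for the richer per-image metadata.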

barbedknot commented 2 years ago

https://itaku.ee/images/19465

video example

MarqFJA87 commented 2 years ago

I assume that "basic support" means that it's limited to downloading the files in the most simple manner, without respect for any folder organization in the gallery?

MarqFJA87 commented 2 years ago

Okay, so I tried to batch-download multiple galleries via putting the appropriate URLs in separate lines within a text file, and gallery-dl just downloaded the first one in the list before acting as if it's done.

mikf commented 2 years ago

> I assume that "basic support" means that it's limited to downloading the files in the most simple manner, without respect for any folder organization in the gallery?

You can put each image into a separate section folder with the right directory settings, but that only works for 1 section. Images that are present on multiple sections will only be put in the first one.

    "directory": ["{category}", "{owner_username}", "{sections[0]:?//}"]
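To illustrate the effect of that setting (not how gallery-dl's formatter works internally), here is a small Python sketch. `directory_parts` is a made-up name; the keys `category`, `owner_username` and `sections` are the ones used in the format strings above.

```python
def directory_parts(meta):
    """Return the folder path segments for one image.

    Mimics the described behaviour: only the first section is used,
    and images without any section get no section folder at all.
    """
    parts = [meta["category"], meta["owner_username"]]
    sections = meta.get("sections") or []
    if sections:
        parts.append(sections[0])
    return parts
```

An image listed under both "Sketches" and "WIP" ends up in `itaku/<owner>/Sketches` only, which is the limitation mentioned above.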

> Okay, so I tried to batch-download multiple galleries via putting the appropriate URLs in separate lines within a text file, and gallery-dl just downloaded the first one in the list before acting as if it's done.

What exactly did you try to do? Using --input-file on a file with several URLs, one per line, works as expected when I test it myself:

    $ cat galleries
    https://itaku.ee/profile/or-fi-s/gallery
    https://itaku.ee/profile/xxtamxx/gallery
    https://itaku.ee/profile/modux/gallery

    $ gallery-dl -i galleries --range 1
    [1/3] https://itaku.ee/profile/or-fi-s/gallery
    /tmp/itaku/or-fi-s/127066 Sippy.png
    [2/3] https://itaku.ee/profile/xxtamxx/gallery
    /tmp/itaku/or-fi-s/127066 Sippy.png
    [3/3] https://itaku.ee/profile/modux/gallery
    /tmp/itaku/or-fi-s/127066 Sippy.png

MarqFJA87 commented 2 years ago

I have a gallery-dl_list.txt file that I use for batch-downloading of multiple specific images/galleries, by putting the URLs in one per line arrangement, and it works perfectly with Twitter, DeviantArt, and other sites. Itaku deviated from this when I tried that, fetching only the contents of the first URL.

Okay, did a test with two, much smaller galleries, and discovered the weird thing that's happening: the second URL is being detected, but for some unfathomable reason, the actual images being downloaded are those of the first URL! (Naturally, the app blitzes through them all, as they're already downloaded and thus the files are present.)

Hrxn commented 2 years ago

> I have a gallery-dl_list.txt file that I use for batch-downloading of multiple specific images/galleries, by putting the URLs in one per line arrangement, and it works perfectly with Twitter, DeviantArt, and other sites. Itaku deviated from this when I tried that, fetching only the contents of the first URL.

How's that supposed to work? Do you think the code to read the URLs from a text file is somehow suddenly different?

mikf commented 2 years ago

> Okay, did a test with two, much smaller galleries, and discovered the weird thing that's happening: the second URL is being detected, but for some unfathomable reason, the actual images being downloaded are those of the first URL! (Naturally, the app blitzes through them all, as they're already downloaded and thus the files are present.)

Ahh, that explains it. I added some caching and forgot to differentiate between different users, my bad. Fixed in https://github.com/mikf/gallery-dl/commit/36ead4554649e3dee27f69764905eb1f50fc606a.
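For anyone curious what this class of bug looks like, here is a generic Python sketch. It is not the code from either commit, and the thread does not say exactly what was cached; a username-to-ID lookup is assumed here, and all names are made up for illustration.

```python
# Buggy variant: one cached value shared by every user.
_cached_id = None


def user_id_buggy(username, lookup):
    """Cache the first lookup result and reuse it for all usernames."""
    global _cached_id
    if _cached_id is None:
        _cached_id = lookup(username)
    return _cached_id


# Fixed variant: the cache is keyed by username.
_ids_by_user = {}


def user_id_fixed(username, lookup):
    """Cache one lookup result per username."""
    if username not in _ids_by_user:
        _ids_by_user[username] = lookup(username)
    return _ids_by_user[username]
```

With the buggy variant, the second and third gallery URL resolve to the first user's ID, which matches the symptom of the first gallery's files being "downloaded" again for every later URL.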

This bug is even visible in my example in https://github.com/mikf/gallery-dl/issues/1842#issuecomment-1172457990 (Sippy.png x3), but I somehow didn't notice.