I want to start archiving a few fantia accounts I'm subscribed to, but I couldn't figure out how to reproduce the folder structure I've been using manually, or how to scrape all the text I'm interested in. On fantia, creators can paywall specific sections of a post for different tiers. I like to keep each of these sections in its own folder; however, when I checked the keywords available for a post, I couldn't find anything that would let me split the files up by section.
I picked a random post from the front page as an example; it should illustrate what I'm talking about.
Post: https://fantia.jp/posts/1964477 (word of warning: this link is NSFW. It's downright impossible to find SFW links on fantia.)
> gallery-dl -K "https://fantia.jp/posts/1964477"
Keywords for directory names:
-----------------------------
category
fantia
comment
🐮見放題プランの方向けに3年前の投稿を公開していきます🐮
いつも支援して頂きありがとうございます!
For Unlimited plan subscribers, posts from three years ago are now available! Thank you for your support!
⇩🐧下スクロールでエロ差分🐧⇩
date
2023-05-14 09:00:00
fanclub_id
2931
fanclub_name
🐧軒下の猫屋🐧
fanclub_url
https://fantia.jp/fanclubs/2931
fanclub_user_id
86032
fanclub_user_name
アルデヒド
post_id
1964477
post_title
【見放題プラン】グラブルなまあし部 エウロペさん
post_url
https://fantia.jp/posts/1964477
posted_at
Sun, 14 May 2023 18:00:00 +0900
rating
adult
subcategory
post
tags[N]['name']
グラブル
tags[N]['uri']
/fanclubs/2931/posts?tag=%E3%82%B0%E3%83%A9%E3%83%96%E3%83%AB
Keywords for filenames and --filter:
------------------------------------
category
fantia
comment
🐮見放題プランの方向けに3年前の投稿を公開していきます🐮
いつも支援して頂きありがとうございます!
For Unlimited plan subscribers, posts from three years ago are now available! Thank you for your support!
⇩🐧下スクロールでエロ差分🐧⇩
content_category
thumb
content_filename
date
2023-05-14 09:00:00
extension
png
fanclub_id
2931
fanclub_name
🐧軒下の猫屋🐧
fanclub_url
https://fantia.jp/fanclubs/2931
fanclub_user_id
86032
fanclub_user_name
アルデヒド
file_id
thumb
file_url
https://c.fantia.jp/uploads/post/file/1964477/48a60e5f-1487-4063-b952-1e0e87b49c3d.png
filename
48a60e5f-1487-4063-b952-1e0e87b49c3d
num
1
post_id
1964477
post_title
【見放題プラン】グラブルなまあし部 エウロペさん
post_url
https://fantia.jp/posts/1964477
posted_at
Sun, 14 May 2023 18:00:00 +0900
rating
adult
subcategory
post
tags[N]['name']
グラブル
tags[N]['uri']
/fanclubs/2931/posts?tag=%E3%82%B0%E3%83%A9%E3%83%96%E3%83%AB
And here's an image of what the post sections look like.
Each section can have a title as well as an additional comment, should one be written. If all of this is already implemented, could someone help me build the extractor config and postprocessor settings needed to download each tier's contents into a dedicated folder and write a text file with any relevant info into that folder?
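To make the request concrete, here is a rough Python sketch of the layout I'm after. It is not gallery-dl code: `plan_layout`, `sanitize`, and the `section_title`, `section_comment`, and `filenames` fields are hypothetical names I made up for illustration, since I couldn't find matching keywords in the output above. Only `fanclub_name`, `post_id`, and `post_title` come from the real keyword list.

```python
from pathlib import PurePosixPath


def sanitize(name: str) -> str:
    """Replace characters that are commonly invalid in folder names."""
    return "".join("_" if c in '\\/:*?"<>|' else c for c in name).strip()


def plan_layout(post: dict, sections: list[dict]) -> dict[str, dict]:
    """Map each post section to its own folder, file paths, and info text.

    `sections` is a hypothetical structure: one dict per paywalled section,
    with an optional title, an optional comment, and a list of filenames.
    """
    root = (
        PurePosixPath(sanitize(post["fanclub_name"]))
        / f'{post["post_id"]} {sanitize(post["post_title"])}'
    )
    layout = {}
    for index, section in enumerate(sections, start=1):
        # Fall back to a placeholder when the creator left the title empty.
        title = sanitize(section.get("section_title") or "untitled")
        folder = root / f"{index:02d} {title}"
        layout[str(folder)] = {
            "files": [str(folder / name) for name in section["filenames"]],
            # Text that would be written to a text file in this folder.
            "info": section.get("section_comment", ""),
        }
    return layout
```

In other words: one folder per section, numbered in post order, holding that section's files plus a text file with the section comment.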