KJHJason / Cultured-Downloader

A project to automate the process of downloading images and other attachment files from platforms like Fantia and more!
GNU General Public License v3.0

Feature Request: More Filter Support #228

Status: Closed (FNSOIDATHQ closed this issue 4 months ago)

FNSOIDATHQ commented 6 months ago

Which program are you suggesting this feature for?

Cultured Downloader (Go)

Feature Description

At present, it seems that only the page number filter is available. Would it be possible to provide more complex filters, or to let users define their own filtering rules with regular expressions (or something similar)?

Screenshot (Optional)

No response

KJHJason commented 6 months ago

Do you have an example?

FNSOIDATHQ commented 6 months ago

For filters: maybe support filtering by date, file size, and file type. Users may wish to fine-tune what is downloaded based on a post's published date, or filter a specific artist's work by file extension or size.

For regular expressions: my idea is to let users access links, web pages, or metadata, and then filter for the files that meet the requirements using regular expressions. But I'm completely new to automatic downloaders, so I don't know if this approach is actually feasible.

KJHJason commented 6 months ago

> For filters: maybe support filtering by date, file size, and file type. Users may wish to fine-tune what is downloaded based on a post's published date, or filter a specific artist's work by file extension or size.

Hmm, filtering by date and file type is a good idea and should be feasible for all platforms.
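As a rough illustration of what a combined date and file type filter could look like, here is a minimal Go sketch. It is not taken from Cultured Downloader: the `Attachment` and `Filter` types, the `Allows` method, and the `date` helper are all hypothetical names, and it assumes each file already carries the publish date of its post.

```go
package main

import (
	"path/filepath"
	"strings"
	"time"
)

// Attachment is a hypothetical stand-in for one downloadable file in a post.
type Attachment struct {
	Filename  string
	Published time.Time // publish date of the post that owns the file
}

// Filter holds hypothetical user-supplied criteria.
// A zero time or an empty extension set means "no restriction".
type Filter struct {
	After      time.Time       // keep files published at or after this time
	Before     time.Time       // keep files published at or before this time
	Extensions map[string]bool // allowed lowercase extensions, e.g. ".png"
}

// Allows reports whether the attachment passes both the date check
// and the file type check.
func (f Filter) Allows(a Attachment) bool {
	if !f.After.IsZero() && a.Published.Before(f.After) {
		return false
	}
	if !f.Before.IsZero() && a.Published.After(f.Before) {
		return false
	}
	if len(f.Extensions) > 0 {
		// Compare extensions case-insensitively so "A.PNG" matches ".png".
		ext := strings.ToLower(filepath.Ext(a.Filename))
		if !f.Extensions[ext] {
			return false
		}
	}
	return true
}

// date is a small helper that builds a UTC midnight timestamp.
func date(year, month, day int) time.Time {
	return time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
}
```

With this sketch, `Filter{After: date(2024, 1, 1), Extensions: map[string]bool{".png": true}}` would keep only PNG files from posts published in 2024 or later. Both checks need nothing beyond the post's date and the file name, which is why they do not depend on what a platform's server returns.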

However, for the file size filter, I believe Pixiv Fanbox is currently the only platform where it is not feasible. Pixiv Fanbox's server does not return the content length in the response header, so it would be impossible to tell the file size without first downloading the file to the local drive. If I were to implement it, I would only do so for the platforms that return Content-Length in the response header (Pixiv, Fantia, Kemono).
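To make the Content-Length point concrete, here is a minimal Go sketch of checking a file's size before downloading it. It is only an illustration under assumptions, not the downloader's actual code: `remoteSize`, `advertisedSize`, and `passesSizeFilter` are hypothetical names, using a HEAD request is an assumption (the same header could be read from a GET response), and letting files of unknown size through is one possible policy.

```go
package main

import (
	"fmt"
	"net/http"
)

// advertisedSize interprets the ContentLength field of an http.Response.
// net/http reports -1 when the server did not state a length.
func advertisedSize(contentLength int64) (size int64, known bool) {
	if contentLength < 0 {
		return 0, false
	}
	return contentLength, true
}

// remoteSize asks the server for headers only (HEAD) and returns the
// size it advertises, without transferring the file body.
func remoteSize(client *http.Client, url string) (size int64, known bool, err error) {
	resp, err := client.Head(url)
	if err != nil {
		return 0, false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, false, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	size, known = advertisedSize(resp.ContentLength)
	return size, known, nil
}

// passesSizeFilter applies hypothetical min/max limits in bytes.
// A limit of 0 means "no limit". Files of unknown size are allowed
// through (an assumed policy), since they cannot be judged up front.
func passesSizeFilter(size int64, known bool, minBytes, maxBytes int64) bool {
	if !known {
		return true
	}
	if minBytes > 0 && size < minBytes {
		return false
	}
	if maxBytes > 0 && size > maxBytes {
		return false
	}
	return true
}
```

A caller would run `remoteSize(http.DefaultClient, fileURL)` and pass the result to `passesSizeFilter`. When a server omits the header, `known` comes back false, so the only way to learn the size is to download the file.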

> For regular expressions: my idea is to let users access links, web pages, or metadata, and then filter for the files that meet the requirements using regular expressions. But I'm completely new to automatic downloaders, so I don't know if this approach is actually feasible.

Supporting user-supplied regular expressions is possible. However, I'm not sure what you meant by "allow users to access links, web pages or metadata, and then filter the files that meets the requirements". Do you mean something like a regex for the filename?

FNSOIDATHQ commented 6 months ago

Good news! Well, file size is less important than the other two, so I guess it won't be a big problem if Fanbox doesn't support it.

> Do you mean like regex for the filename?

I think so? I assume the downloader first scans all the posts and collects the download links. Then it requests the download from the server, and the server returns some kind of metadata? If that is the case, what I mean is that the regex could be applied at any stage before the final download link is obtained, which would make it a very powerful custom filter.

KJHJason commented 6 months ago

> I think so? I assume the downloader first scans all the posts and collects the download links. Then it requests the download from the server, and the server returns some kind of metadata? If that is the case, what I mean is that the regex could be applied at any stage before the final download link is obtained, which would make it a very powerful custom filter.

Hmm, what kind of metadata are you looking at? Post title?

FNSOIDATHQ commented 6 months ago

I think the post title, the text in the post content, and the file name may be useful. For example, some artists publish a single work in chapters. We could limit downloads to posts that include the name of the work, so that all the needed posts are downloaded. Given each artist's different habits, the key information (the name of the work) may be hidden anywhere, but most likely in one of the three places above.
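To show the idea more concretely, here is a minimal Go sketch of matching one user-supplied regular expression against those three places. It is only an illustration, not the downloader's code: the `Post` type and the `matchesPost` and `selectPosts` functions are hypothetical names, and it assumes the title, content text, and file names have already been collected for each post.

```go
package main

import "regexp"

// Post is a hypothetical stand-in for the metadata of one post.
type Post struct {
	Title     string
	Content   string
	Filenames []string
}

// matchesPost reports whether the pattern matches the post title,
// the post content, or any of the attached file names.
func matchesPost(pattern *regexp.Regexp, p Post) bool {
	if pattern.MatchString(p.Title) || pattern.MatchString(p.Content) {
		return true
	}
	for _, name := range p.Filenames {
		if pattern.MatchString(name) {
			return true
		}
	}
	return false
}

// selectPosts compiles the user-supplied expression once and returns
// the posts it matches. An invalid expression is reported as an error
// instead of silently matching nothing.
func selectPosts(expr string, posts []Post) ([]Post, error) {
	pattern, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	var kept []Post
	for _, p := range posts {
		if matchesPost(pattern, p) {
			kept = append(kept, p)
		}
	}
	return kept, nil
}
```

For a work called "My Series", an expression such as `(?i)my[ _]?series` would catch a title like "My Series Ch.1" as well as a file named "MySeries_03.zip". Note that Go's `regexp` package uses RE2 syntax, so lookaheads and backreferences are not available.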