mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0
11.76k stars, 963 forks

Only save the links from the Kemono post as a text file #2598

Closed a84r7a3rga76fg closed 2 years ago

a84r7a3rga76fg commented 2 years ago

The config below saves everything from the post, but I only want to save the links it contains. Can someone please tell me how to do that?

            "postprocessors": [
                {
                    "name": "metadata",
                    "event": "post",
                    "filename": "{id}.txt",
                    "mode": "custom",
                    "format": "{content}\n{embed[url]:?/\n/}"
                }
            ]
Fukitsu commented 2 years ago

Use the --no-download flag

a84r7a3rga76fg commented 2 years ago

I am. By everything, I meant all of the post's text. I just want it to save the links from the post's text.

Fukitsu commented 2 years ago

Oh, sorry. Then try with just "format": "{embed[url]:?/\n/}"

mikf commented 2 years ago

If you only want all URLs from content, try something like "format": "\fE '\\n'.join(re.findall(r'https?://[^\\s\\'\"]+', content))", but be aware that this will only find full URLs starting with http:// or https://.

There are artists who split their links into several chunks like

    mega
    +
    .nz/file/
    +
    Idg12Rbb
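
To see what that expression does outside of gallery-dl, here is a minimal Python sketch of the same regular expression, written without the JSON escaping. The `extract_links` helper and the sample post text are hypothetical and only there for illustration. It assumes `content` is the post's text as a plain string.

```python
import re

# Same pattern as in the suggested format string, without JSON escaping:
# "http://" or "https://" followed by any run of characters that are not
# whitespace, a single quote, or a double quote.
URL_PATTERN = re.compile(r"https?://[^\s'\"]+")


def extract_links(content):
    """Return every full http(s) URL found in the text, one per line."""
    return "\n".join(URL_PATTERN.findall(content))


# Hypothetical post text, for illustration only.
sample = 'See https://example.com/a and <a href="http://example.org/b">this</a>.'
print(extract_links(sample))

# A link split into chunks has no "http(s)://" prefix, so nothing is found.
chunked = "mega\n+\n.nz/file/\n+\nIdg12Rbb"
print(repr(extract_links(chunked)))
```

The second call returns an empty string, which is the limitation described above: a link that is split into chunks never matches the pattern.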
a84r7a3rga76fg commented 2 years ago

I forgot about that. I've also seen artists write their links like that, which makes no sense, and I hope they lose subscribers for it. I'll close this, since it's better to save all of the text in the post.