mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

Adding comments in `config.json` #4746

Open gaiking-uk opened 10 months ago

gaiking-uk commented 10 months ago

Question/Issue Summary

Background and further info

I believe adding comments to the config.json file is supported(?) For the last year or so, I have been adding comments to my config.json file in the following way...

{
    "extractor": {
        "instagram": {
            "#": "This is a comment"
        }
    }
}

However, I recently got an error when I tried to run gallery-dl as it seemed like it didn't like something in one of my comments...

> gallery-dl https://www.instagram.com/reel/Cy3KNCAuunZ
[config][error] JSONDecodeError when loading 'gallery-dl\config.json': Invalid \escape: line 107 column 82 (char 3364)

The offending line in my config.json appears to be...

"#": "Full command: gallery-DL 'https://instagram.com/stories/foo' -D 'D:\Downloads\Instagram\foo\Stories' --mtime-from-date",
                                                                        ⬆️ Based on error pos, it is the \ in the filepath

PS: Ignore if col position doesn't match in example, I simplified path for privacy + ease of explaining

I tried to escape the \ by doubling it, e.g. D:\\Downloads\\Instagram\\... etc but then got a different error from gallery-dl...

> gallery-dl https://www.instagram.com/reel/Cy3KNCAuunZ
[instagram][error] DirectoryFormatError: Applying directory format string failed (NameError: name 'extension' is not defined)

Can anyone help me out please? Thanks!

JSouthGB commented 10 months ago

JSON doesn't support comments. It looks like you're already using the work around. Perhaps try something like in the stackoverflow answer.

mikf commented 10 months ago

[instagram][error] DirectoryFormatError: Applying directory format string failed (NameError: name 'extension' is not defined)

{extension} cannot be used for building directory paths. For reasons. You should use a classify post processor to put files into sub-directories according to their filename extension.

gaiking-uk commented 10 months ago

@JSouthGB: JSON doesn't support comments. It looks like you're already using the work around. Perhaps try something like in the stackoverflow answer.

Sure, I know that while jsonc supports comments, strict json doesn't... which, speaking purely personally, is one of the things I don't like about using json for config files (as opposed to things like API calls/responses): a really useful (but also long/complex) config file that you can't make any notes in isn't a great option.

But again, just to be super clear, that is just a pet peeve of mine that I have with json, NOT a slight against gallery-dl... overall I think gallery-dl is great and I know tons of other apps use json for their config files too.

Back to your comment - thanks for the stackoverflow link, will check that out and see how I can apply it... Also, just FYI, regarding the 'workaround' I am currently using: TBH I thought I got it from some example on the gallery-dl github (but based on your comment, guess not, lol)... in that case, the other main site I've used to help piece together my config file is: https://manpages.ubuntu.com/manpages/impish/man5/gallery-dl.conf.5.html

gaiking-uk commented 10 months ago

@mikf: You should use a classify post processor to put files into sub-directories according to their filename extension.

Sure, I appreciate what you're saying (and actually do at least partially, see below) but TBH my point was more that gallery-DL was erroring because it was trying to parse a character that in my view I'd already commented out and my intention was never for gallery-dl to try and parse this string ever.

PS - In the case of the comment above, this was basically a note to myself on how to do a "one-off" download for something like an instagram story (which I don't generally do), so I haven't written any specific config for it, as I didn't think to or (more likely) didn't know that was an option.

PPS - In terms of the classification I have done, here is an excerpt from the earlier part of my config.json...

{
    "extractor": {
        "base-directory": "[redacted]",
        "parent-directory": false,
        "postprocessors": [
            {
                "name": "metadata",
                "mode": "json",
                "event": "file",
                "directory": "JSON"
            },
            {
                "name": "classify",
                "mapping": {
                    "Videos": [
                        "mp4",
                        "mkv",
                        "webm",
                        "flv",
                        "ogv",
                        "wmv",
                        "avi",
                        "mpg",
                        "mpeg",
                        "3gp",
                        "vob",
                        "ts"
                    ]
                }
            }
        ]
    }
}

As I've used gallery-dl over time, I've tried to improve my config file and gallery-dl usage, but I am aware that my config has been patched together from various examples and is very probably sub-optimal, so if there's any other advice you have based on the above (or you don't mind me pasting my full config file for review), I am more than happy to get feedback and understand where it can be improved.

gaiking-uk commented 10 months ago

TLDR: Can comment support be added please?

@JSouthGB - I read through the stackoverflow answer and seems like the... "#" : "This is how I currently write comments, based on what I've seen elsewhere" ... is basically the only way to add comments to json.

As I said above I personally think the ability to make notes in config files, or say to "comment things out" when testing config changes for example is very useful.

I recognise and appreciate that comments aren't part of the json standard (as it was written purely as a data language), but wonder if it would be possible to ask for some limited comment acceptability to be added to gallery-dl. Based on reading stackoverflow, it seems like the two potential best options for doing this are either:

  1. A specific, designated comment tag -- for example "#gdl-comment": "This is a comment" that gallery-dl knows to ignore when parsing the config.json file
  2. Alternatively (although am admittedly less sure about this).. including JSON.minify into gallery-dl as this "strips out comments and whitespace", so could do this before parsing the config.json file.
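Option 1 can be approximated on top of Python's standard `json` module without touching the parser itself. This is only a minimal sketch: the `drop_comment_keys` helper and the rule "any key starting with `#` is a comment" are assumptions made for illustration, not something gallery-dl implements.

```python
import json

COMMENT_PREFIX = "#"  # hypothetical convention, not a gallery-dl feature


def drop_comment_keys(pairs):
    """Build a dict from (key, value) pairs, skipping comment-like keys."""
    return {k: v for k, v in pairs if not k.startswith(COMMENT_PREFIX)}


text = """
{
    "extractor": {
        "#gdl-comment": "This is a comment",
        "instagram": {
            "#": "Another note",
            "parent-directory": false
        }
    }
}
"""

# object_pairs_hook is called once per JSON object, innermost first
config = json.loads(text, object_pairs_hook=drop_comment_keys)
print(config)
```

Note that the file still has to be valid JSON before the hook ever runs, so a stray backslash inside a "comment" value would still raise a decode error.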

PS - If you agree comment support would be useful to add and can be done without too much difficulty, but would prefer me to close this "question" and put the above in a separate "request" ticket, am happy to do that to make things easier to track/manage -- just let me know 👍🏼

Hrxn commented 10 months ago

JSON doesn't have comments because it is designed to be as simple as possible, making the implementation of a parser in any other programming language straightforward and robust.

"#" : "This is how I currently write comments, based on what I've seen elsewhere"

So, technically speaking, this isn't even a comment. It's a k-v-pair, with the keyname # and the sentence that follows as the value. gallery-dl reads this, but since it cannot assign the key # to any internal identifier it understands, simply nothing happens here.

BTW, if you use VSCode to edit JSON files (which you probably should, because it's actually a great editor, with JSON support out-of-the-box without any extensions), using this format will give you warnings (Duplicate object key). Also applies to docs/gallery-dl-example.conf, for example. But this is not a problem for gallery-dl, because, again, there's no match for these key names so they are silently discarded. These warnings can be prevented, though, by simply avoiding duplicate key names, for example by continuously numbering all comments:

        "#_comment_01": "set global archive file for all extractors",
        "#_comment_02": "add two custom keywords into the metadata dictionary",
        "#_comment_03": "these can be used to further refine your output directories or filenames",
        "#_comment_04": "make sure that custom keywords are empty, i.e. they don't appear unless specified by the user",
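To make the key-value point concrete, here is a small sketch using Python's standard `json` module (the parser gallery-dl is said to use later in this thread). The option name `archive` and its value are placeholders for illustration only.

```python
import json

# Same "#" key twice: valid for the parser, but only the last value survives
duplicated = '{"#": "first note", "#": "second note", "archive": "archive.sqlite3"}'

# Numbered keys: every note survives as its own ordinary key
numbered = (
    '{"#_comment_01": "first note", "#_comment_02": "second note", '
    '"archive": "archive.sqlite3"}'
)

dup_config = json.loads(duplicated)
num_config = json.loads(numbered)

print(dup_config)
print(num_config)
```

Either way, the "comments" end up as ordinary dictionary entries; it is up to the application to ignore keys it does not recognize.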

As I've used gallery-dl over time, I've tried to improve my config file and gallery-dl usage but am aware that my config has been patched together from various examples and is very probably sub-optimal [..]

I'm not sure if I can follow here.. You can, as suggested, use VSCode for editing JSON files, and there are web tools like https://www.jslint.com/, which is probably the strictest JSON linter out there. "Strict" being a very relative term here, because whitespace has no significance at all in JSON.

I don't understand what the goal is supposed to be here? Either gallery-dl can parse your config file, or it can't.

Your first issue in the original comment was, that if you want to use the backslash in a string, you have to use \\ because \ is the escape character in JSON, but you already found that out, and the subsequent error had nothing to do with JSON - or your config file.
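The escape rule is easy to reproduce with Python's standard `json` module. This is a minimal sketch with a shortened path; the exact wording of the error message may differ between Python versions and implementations.

```python
import json

# Raw strings so Python itself leaves the backslashes alone
bad = r'{"#": "D:\Downloads\Instagram"}'     # \D and \I are not valid JSON escapes
good = r'{"#": "D:\\Downloads\\Instagram"}'  # each backslash doubled

try:
    json.loads(bad)
    error = None
except json.JSONDecodeError as exc:
    error = exc.msg

value = json.loads(good)["#"]

print(error)  # message describing the invalid escape
print(value)  # the path with single backslashes
```

The string value is validated even though its key is never used, which is why a "comment" written this way can still break the whole file.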

As I said above I personally think the ability to make notes in config files, or say to "comment things out" when testing config changes for example is very useful.

I don't really agree. If you want to test config changes, write additional configs to override the setting to test and use it ad-hoc.

-c, --config FILE           Additional configuration files

Or, maybe even simpler, test the option values by setting them on the command-line, and if you're satisfied with the result, simply save your preference in the config file.
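The idea behind an extra `-c` file can be pictured as a recursive dictionary merge. The following is a rough sketch under the assumption that values from the later file replace those from the earlier one key by key; `combine` is a hypothetical helper written for this illustration, not gallery-dl's actual code, and the config values are placeholders.

```python
import copy


def combine(base, override):
    """Recursively merge `override` into a copy of `base`; override wins."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # both sides are objects: merge them instead of replacing
            result[key] = combine(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Stand-ins for the main config and a small ad-hoc test config
main_config = {
    "extractor": {"base-directory": "./gallery-dl/", "parent-directory": False}
}
test_config = {"extractor": {"parent-directory": True}}

effective = combine(main_config, test_config)
print(effective)
```

With this approach the main config stays untouched, and the test file only needs to contain the one setting being tried out.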

To be honest, I don't see the need for any comments whatsoever in a config file. Making notes about gallery-dl is certainly a good idea (I've done that as well, just for example, because I could never remember the correct syntax for defining options in an input file), but it's way easier to put them all down in a local markdown file or something like that.

JSouthGB commented 10 months ago

TLDR: Can comment support be added please?

You could use a YAML or TOML file instead of JSON, they both support comments.

gaiking-uk commented 10 months ago

@JSouthGB: You could use a YAML or TOML file instead of JSON, they both support comments.

Ah, awesome -- TBH, I don't know yaml very well (as have only used it a little for 1-2 other apps) but I do know it supports comments so thanks for that recommendation / pointing out that option! 👍🏼

Are there any json --> yml converters?... I have a 240-line config.json file and given my v. limited yaml experience, I don't relish the thought of having to re-type/code it all by hand 🤔

gaiking-uk commented 10 months ago

@Hrxn gallery-dl reads this, but since it cannot assign the key # to any internal identifier it understands, simply nothing happens here [...] again, there's no match for these key names so they are silently discarded.

Well, that was the core of my original issue... I thought using "#": "Foo bar" was the way to add a comment that gallery-dl would ignore, but rather than simply discarding line 107 of my config.json file ["#": "Full command: gallery-DL 'https://instagram.com/stories/foo' -D 'D:\Downloads\Instagram\foo\Stories' --mtime-from-date",], it errored, saying it could not parse it.

Your first issue in the original comment was, that if you want to use the backslash in a string, you have to use \\ because \ is the escape character in JSON, but you already found that out, and the subsequent error had nothing to do with JSON.

To be honest, I don't mind whether it's a "JSON error" or another issue, the point I was trying to make was that...

RE: Your points about not commenting out in a config file, and putting notes in a separate document

You can, as suggested, use VSCode for editing JSON files

PS - I do use VSCode for most of my scripting work (but have a different editor for a few other formats, like json)... It's ironic that you mention VSCode though, as...

JSouthGB commented 10 months ago

Are there any json --> yml converters?... I have a 240-line config.json file and given my v. limited yaml experience, I don't relish the thought of having to re-type/code it all by hand 🤔

I found this one with a quick search. I'm not sure of its effectiveness; there'll probably be a bit of finessing required.

gaiking-uk commented 10 months ago

@JSouthGB: I found this one with a quick search. I'm not sure of its effectiveness; there'll probably be a bit of finessing required.

Cool, thanks (and didn't mean to be lazy / realise I could've just googled myself but as I have basically no yaml knowledge / experience, I wouldn't know the difference between a great one and a terrible one)... Am holding out for Mike feeling super generous and thus the "moderate-to-admittedly-slim" chance that some kind of comment support gets added to gallery-dl, but no worries if not, in that case will just have to decide on an option between...

But a problem for the future... thanks for your help 👍🏼

Hrxn commented 10 months ago

You can, as suggested, use VSCode for editing JSON files

PS - I do use VSCode for most of my scripting work (but have a different editor for a few other formats, like json)... It's ironic that you mention VSCode though, as...

  • VSCode does the very thing you don't approve of, and adds notes to their own json files...
  • Worse still, they do it by adding single-line javascript comments, which completely violates the json standard! 🤣

That is correct, actually. Microsoft does this not only for VSCode, but in other projects as well, for example the new Terminal. They roll a custom (well, "custom", I think it's https://github.com/open-source-parsers/jsoncpp) JSON implementation with support for // style comments. Free to do so, of course, even if it's technically not standard JSON anymore. But you don't have to use these comments, obviously.

mikf commented 10 months ago

The JSON parser used by gallery-dl is the one from the Python standard library. Changing its code and adding new features would be very inconvenient, given that most of it is implemented in C and I'd rather avoid having to ship a C extension with gallery-dl. One of the main reasons for using it was to avoid extra dependencies.

The current parser is rather strict, its error messages could and should be better, etc, but it's fast (at least compared to other Python alternatives), supports types, and is available out of the box.

I would also like to add a few convenience features like trailing commas, but alas.
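The strictness about trailing commas can be checked directly against the standard `json` module. This is a minimal sketch; `parses` is a helper defined only for this example.

```python
import json

with_trailing_comma = '{"extractor": {"instagram": {"#": "This is a comment",}}}'
without_trailing_comma = '{"extractor": {"instagram": {"#": "This is a comment"}}}'


def parses(text):
    """Return True if the standard-library parser accepts `text`."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(parses(with_trailing_comma))
print(parses(without_trailing_comma))
```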

the potential pain of trying to create an alternative yaml config file

Load a JSON config file and have it written out by the yaml module. Here's a script: (You might have to pip install pyyaml to run it)

import os
import sys
import json
import yaml

# config file path
try:
    path = sys.argv[1]
except Exception:
    path = r"%APPDATA%\gallery-dl\config.json"

path = os.path.expandvars(os.path.expanduser(path))

# read config as JSON
with open(path) as fp:
    config = json.load(fp)

# replace filename extension with .yml
path = os.path.splitext(path)[0] + ".yml"

# write config as YAML
print("Writing YAML config to '" + path + "'")
with open(path, "w") as fp:
    fp.write(yaml.dump(config))

Hrxn commented 10 months ago

There's also https://github.com/remarshal-project/remarshal

For easy conversion between CBOR, JSON, MessagePack, TOML, and YAML.

gaiking-uk commented 10 months ago

Cool, thanks both.

Also, thanks for the further info and explanation RE: adding comments... No worries, I had assumed you had written a custom parser for config.json (and so adding a line something like if key == "#": return hopefully wouldn't be too much to add) but if you're using an external parser module/library and aren't able to modify how this works then that's fair enough!