Publish schema of the JSON format

Genbox commented 7 years ago

Document the schema of the JSON data file and publish it.

allo- commented 7 years ago

Maybe you could try to be compatible with the https://github.com/allo-/firefox-profilemaker.

There is no real documentation of the format yet, but it's mostly a list of dictionaries, where the lists from several files are appended into one list. Example battery.json:

[
    {
        "name": "battery", 
        "type": "boolean", 
        "initial": true, 
        "label": "Disable the Battery API", 
        "help_text": "Firefox allows websites to read the charge level of the battery. This may be used for fingerprinting.", 
        "addons": [], 
        "config": {
            "dom.battery.enabled": false
        }
    }
]

The profiles are a list of dictionaries that map category names to lists of settings/*.json files. I intend to add the option to specify different defaults there. This will probably look like "battery": false for the case where the setting from the JSON above is included but should default to false in the profile instead of using the default from the settings file.
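A minimal sketch of the two ideas described above: appending the lists from several settings files into one list, and letting a profile override a setting's default. The function names and the second settings entry are hypothetical, and the override syntax is only the planned one, not something the project is confirmed to implement.

```python
import copy


def merge_settings(*setting_lists):
    """Append the lists loaded from several settings files into one list."""
    merged = []
    for items in setting_lists:
        merged.extend(items)
    return merged


def apply_profile_defaults(settings, overrides):
    """Return copies of the settings, replacing "initial" where the profile overrides it."""
    result = []
    for item in settings:
        item = copy.deepcopy(item)
        if item["name"] in overrides:
            item["initial"] = overrides[item["name"]]
        result.append(item)
    return result


# Content of battery.json from above (help_text shortened).
battery = [{
    "name": "battery", "type": "boolean", "initial": True,
    "label": "Disable the Battery API", "help_text": "...",
    "addons": [], "config": {"dom.battery.enabled": False},
}]
# Hypothetical second settings file, made up for this sketch.
example = [{
    "name": "example", "type": "boolean", "initial": True,
    "label": "Example setting", "help_text": "...",
    "addons": [], "config": {"example.pref": False},
}]

settings = merge_settings(battery, example)
# A profile that includes both settings but wants "battery" off by default.
profile_settings = apply_profile_defaults(settings, {"battery": False})
print([(s["name"], s["initial"]) for s in profile_settings])
```

The settings files themselves stay untouched; only the copy used by the profile carries the changed default.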

On the other hand, if you design a sane format, I may adapt to it. What's important for me to preserve is that the settings can be sorted into categories and have a useful description, not just a title. So the user should be able to see why he should change the setting instead of only what it does. I sometimes included links to blog posts or bugzilla entries.

Working together or at least compatible would be great, as there are A LOT of user.js files out there (see our wiki, there are many linked), but actually sorting it into readable items sorted into useful categories and rated for importance is quite a bit of work.

Genbox commented 7 years ago

It seems like you map multiple firefox settings to a single setting. Like this one where you have 2 values in config. You have 1:n and I have 1:1 mappings, which makes it quite hard to make compatible datafiles.

I think I will do as you do and split the data into multiple smaller JSON files for easier maintenance and collaboration.

Edit: On second thought, splitting it up might do more harm than good due to other factors. I will see what can be done.

allo- commented 7 years ago

Yes, I try to sort this by features (i.e. disable firefox telemetry), not by about:config settings alone. This even supports a choice field between settings and an appropriate extension.

If you concentrate on one-to-one and settings, this could be at least compatible with my data files which contain just one setting.

I think when you specify a mostly compatible json format (setting, title, helptext, if possible setting in the appropriate type number/string/boolean) I will at least look into a python script, which can convert between both formats. And yeah, i need to write a specification as well. But there are some infrastructure details (see the label in the bugtracker) which need to be decided first.
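A rough sketch of what such a conversion script could look like for the simple case, i.e. data files whose config contains exactly one about:config preference. The output keys ("setting", "value", "type", "title", "help_text") are assumptions for illustration, since no 1:1 format has been specified in this thread.

```python
def value_type(value):
    """Name the preference type; bool is checked first because it is a subclass of int."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def to_one_to_one(item):
    """Convert a feature-style item to a 1:1 entry, or return None for 1:n items."""
    config = item.get("config", {})
    if len(config) != 1:
        return None  # several prefs behind one option have no direct 1:1 equivalent
    pref, value = next(iter(config.items()))
    return {
        "setting": pref,
        "value": value,
        "type": value_type(value),
        "title": item["label"],
        "help_text": item.get("help_text", ""),
    }


battery = {
    "name": "battery", "type": "boolean", "initial": True,
    "label": "Disable the Battery API", "help_text": "...",
    "addons": [], "config": {"dom.battery.enabled": False},
}
# Hypothetical item with two prefs, to show the case that cannot be converted.
two_prefs = {"name": "example", "label": "Example", "config": {"a.pref": 1, "b.pref": "x"}}

print(to_one_to_one(battery))
print(to_one_to_one(two_prefs))
```

Items that return None would still need manual handling, for example by splitting them into several 1:1 entries.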

Btw. you will also need to have a look at writing extension settings, which are no longer in about:config, see the files_inline function and the files_inline format in the json file. I think with WebExtensions firefox will disallow extensions to use about:config preferences at all.

nodiscc commented 7 years ago

What's important for me to preserve is that the settings can be sorted into categories and have a useful description, not just a title

In https://github.com/pyllyukko/user.js/blob/master/user.js we tried to achieve that by, in each preference's PREF: header, stating both what the pref does and what the purpose of the change is. Example:

// PREF: Disable WebRTC entirely to prevent leaking internal IP addresses

Not all prefs currently meet this criterion. For example // PREF: Disable Service Workers does not state why (actually below you can read Unknown security implication, so this change is about the general precautionary principle, reducing attack surface). So this could be improved.

In addition to what @pyllyukko said in https://github.com/pyllyukko/user.js/issues/25#issuecomment-308173550 (required: category/section, description, setting, value, type) it would be nice to have some warning indicator when hardening a setting is known to cause problems. We have NOTICE fields such as // NOTICE: Disabling WebRTC breaks peer-to-peer file sharing tools (reep.io ...), and it would be nice to have a ⚠ symbol next to affected prefs that shows potential problems in a tooltip on hover.

nodiscc commented 7 years ago

there are A LOT of user.js files out there (see our wiki, there are many linked),

Yes this is actually important, I don't know if this is the case yet but your tool should allow multiple profiles/presets that cater to different user needs. For example https://github.com/nodiscc/user.js/tree/relaxed is a relaxed variant of pyllyukko's user.js, closer to what I use myself.

On second thought, splitting it up might do more harm than good due to other factors. I will see what can be done.

This is one of the only cases where I think multiple JSON files would be good (unless you want to stash ALL presets in a single file - but it could get huge)

allo- commented 7 years ago

we tried to achieve that by, in each preference's PREF: header, stating both what the pref does and what the purpose of the change is

That's what I require for stuff I am adding. For people to trust a profile generated for them, they need to understand it. They can still just click "next" on each page, but they have at least the chance to make an informed decision on each setting.

For example // PREF: Disable Service Workers does not state why (actually below you can read Unknown security implication)

That's something I avoid in the default profile. First as far as i know there is no security implication, but a performance gain with service workers. Second it is cluttering the preferences with things the user doesn't understand and I am not having a source or rationale why it should be added.

Another thing for the default profile is that I do not add unnecessary things. You may have noticed that I do not disable geolocation. I do not do so because Firefox asks you each time a website requests it, and there is a UI exposed in "Page Info" to change the permissions. So there is no security gain by doing so, and it takes away the option to whitelist pages, which is offered by the UI.

On the other hand, there should be a paranoid profile, which defaults to disable geolocation, such that the user is not even asked. Because when you're for example configuring a browser to use tor, dialogs asking for geolocation are an annoyance when you know why to click no and a security risk if you do not know it.

Yes this is actually important, I don't know if this is the case yet but your tool should allow multiple profiles/presets that cater to different user needs.

See https://github.com/allo-/firefox-profilemaker/tree/master/profiles My plans for additional profiles:

paranoid (add settings, which aren't important enough to "clutter" the default profile and set the defaults for all settings to "no")
classic theme (could be obsolete with firefox 57 :-()
Let the user upload a profile as json and possibly download a json file with the presets he created when using the site.

The current "private browsing" profile is more or less a proof of concept (a subset of options, and the only additional option is "private_browsing_autostart"). I need more time and/or help to add more settings and profiles.

I guess i can parse your comment syntax in candidates for settings, which contain the short description, the setting, the value and probably the autodetected value type. I think i still need to add a help_text for many and sort them. This is not only sorting into the categories, but especially deciding if they are worth to go into a profile. The other part of sorting is merging things. Like i tried to merge different referer settings into a choice box, because they only make sense in specific combinations.
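A rough sketch of such a parser, assuming the comment syntax quoted earlier in this thread (// PREF: and // NOTICE: lines followed by user_pref("name", value); lines). The function names and output keys are hypothetical, and the sample text is only an illustration built from the WebRTC lines quoted above, not a verbatim excerpt of the file.

```python
import json
import re

# Matches lines such as: user_pref("some.pref.name", false);
PREF_LINE = re.compile(r'^user_pref\(\s*"([^"]+)"\s*,\s*(.+?)\s*\)\s*;')


def value_type(value):
    """Autodetect the value type; bool is checked first because it is a subclass of int."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def parse_user_js(text):
    """Collect candidate settings with the description and notices that precede them."""
    candidates = []
    description, notices = "", []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("// PREF:"):
            description, notices = line[len("// PREF:"):].strip(), []
        elif line.startswith("// NOTICE:"):
            notices.append(line[len("// NOTICE:"):].strip())
        else:
            match = PREF_LINE.match(line)
            if not match:
                continue
            name, raw = match.groups()
            try:
                value = json.loads(raw)  # true/false, numbers and "strings" parse as JSON
            except ValueError:
                value = raw  # keep anything unusual as a raw string for manual review
            candidates.append({
                "setting": name,
                "value": value,
                "type": value_type(value),
                "description": description,
                "warnings": list(notices),
            })
    return candidates


sample = """
// PREF: Disable WebRTC entirely to prevent leaking internal IP addresses
// NOTICE: Disabling WebRTC breaks peer-to-peer file sharing tools (reep.io ...)
user_pref("media.peerconnection.enabled", false);
"""
candidates = parse_user_js(sample)
print(candidates)
```

The result is only a list of candidates; adding a help_text, sorting into categories and merging related prefs would remain manual work.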

I believe that I should not expect the user to read hundreds of settings. Then he will either give up creating the profile or just accept the defaults, which means he could just as well have copied a user.js from the web, trusting it to have good defaults.

Again a use case for profiles and possibly the option to combine multiple profiles. Then there could be an "all" profile, the default profile, the paranoid profile, a profile which disables all the shiny new stuff, etc.

My current steps are mostly collecting new settings i want to include in the bugtracker and adding them from time to time. When I have enough time again I will create a syntax for presets in the profiles and try to provide more profiles from existing settings files.

nodiscc commented 7 years ago

That's something I avoid in the default profile. First as far as i know there is no security implication, but a performance gain with service workers. Second it is cluttering the preferences with things the user doesn't understand

I agree with that and that's why the default/master branch of https://github.com/pyllyukko/user.js can be considered a "paranoid" profile.

A more usable variant is the relaxed branch which doesn't have the extra precautions, restores the fully opt-in/apparently harmless features (it's not complete yet, but there are some pull requests pending for that - if you'd like to send more patches I think they would be welcome).

differences between 2 variants

Additions to the master branch are merged to the relaxed branch on a regular basis.

I think i still need to add a help_text for many and sort them

Well we could also help you with that, for example if you send a short patch against master it could illustrate what you need, I'd be glad to help making more settings understandable.

allo- commented 7 years ago

I agree with that and that's why the default/master branch of https://github.com/pyllyukko/user.js can be considered a "paranoid" profile.

That's fine with me. I prefer quite a few settings that way, but I try to recognize that many people will not use profiles which aren't usable on the modern web, e.g. blocking IndexedDB breaks Twitter. So you need something like the volatilestorage addon instead, or people will just use an unprotected browser. Profiles will be a nice way to give both groups useful defaults for the same settings.

Well we could also help you with that, for example if you send a short patch against master it could illustrate what you need, I'd be glad to help making more settings understandable.

I will look into specifying this a bit more strictly instead of just using examples. If you want to, you can browse through the json files and try to get an impression. Mostly I try to have a short description of what the setting does (e.g. "Disable phishing protection") and then a short text on why you should do it (e.g. "The phishing protection contacts google with a unique key: wrkey"). Either the text or a link included in the text should be a basis for the user to understand the setting, without being "too long, didn't read".

When you're creating a json format as well, we could try to have one which fits both projects or at least allows to convert between them for the fields the other project needs. For the text based, I could write a simple parser. It won't be a fully automatic process anyway, someone has to check and sort the settings.

Genbox commented 7 years ago

We are kinda going out on a tangent here, but it's fine. Very good points brought up by both @allo- and @nodiscc

As for not breaking stuff, what about going down the route nLite has taken, where you choose the major features you would like to have working, and then you can't change the settings which have been locked by your "compatibility" list?

Such a profile would point to specific settings and simply freeze them in their default state. It is the same thing @allo- has done with the profiles, we just freeze the setting instead.

allo- commented 7 years ago

That's a good idea. I would like to add tags to the settings, "breaks: ['animations', 'one-page-websites']" and "enables: ['cookie-deletion', 'reduce-fingerprint']", possibly with a priority, e.g. disabling cookies entirely has maximum priority for breaking logins, while reducing cookie lifetime has low priority, thus only changing the default of the setting instead of removing the setting.

This could be used to generate presets by selecting which features are allowed to break. Then it would remove some settings and change the default on others. This means I would currently like to tag my settings; implementing the feature wizard is for the future ;-).
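A minimal sketch of how such tags could drive preset generation. It assumes "breaks" maps each broken feature to a priority ("high" or "low"); that shape, the setting names and the function name are all hypothetical, since the tag syntax is only an idea at this point.

```python
def build_preset(settings, must_work):
    """Drop or soften settings that would break a feature the user wants to keep working."""
    preset = []
    for item in settings:
        priorities = {
            priority
            for feature, priority in item.get("breaks", {}).items()
            if feature in must_work
        }
        if "high" in priorities:
            continue  # breaks a wanted feature badly: remove the setting from the preset
        item = dict(item)
        if "low" in priorities:
            item["initial"] = False  # minor breakage: keep the setting but default it to off
        preset.append(item)
    return preset


# Made-up tagged settings for illustration.
settings = [
    {"name": "disable_cookies", "initial": True,
     "breaks": {"logins": "high"}, "enables": ["reduce-fingerprint"]},
    {"name": "short_cookie_lifetime", "initial": True,
     "breaks": {"logins": "low"}, "enables": ["cookie-deletion"]},
    {"name": "battery", "initial": True,
     "breaks": {}, "enables": ["reduce-fingerprint"]},
]

preset = build_preset(settings, must_work={"logins"})
print([(s["name"], s["initial"]) for s in preset])
```

With "logins" required to work, the cookie-disabling setting disappears, the lifetime setting stays but is off by default, and the untagged setting is unchanged.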

nodiscc commented 7 years ago

We are kinda going out on a tangent here

Sorry, it has helped me getting a clearer picture of what we would need. @Genbox would the example below be ok?

{
  "profile": "pyllyukko/user.js (paranoid)",
  "prefs": {
    "dom.event.clipboardevents.enabled": {
      "category": "HTML5 / APIs / DOM",
      "type": "boolean",
      "description": "Disable clipboard event detection (onCut/onCopy/onPaste) via Javascript",
      "active_when": "false",
      "default": "false",
      "warnings": [
        "Disabling clipboard events breaks Ctrl+C/X/V copy/cut/paste functionality in JS-based web applications (Google Docs...)"
        ],
      "references": [
        "https://developer.mozilla.org/en-US/docs/Mozilla/Preferences/Preference_reference/dom.event.clipboardevents.enabled"
        ]
    },
    "network.http.referer.spoofSource": {
      "category": "HTTP",
      "type": "boolean",
      "description": "Send a referer header with the target URI as the source",
      "active_when": "true",
      "default": "true",
      "warnings": [
        "Spoofing referers breaks functionality on websites relying on authentic referer headers",
        "Spoofing referers breaks visualisation of 3rd-party sites on the Lightbeam addon",
        "Spoofing referers disables CSRF protection on some login pages not implementing origin-header/cookie+token based CSRF protection"
        ],
      "references": [
        "https://bugzilla.mozilla.org/show_bug.cgi?id=822869",
        "https://github.com/pyllyukko/user.js/issues/227"
        ]
    }
  }
}

All this information can be found in https://github.com/pyllyukko/user.js/blob/master/user.js. I did this by hand but automated conversion is definitely possible if you agree on this schema.

allo- commented 7 years ago

Looks good, but some comments:

"when_active": "false" inverts the logic and assumes boolean options. Why not "when_active": "foo.bar=false"?

References should have tuples (title, URL), so nice links can be generated.

Category should contain a category ID (i.e. html5-dom), not free text. This makes translation of the titles easier and avoids having the wrong category in the settings because of a missing space.

Genbox commented 7 years ago

@nodiscc: Looks good. I also agree with @allo-'s comments. If we are going full crazy here, we might as well look at translations.

allo- commented 7 years ago

As i currently call the translation function in the python code, it lacks a function extracting the strings, since i moved the settings into the json. This means i need sooner or later a (python) tool to extract strings from settings files anyway. This means https://github.com/allo-/firefox-profilemaker/issues/74 depends on the finalized format and will generate a gettext ".po" file for translation with standard tools.
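A minimal sketch of what such an extraction tool could do, assuming the translatable fields are "label" and "help_text" as in the battery.json example. The function names are hypothetical. The output uses the plain gettext PO syntax (msgid/msgstr pairs with "#." and "#:" comments) and is written by hand here rather than through a gettext library.

```python
def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def extract_po(settings, source):
    """Build the text of a translation template from one settings list."""
    lines = [
        'msgid ""',
        'msgstr ""',
        '"Content-Type: text/plain; charset=UTF-8\\n"',
        "",
    ]
    seen = set()
    for item in settings:
        for field in ("label", "help_text"):
            text = item.get(field)
            if not text or text in seen:
                continue  # skip empty strings and strings that were already emitted
            seen.add(text)
            lines += [
                f"#. {item['name']}.{field}",  # extracted comment: where the string is used
                f"#: {source}",                # reference: the settings file it came from
                f'msgid "{po_escape(text)}"',
                'msgstr ""',
                "",
            ]
    return "\n".join(lines)


battery = [{
    "name": "battery", "type": "boolean", "initial": True,
    "label": "Disable the Battery API",
    "help_text": "Firefox allows websites to read the charge level of the battery. "
                 "This may be used for fingerprinting.",
    "addons": [], "config": {"dom.battery.enabled": False},
}]

po_text = extract_po(battery, "battery.json")
print(po_text)
```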

Genbox commented 7 years ago

@allo-

For the profiles, it might be easier to make them decoupled from the data. That is, have a separate JSON file that describes settings by their name and what value they should have. Example using the same layout as @nodiscc:

{
    "preset": "HTML5 support",
    "prefs": {
        "dom.event.clipboardevents.enabled": {
            "value": "true",
            "type": "freezeValue",
            "reason": "Breaks functionality ..."
        },
        "some.other.setting": {
            "value": "9",
            "type": "limitRange",
            "maxValue": "3",
            "minValue": "0",
            "reason": "..."
        }
    }
}
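A minimal sketch of how a tool might apply such a preset to the values a user has chosen. The semantics are assumptions: "freezeValue" is taken to force the given value, and "limitRange" is taken to clamp the user's value into [minValue, maxValue] while ignoring the "value" field. The function name is hypothetical.

```python
def apply_preset(preset, user_values):
    """Return the user's values after enforcing the preset's constraints."""
    result = dict(user_values)
    for pref, rule in preset["prefs"].items():
        if rule["type"] == "freezeValue":
            result[pref] = rule["value"]  # locked: the user's choice is overridden
        elif rule["type"] == "limitRange" and pref in result:
            low, high = int(rule["minValue"]), int(rule["maxValue"])
            clamped = min(max(int(result[pref]), low), high)
            result[pref] = str(clamped)  # values are kept as strings, as in the example
    return result


preset = {
    "preset": "HTML5 support",
    "prefs": {
        "dom.event.clipboardevents.enabled": {
            "value": "true", "type": "freezeValue", "reason": "Breaks functionality ...",
        },
        "some.other.setting": {
            "value": "9", "type": "limitRange",
            "maxValue": "3", "minValue": "0", "reason": "...",
        },
    },
}

chosen = {"dom.event.clipboardevents.enabled": "false", "some.other.setting": "9"}
effective = apply_preset(preset, chosen)
print(effective)
```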

nodiscc commented 7 years ago

I have created an example file at https://github.com/nodiscc/userjs-schema to help define an example schema, and given both of you push access, please add any changes you would like to see (it will probably be easier this way - @allo- I get your point about category ID and link tuples, not sure I understand the when_active logic you would want. @Genbox feel free to add an example translation)

Genbox commented 7 years ago

Thanks @nodiscc

Genbox commented 7 years ago

Not sure how to do the translations. Might be best to have localized files such as en-US.json that contains the descriptions, warnings etc.

As long as the design accommodates localization as it does now, it should be easy to add later on when the rest is in place. For now, we should focus on functionality.

allo- commented 7 years ago

You should really use Gettext, as there are many programs used by translators, which support this. I use for example mostly gtranslate, I think some Ubuntu projects have some webbased program. The nice thing about gettext is, that the usual generators comment out the old version of changed entries and the translation programs still read them and suggest them as base to update the text for the new translation.

But converting between basic gettext and json will be no problem either. It just depends on how the strings are extracted. I had them extracted from the usual python-gettext extractor, but did not implement a new method after moving the settings from python code to json files, yet.

Genbox commented 7 years ago

As for the schema, any pros/cons when it comes to a hierarchical structure for the categories? Currently, my datafile is split into categories and settings, instead of having a category property on each setting:

{
    "Name": "API and DOM",
    "Categories": [{
            "Name": "HTML5",
            "Settings": [{
                    "Name": "browser.send_pings"
                }
            ]
        }
    ]
}

I know that the example by @nodiscc is limited by the structure of the pyllyukko/user.js file, so maybe it is just a matter of converting the file into the flat format proposed in https://github.com/nodiscc/userjs-schema/blob/master/example.json and then organizing it into the hierarchical data format.
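A minimal sketch of converting between the two layouts, here from the hierarchical structure above to flat entries that carry their category path. The key names "Name", "Categories" and "Settings" come from the example; the output keys and the function name are hypothetical and are not taken from example.json.

```python
def flatten(node, path=()):
    """Walk nested categories and return one flat entry per setting."""
    path = path + (node["Name"],)
    flat = []
    for setting in node.get("Settings", []):
        flat.append({"setting": setting["Name"], "category": list(path)})
    for child in node.get("Categories", []):
        flat.extend(flatten(child, path))
    return flat


tree = {
    "Name": "API and DOM",
    "Categories": [{
        "Name": "HTML5",
        "Settings": [{"Name": "browser.send_pings"}],
    }],
}

flat = flatten(tree)
print(flat)
# prints [{'setting': 'browser.send_pings', 'category': ['API and DOM', 'HTML5']}]
```

Going the other way would mean grouping flat entries by their category path, so either layout can be treated as the source of truth.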

allo- commented 7 years ago

I think we should not use the category to identify the setting. If you implement profiles, for example, one profile may have the categories "Browser" and "Web" and another the categories "Basic" and "Paranoid". In both you want to include the same settings snippets, but sort them into different categories.

allo- commented 7 years ago

I think this should have mostly "foreign keys", like "Category = [setting1_id, settings2_id, setting4_id]". This makes it easier to reorder the items, possibly even dynamically (you had the idea of i.e. filtering by (broken) features), which sounds great.

Genbox commented 7 years ago

@allo-, I'm not sure what you mean. I aim to keep it pretty simple, and any changes to the UI such as wildly different categories would be confusing for a user.

allo- commented 7 years ago

Let's assume I want to have two profiles, dynamically or statically generated from single json files with setting snippets (i.e. what I am currently doing ;-)).

Then you may have one profile which categorizes "send_pings" technically as "API and DOM" and another one which sorts it into "Dangerous" (e.g. for Tor users). Another one would put it into "Phoning Home", etc.

So i think this should be kept more flat with references like in a database. Then you have settings, categories and profiles as separate lists, which reference their children. Then a category would map to a list of settings-ids, which can be used to get the setting data from the settings-list.

in a simplified version:

{
    "categories": {
        "category1": {
            "name": "displayed name",
            "settings": ["setting1", "setting2"]
        }
    },
    "settings": [
        { ...}
    ],
    "profiles": {
        "profile1": ["category1", "category2"]
    }
}

Or, like I am doing, because I think you do not need to reuse whole category lists (and if you need to, copy & paste is sufficient):

{
    "settings": [
        { ...}
    ],
    "profiles": {
        "profile1": [
            {
                "category1": {
                    "name": "displayed name",
                    "settings": ["setting1", "setting2"]
                }
            }
        ]
    }
}

So i can just add a second profile with the same settings and different sorting:

{
    "profiles": {
        "all on one page profile": [
            {
                "category1": {
                    "name": "displayed name",
                    "settings": ["setting1", "setting2"]
                }
            }
        ],
        "many categories profile": [
            {
                "category1": {
                    "name": "displayed name",
                    "settings": ["setting1"]
                }
            },
            {
                "category2": {
                    "name": "another category name",
                    "settings": ["setting2"]
                }
            }
        ]
    }
}
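A minimal sketch of resolving the database-like references, using the first, fully flat variant from this comment (separate "categories", "settings" and "profiles" entries). It assumes each setting carries its ID in a "name" field, as in battery.json; the function name and the sample data are hypothetical.

```python
def resolve_profile(data, profile_id):
    """Return the profile as a list of (category display name, [setting dicts])."""
    by_id = {setting["name"]: setting for setting in data["settings"]}
    pages = []
    for category_id in data["profiles"][profile_id]:
        category = data["categories"][category_id]
        pages.append((category["name"], [by_id[sid] for sid in category["settings"]]))
    return pages


# Made-up data in the flat layout; setting bodies are reduced to their IDs.
data = {
    "categories": {
        "category1": {"name": "displayed name", "settings": ["setting1", "setting2"]},
        "category2": {"name": "another category name", "settings": ["setting2"]},
    },
    "settings": [{"name": "setting1"}, {"name": "setting2"}],
    "profiles": {
        "profile1": ["category1", "category2"],
        "profile2": ["category2"],
    },
}

for title, items in resolve_profile(data, "profile1"):
    print(title, [item["name"] for item in items])
```

Because categories only reference setting IDs, the same setting can appear under different categories in different profiles without being duplicated.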

Genbox / HardenIT

Publish schema of the JSON format #2