node-red / nrlint

Node-RED Flow Linter
Apache License 2.0
36 stars 15 forks source link

Refactor Linter Rule API to use WebWorkers #6

Closed knolleary closed 3 years ago

knolleary commented 3 years ago

This is quite a large refactor that covers a number of items to be considered. I'm sorry about the size of this.

1. Moved Flow Manipulation API to its own npm module

I have moved the api to @node-red/flow-parser. I propose to eventually move it to its own repository. This allows us to require the API in both the CLI and WebWorker version of the API, rather than have to maintain two copies of it.

I have some separate feedback on some of the behaviour of the API - for example, the way it handles subflows and that it doesn't know about groups.

2. Restructure rules

I am proposing a slightly different file structure for modules that provide rules, as well as a way for rules to share metadata such as available configuration options. This is very influenced by eslint rules.

For example, nrlint-plugin-core now looks like this:

├── lib
│   ├── index.js
│   ├── plugin.html
│   └── rules
│       ├── avoid-loops.js
│       ├── flow-size.js
│       ├── http-in-response.js
│       └── named-functions.js
├── package.json
├── resources
│   └── nrlint-plugin-core-worker.js
├── test
│   └── rule_spec.js
└── webpack.config.js
  1. each rule is in its own file under lib/rules.

  2. lib/index.js is the main entry point to the module. It re-exports all of the rules:

    module.exports = {
        "avoid-loops": require("./rules/avoid-loops"),
        "flow-size": require("./rules/flow-size"),
        "http-in-response": require("./rules/http-in-response"),
        "named-functions": require("./rules/named-functions")
    };
  3. Each rule module should export an object as follows:

    module.exports = {
    
        meta: {
            type: "suggestion",
            severity: "warn",
            docs: {
                description: "limit the number of nodes on any one flow"
            },
            schema: [
                {
                    type: "object",
                    properties: {
                        maxSize: { type: "number" }
                    }
                }
            ]
        },
        check: checkFlowSize
    };
    • meta.type - what sort of rule is this. eslint has suggestion problem and layout
    • meta.severity - the default severity this rule reports at (can be overriden in config)
    • meta.docs.description - one line description of the rule
    • meta.schema - a JSON Schema of the valid configuration options for this rule. Will can use this to auto generate a settings panel in the editor (future feature).
    • check - the function that will get called to run the rule
  4. The check function behaviour is slightly different.

    • It still receives the three arguments - flow, config, context.
    • It does not return anything.
    • When it needs to report a failure, it calls context.report - this is the same model used by eslint.
          cxt.report({
              location: [flow.id],
              message: "flow too large"
          })
      • location - an array of ids of the affected object (they were ids in the previous version)
      • message - a message to display. (future: as with eslint, this could be a message-identifier to points to a message in the meta object, allowing for i18n).
      • severity - optional property of warn or error to override the config
  5. The lib/plugin.html file is the plugin file loaded by the editor. It contains:

    <script>
    RED.plugins.registerPlugin('nrlint-plugin-core', {
        type: 'nrlint-rule',
        worker: 'nrlint/nrlint-plugin-core-worker.js'
    });
    </script>
    • The main item here is the worker property that names the path to the worker file for this plugin.
  6. Each plugin has a webpack.config.js

    const path = require('path');
    
    module.exports = {
        entry: path.resolve(__dirname, './lib/index.js'),
        output: {
            filename: 'nrlint-plugin-core-worker.js',
            path: path.resolve(__dirname, 'resources'),
            library: {
                name: ['nrlint','rules'],
                type: 'assign-properties'
            }
        },
        mode: 'production',
        target: ['webworker']
    };
    • it packs lib/index.js to resources/nrlint-plugin-core-worker.js. The name of the output is not important as long as it matches the worker property in plugin.html
    • The library configuration means all of the rules exported by lib/index,js get added to the global variable nrlint.rules.
  7. webpack should be a devDepdendency - part of the npm release process will involve running the build and publishing the version with the pre-build resources files in.

3. nrlint-main

I have changed the main linter.lint function to not be async - as we don't run it as async in the editor anyway. If we need the ability to have async rules, we should think about how to introduce that back in cleanly.

4. Configuration format changes

I'm proposing a slightly different format to the nrlintrc.js file.

module.exports = {
    plugins: [
        "nrlint-plugin-core",
        "nrlint-plugin-func-style-eslint"
    ],
    rules: {
        "avoid-loops": true,
        "flow-size": { maxSize: 6 },
        "function-eslint": {
            parserOptions: {
                ecmaVersion: 6
            },
            rules: {
                semi: 2
            }
        },
        "http-in-response": true,
        "named-functions": true
    }
};

I think this flatter structure will be easier for the user to not have to worry about which rule is provided by which module.

5. package.json changes

I've removed the prepare tasks as I think the webpack stage is a build-time action before being published to npm.

I've added scripts to build each of the modules and copy their output to the top level /resources directory. This is helpful at development time, but ultimately I think that will have to change so you npm install each sub module individually, rather than npm install the top level module into NR.

6. TODO: None of the tests work

As I write this now, I realise I haven't updated any of the tests to work with these changes.

Next Steps

If there are any of these changes you disagree with, then please do say. I may have got slightly carried away here and this turned into a bigger refactor than I intended.

If you are happy with these proposals, the next steps will be:

  1. move @node-red/flow-parser to its own repository
    1. review the handling of subflows. For example, should a call to next() go into the subflow, or step over the subflow.
    2. add support for groups
  2. Fix the tests
  3. Add some scripts to automate build/publish to npm to help manage releases
  4. Add a CHANGELOG.md
  5. Update the README to properly reflect all of these changes
  6. Come up with some more rules to include in the core set!
  7. UI Updates - there are some updates needed to the linter sidebar which I'll propose separately.
knolleary commented 3 years ago

I have spent some time looking at the Flow Manipulation API and how the current rules work.

I have also spent some time looking more at how eslint rules work.

Lots of our rules want to check nodes of a particular type - such as the Function node, or HTTP In node. At the moment, every rule has to iterate over the complete Node list to find the nodes it cares about. I think we can do better.

With eslint, a rule can register a function for certain events during the parsing. For example, it can register a function to be called for every if... statement.

I suggest we implement a similar approach.

The check property is passed the config and context objects. It returns an object that provides functions that should be called by the linter based on the key value in the object.

 Example - named-function

module.exports = {
    meta: {
        type: "suggestion",
        severity: "warn",
        docs: {
            description: "ensure all Function nodes have a name"
        },
        schema: []
    },
    check: function(context, ruleConfig) {
        return {
            "type:function": function(node) {
                if (!!!node.config.name) {
                    context.report({
                        location: [node.id],
                        message: "Function node has no name"
                    })
                }
            }
        }
    }
};

 Example - unique-http-in-paths

This example checks that all HTTP In nodes have unique paths. It uses the start function to initialise its own state, the type:http in to visit each node and capture its url and the end function to check for any with more than one node using that url.

module.exports = {
    meta: { ... },
    check: function(context, ruleConfig) {
        let seenPaths;
        return {
            "start": function(flowConfig) {
                seenPaths = {};
            }
            "type:http in": function(node) {
                if (node.config.url) {
                    seenPaths[node.config.url] = seenPaths[node.config.url] || [];
                    seenPaths[node.config.url].push(node.id);
                }
            }
            "end": function(flowConfig) {
                for (var url in seenPaths) {
                    if (seenPaths[url].length > 1) {
                        seenPaths[url].forEach(id => {
                            context.report({
                                location: [id],
                                message: "Duplicate HTTP In url"
                            })
                        })
                    }
                }
            }
        }
    }
};

The key benefit here is the linter can do one iteration over all of the objects and call the lint rules as needed - this will be much more efficient than each rule searching for itself.

For the rules that need to see the 'big' picture, they can use the start or end event with full access to the flow configuration.

I understand I am presenting a lot of changes here - but I hope you can see the value they will bring.

I will raise a new PR tomorrow (that builds on top of this PR) that implements this suggestion so you can see it in action.

k-toumura commented 3 years ago

Thank you for creating the PR. I think this is a great improvement. I'm taking a look at it little by little...

Regarding "5. package.json changes": nrlint-plugin-func-style-eslint uses eslint4b package, and the package should installed before webpack is called. Should we add some installation process on build-func-style-eslint-rules script?

knolleary commented 3 years ago

I think we should consider splitting this one repository into three and move away from having multiple modules in here.

  1. nrlint -> contains the core of the linter including all the 'core' rules as a single module.
  2. nrlint-plugin-func-style-eslint -> because of its larger size
  3. @node-red/flow-parser

Having said that, I am wondering if splitting func-eslint out is a good idea or not. Once bundled, it isn't as big as I thought it would be. It might be cleaner to keep all of our rules the nrlint bundle.

But regardless of that, to answer your question, yes will want to have build script that takes the source and creates the files that get npm published. End users should not need webpack installed if they are just installing and using the linter from npm. Webpack should only be a devDependency for people who want to help develop the rule.