Complex rules need a 2nd pass traversal or at least callback after traversal finished

adroste commented 3 years ago

Is your feature request related to a problem? Please describe

Sometimes a rule is too complex to be calculated in a single pass. With workarounds it is often possible to collect and cache relevant elements/structures during a single traversal pass, but then a callback after traversal has finished is needed.

Describe the solution you'd like

Ability to hook into a 2nd traversal pass. This would be two lines inside lib/test-rule.js:

module.exports = function testRule({ moddleRoot, rule }) {
  const reporter = new Reporter({ rule, moddleRoot });
  traverse(moddleRoot, node => rule.check(node, reporter));
  if (rule.check2)
    traverse(moddleRoot, node => rule.check2(node, reporter));
  return reporter.messages;
};

Describe alternatives you've considered

Provide a callback after traversal finished. Would also be two lines inside lib/test-rule.js:

module.exports = function testRule({ moddleRoot, rule }) {
  const reporter = new Reporter({ rule, moddleRoot });
  traverse(moddleRoot, node => rule.check(node, reporter));
  if (rule.onFinish)
    rule.onFinish(node, reporter)
  return reporter.messages;
};

Possible workarounds right now

Override the messages property on the reporter object to be a getter function.

    factory() {

        function onFinish(node, reporter) {
            reporter.report("TestID", "Test");
        }

        function check(node, reporter) {

            if (!reporter._messages) {
                reporter._messages = [];
                reporter.report = (id, message) => {
                    reporter._messages.push({ id, message });
                }
                Object.defineProperty(reporter, 'messages', {
                    get: () => {
                        onFinish(node, reporter);
                        return reporter._messages;
                    },
                });
            }

        }

        return { check };
    }

This is obviously not a pretty solution.

nikku commented 3 years ago

Sounds like an interesting feature request at hand. Could you be a little bit more concrete with what you consider a complex rule?

We're happy to improve the existing APIs. Exposing an additional check as you proposed could be a sensible improvement. To do that we'd need to understand the concrete problem at hand though.

adroste commented 3 years ago

I have a lot of custom rules that follow the scheme:

collect relevant elements during traversal
perform checks after every relevant element in tree got visited

A few examples requiring an `afterCheck` callback

1. Check if DataObjectReference has input/output associations

```js export const dataInputOuputAssocationsRequired = () => ({ category: RuleCategoryEnum.ERROR, factory(binding) { const dataInputAssociations = []; const dataOutputAssociations = []; const dataObjectReferences = []; function afterCheck(node, reporter) { dataObjectReferences.forEach(({ id }) => { const receivesInput = dataInputAssociations.some( ({ sourceRef }) => sourceRef[0]?.id === id); const receivesOutput = dataOutputAssociations.some( ({ targetRef }) => targetRef?.id === id); if (!receivesInput) reporter.report( id, 'Ausgehende Datenassoziation fehlt' ); if (!receivesOutput) reporter.report( id, 'Eingehende Datenassoziation fehlt' ); }); } function check(node, reporter) { if (is(node, 'bpmn:DataInputAssociation')) { dataInputAssociations.push(node); } else if (is(node, 'bpmn:DataOutputAssociation')) { dataOutputAssociations.push(node); } else if (is(node, 'bpmn:DataObjectReference')) { dataObjectReferences.push(node); } } return { check, afterCheck }; } }); ```

2. Check if custom extension "Assets" are unused - comparison: with/without callback

**Without `afterCheck` callback** ```js export const noUnusedAssets = () => ({ category: RuleCategoryEnum.WARN, factory(binding) { const assets = []; const refs = []; function updateReport(reporter) { const unused = assets.filter(({ id }) => refs.every(({ refId }) => refId !== id)); reporter.messages = []; if (unused.length) { unused.forEach(asset => { reporter.report( asset.id, `Asset "${asset.name}" ist unbenutzt.` ); }); } } function check(node, reporter) { if (is(node, 'pb:Assets')) { assets.push(...node.get('assets')); updateReport(reporter); return; } if (is(node, 'pb:AssetRef')) { refs.push(node); updateReport(reporter); return; } } return { check }; } }); ``` **With `afterCheck` callback** ```js export const noUnusedAssets = () => ({ category: RuleCategoryEnum.WARN, factory(binding) { const assets = []; const refs = []; function afterCheck(node, reporter) { const unused = assets.filter(({ id }) => refs.every(({ refId }) => refId !== id)); unused.forEach(asset => { reporter.report( asset.id, `Asset "${asset.name}" ist unbenutzt.` ); }); } function check(node, reporter) { if (is(node, 'pb:Assets')) { assets.push(...node.get('assets')); } else if (is(node, 'pb:AssetRef')) { refs.push(node); } } return { check, afterCheck }; } }); ```

Example description that requires a 2nd traversal pass

3. Check if DataInput/Output is consumed by contained element

I have a custom extension for tasks/events that requires DataInput/Output (via Associations) to be mapped or consumed. Mapping happens inside nested extension elements. In order to check for unused data input/output or correct mappings I have to find all visible base elements (Events, Activities, ...) that are source or target of the DataAssociations and then traverse all nested inner elements of them to find the mappings. This can be done two ways 1. Either find the base elements and then traverse all inner elements of them during check (requires access to traversal algorithm) 2. Find and cache the base elements, run 2nd traversal pass and check mappings (custom extension elements) on the fly

Conclusion

An afterCheck callback is not necessary if you have a 2nd traversal pass hook. This could be combined with an early return/abort of the dfs traversal algorithm. See #53

nikku commented 3 years ago

https://github.com/bpmn-io/bpmnlint/pull/54 implements a simple extension to the given check.

A rule may now be explicit about whether it would like to check a node on enter, leave, or both:

module.exports = function someRule() {

  return {
    check: {
      enter: function(node, reporter) { ... },
      leave: function(node, reporter) { ... }
    }
  };
}

adroste commented 3 years ago

Works for me, see https://github.com/bpmn-io/bpmnlint/pull/54#issuecomment-752307743

bpmn-io / bpmnlint