AdguardTeam / tsurlfilter

AdGuard content blocking library
GNU General Public License v3.0

Compatibility table, converter #90

Open scripthunter7 opened 1 year ago

scripthunter7 commented 1 year ago

We should detect incompatible rules during the linting process and offer converted rules as fixes, if possible. In order to do this, we need a compatibility table / converter. Theoretically, this could be implemented directly within linter rules, but a standalone API would be more convenient.
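As a rough illustration of the standalone API idea, a linter rule could call a converter and turn the result into a problem report with an optional fix. Everything in this sketch is hypothetical: the syntax names, the modifier names, `convertModifier`, `checkModifier`, and the `LintProblem` shape are placeholders, not part of tsurlfilter or any existing linter.

```typescript
// Hypothetical sketch: none of these names exist in tsurlfilter.
type Syntax = "syntax_a" | "syntax_b";

interface ConversionResult {
    compatible: boolean; // true if the modifier can be used as-is in the target syntax
    converted?: string;  // equivalent modifier name in the target syntax, if one is known
}

// Placeholder data: one entry per modifier, with its name in each syntax.
// A missing key means the syntax does not support the modifier.
const MODIFIER_TABLE: Record<string, Partial<Record<Syntax, string>>> = {
    "mod-one": { syntax_a: "mod-one", syntax_b: "mod-1" },
    "mod-two": { syntax_a: "mod-two" },
};

// Look the modifier up by its name in the source syntax, then read the target name.
function convertModifier(name: string, from: Syntax, to: Syntax): ConversionResult {
    const entry = Object.values(MODIFIER_TABLE).find((e) => e[from] === name);
    const target = entry?.[to];
    if (target === undefined) {
        return { compatible: false };
    }
    return target === name
        ? { compatible: true }
        : { compatible: false, converted: target };
}

interface LintProblem {
    message: string;
    fix?: string; // suggested replacement, when a conversion exists
}

// What a linter rule might do with the converter result.
function checkModifier(name: string, from: Syntax, to: Syntax): LintProblem | null {
    const result = convertModifier(name, from, to);
    if (result.compatible) {
        return null;
    }
    return result.converted !== undefined
        ? { message: `"${name}" is not supported in ${to}`, fix: result.converted }
        : { message: `"${name}" has no known equivalent in ${to}` };
}
```

Keeping the lookup outside the linter rule means the same table could serve other tools, which is the convenience argument made above.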

This API should be able to:

Points worth considering:

Please note that this compatibility table is quite a bit different from the one in the Scriptlets library, since it must be able to convert from any syntax to any syntax, and it must also handle selectors, HTML filtering rules, etc.

@ameshkov I opened this issue so that we can schedule this as part of future developments. What do you think?

ameshkov commented 1 year ago

I think a compatibility table will be VERY useful, but in order to make it "maintainable", we need the filter maintainers' expertise here.

The idea is that we should have a table in the repo that filter maintainers can contribute to. This compatibility table should be filled in incrementally; at first, we may cover only part of the rules.

scripthunter7 commented 1 year ago

@ameshkov I haven't specified the details yet, but my idea is to store the data in a TS object, a JSON file, or a YML file. A TS or YML file would be best, since both allow comments. I think these are clear and easy-to-manage formats.

Of course, we will add documentation that describes this structure exactly.

ameshkov commented 1 year ago

Makes perfect sense to me

scripthunter7 commented 1 year ago

I have been thinking about compatibility tables. Here is a draft of what I have in mind.

First of all, there is an important point to keep in mind: we need a maintainable, convenient data structure. Even if a build step later has to convert it into a form that is "optimal" for performance, the source itself should be easy to maintain.
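To make the "converted later by a build" idea concrete, here is a minimal sketch of such a step. It assumes the maintainable source is an object keyed by platform with a list of records, and turns it into nested maps for constant-time lookup. The `ModifierRecord` shape, `buildIndex`, and the sample data are illustrative assumptions, not an agreed format.

```typescript
// Assumed record shape; the real set of fields is still undecided.
interface ModifierRecord {
    name: string;
    params: boolean;
    deprecated: boolean;
    version_added: string;
}

// Maintainable source form: platform -> list of records.
type SourceTable = Record<string, ModifierRecord[]>;

// Lookup form: platform -> (modifier name -> record).
type ModifierIndex = Map<string, Map<string, ModifierRecord>>;

// Hypothetical build step: index the records and reject duplicates early.
function buildIndex(source: SourceTable): ModifierIndex {
    const index: ModifierIndex = new Map();
    for (const [platform, records] of Object.entries(source)) {
        const byName = new Map<string, ModifierRecord>();
        for (const record of records) {
            if (byName.has(record.name)) {
                throw new Error(`Duplicate modifier "${record.name}" for ${platform}`);
            }
            byName.set(record.name, record);
        }
        index.set(platform, byName);
    }
    return index;
}

// Sample data for illustration only.
const source: SourceTable = {
    adg_for_windows: [
        { name: "script", params: false, deprecated: false, version_added: "1.0" },
    ],
};

const index = buildIndex(source);
const scriptRecord = index.get("adg_for_windows")?.get("script");
```

With this split, maintainers only ever edit the list form, and the indexed form can change freely without touching the data files.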

What I was thinking about is the following:

What should be detailed in the records?

The fields may differ between categories, so it is advisable to keep each category in a separate file.

Here is a draft of what the modifiers.yaml file could look like:

adg_for_windows:
- name: script
  params: false
  deprecated: false
  version_added: '1.0'
  docs: https://kb.adguard.com/en/general/how-to-create-your-own-ad-filters#script-modifier
  description: The rule corresponds to script requests, e.g. javascript, vbscript.

or modifiers.json:

{
  "adg_for_windows": [
    {
      "name": "script",
      "params": false,
      "deprecated": false,
      "version_added": "1.0",
      "docs": "https://kb.adguard.com/en/general/how-to-create-your-own-ad-filters#script-modifier",
      "description": "The rule corresponds to script requests, e.g. javascript, vbscript."
    }
  ]
}
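Whichever file format is chosen, the records could be checked against a type when they are loaded. The sketch below mirrors the fields of the draft above and validates a parsed `modifiers.json` at runtime. The names `ModifierRecord`, `isModifierRecord`, and `parseModifiers` are hypothetical, and the field set is only what the draft shows.

```typescript
// Fields taken from the draft record above; names of types and functions are hypothetical.
interface ModifierRecord {
    name: string;
    params: boolean;
    deprecated: boolean;
    version_added: string;
    docs: string;
    description: string;
}

// Runtime type guard: checks that every drafted field is present with the expected type.
function isModifierRecord(value: unknown): value is ModifierRecord {
    if (typeof value !== "object" || value === null) {
        return false;
    }
    const r = value as Record<string, unknown>;
    return typeof r.name === "string"
        && typeof r.params === "boolean"
        && typeof r.deprecated === "boolean"
        && typeof r.version_added === "string"
        && typeof r.docs === "string"
        && typeof r.description === "string";
}

// Parse the JSON text and validate each platform's record list.
function parseModifiers(json: string): Record<string, ModifierRecord[]> {
    const data: unknown = JSON.parse(json);
    if (typeof data !== "object" || data === null || Array.isArray(data)) {
        throw new Error("Expected an object keyed by platform");
    }
    const result: Record<string, ModifierRecord[]> = {};
    for (const [platform, records] of Object.entries(data as Record<string, unknown>)) {
        if (!Array.isArray(records) || !records.every(isModifierRecord)) {
            throw new Error(`Invalid records for platform "${platform}"`);
        }
        result[platform] = records;
    }
    return result;
}

const sample = JSON.stringify({
    adg_for_windows: [
        {
            name: "script",
            params: false,
            deprecated: false,
            version_added: "1.0",
            docs: "https://kb.adguard.com/en/general/how-to-create-your-own-ad-filters#script-modifier",
            description: "The rule corresponds to script requests, e.g. javascript, vbscript.",
        },
    ],
});

const modifiers = parseModifiers(sample);
```

A check like this would catch a contributor's typo in a field name at load time instead of during conversion.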

Points to consider:

ameshkov commented 1 year ago

Sounds good to me. I think we should start with something; if we realize that there are any flaws in the chosen structure, converting it to a different one would not be too complicated compared to filling out the table.

scripthunter7 commented 1 year ago

> I think we should start with something; if we realize that there are any flaws in the chosen structure, converting it to a different one would not be too complicated compared to filling out the table.

Yes, a scope of this size probably cannot be covered perfectly at first, but we will see the edge cases in time.