teemtee / fmf

Flexible Metadata Format
GNU General Public License v2.0
22 stars 28 forks source link

Cannot use parenthesis in `--filter` regexp #244

Closed kkaarreell closed 5 months ago

kkaarreell commented 10 months ago
In https://github.com/RedHat-SP-Security/keylime-tests
$ tmt --root . -c arch=x86_64 -c distro=rhel-9 test ls --filter 'name: (/setup|/functional).*'
        warn: /setup/generate_upstream_rust_keylime_code_coverage:description - None is not of type 'string'
        warn: /setup/install_etc_keylime_conf:recommend - None is not valid under any of the given schemas

missing ), unterminated subpattern at position 1

But

$ tmt --root . -c arch=x86_64 -c distro=rhel-9 test ls --filter 'name: /functional.*'
        warn: /setup/generate_upstream_rust_keylime_code_coverage:description - None is not of type 'string'
        warn: /setup/install_etc_keylime_conf:recommend - None is not valid under any of the given schemas
/functional/agent_UUID_assignment_options
/functional/basic-attestation-on-localhost
/functional/basic-attestation-with-concatenated-certificates
/functional/basic-attestation-with-custom-certificates
/functional/basic-attestation-with-ima-signatures
/functional/basic-attestation-with-unpriviledged-agent
/functional/basic-attestation-without-mtls
/functional/db-mariadb-sanity-on-localhost
/functional/db-mysql-sanity-on-localhost
/functional/db-postgresql-sanity-on-localhost
/functional/durable-attestion-sanity-on-localhost
/functional/ek-cert-use-ek_check_script
/functional/ek-cert-use-ek_handle-custom-ca_certs
/functional/install-rpm-with-ima-signature
/functional/keylime-non-default-ports
/functional/keylime_create_policy-static-data
/functional/keylime_tenant-commands-on-localhost
/functional/keylime_tenant-ima-signature-sanity
/functional/measured-boot-policy-sanity
/functional/measured-boot-swtpm-sanity
/functional/service-logfiles-logging
/functional/tenant-runtime-policy-sanity
/functional/tpm-issuer-cert-using-ecc
/functional/tpm_policy-sanity-on-localhost
/functional/use-multiple-ima-sign-verification-keys

$ tmt --root . -c arch=x86_64 -c distro=rhel-9 test ls --filter 'name: /setup.*'
        warn: /setup/generate_upstream_rust_keylime_code_coverage:description - None is not of type 'string'
        warn: /setup/install_etc_keylime_conf:recommend - None is not valid under any of the given schemas
/setup/configure_default_revocation_notifier/agent
/setup/configure_default_revocation_notifier/zeromq
/setup/configure_kernel_ima_module/ima_policy_signing
/setup/configure_kernel_ima_module/ima_policy_simple
/setup/configure_swtpm_device
/setup/configure_tpm_emulator
/setup/enable_keylime_coverage
/setup/enable_keylime_debug_messages
/setup/generate_coverage_report
/setup/generate_upstream_rust_keylime_code_coverage
/setup/inject_SELinux_AVC_check
/setup/install_etc_keylime_conf
/setup/install_rust_keylime_from_copr
/setup/install_upstream_keylime
/setup/install_upstream_rust_keylime
/setup/switch_git_branch
The-Mule commented 6 months ago

IMO this is simply how fmf.filter works. Although it is not precisely documented it expects DNF expression without parentheses (but respecting the correct order of precedence). The right expression would be:

❯ tmt --root . -c arch=x86_64 -c distro=rhel-9 test ls --filter 'name: /setup.* | name:/functional.*'
        warn: /setup/generate_upstream_rust_keylime_code_coverage:description - None is not of type 'string'
        warn: /setup/install_etc_keylime_conf:recommend - None is not valid under any of the given schemas
/functional/agent_UUID_assignment_options
/functional/basic-attestation-on-localhost
/functional/basic-attestation-with-concatenated-certificates
/functional/basic-attestation-with-custom-certificates
/functional/basic-attestation-with-ima-signatures
/functional/basic-attestation-with-unpriviledged-agent
/functional/basic-attestation-without-mtls
/functional/db-mariadb-sanity-on-localhost
/functional/db-mysql-sanity-on-localhost
/functional/db-postgresql-sanity-on-localhost
/functional/durable-attestion-sanity-on-localhost
/functional/ek-cert-use-ek_check_script
/functional/ek-cert-use-ek_handle-custom-ca_certs
/functional/iak-idevid-register-with-certificates
/functional/install-rpm-with-ima-signature
/functional/keylime-non-default-ports
/functional/keylime_create_policy-static-data
/functional/keylime_tenant-commands-on-localhost
/functional/keylime_tenant-ima-signature-sanity
/functional/measured-boot-policy-sanity
/functional/measured-boot-swtpm-sanity
/functional/service-logfiles-logging
/functional/tenant-runtime-policy-sanity
/functional/tpm-issuer-cert-using-ecc
/functional/tpm_policy-sanity-on-localhost
/functional/use-multiple-ima-sign-verification-keys
/setup/configure_default_revocation_notifier/agent
/setup/configure_default_revocation_notifier/zeromq
/setup/configure_kernel_ima_module/ima_policy_signing
/setup/configure_kernel_ima_module/ima_policy_simple
/setup/configure_swtpm_device
/setup/configure_tpm_emulator
/setup/enable_keylime_coverage
/setup/enable_keylime_debug_messages
/setup/generate_coverage_report
/setup/generate_upstream_rust_keylime_code_coverage
/setup/inject_SELinux_AVC_check
/setup/install_etc_keylime_conf
/setup/install_rust_keylime_from_copr
/setup/install_upstream_keylime
/setup/install_upstream_rust_keylime
/setup/switch_git_branch

So this is not really a bug in the filter but rather the documentation bug that confuses people. This becomes more complicated when one tries to write filters for expressions such as (L1 AND L2) | L3 (where Ln is a literal in TMT language, e.g. name: /setup.*. FMF filter corresponding to this expression is L1 & L2 | L3 (because & takes precedence (it is confusing without parentheses, right?)

The pain is more obvious when your human-readable form of filter tends to fit more into CNF, e.g. (L1 OR L2) AND L3 because to put this into FMF filter you needs write it int DNF (again without parentheses) L1 & L3 | L2 & L3. To avoid this it might be preferable to specify filter as a list of expressions - filter needs to match all elements of the list, e.g filter: ["L1 | L2", "L3"] (notice that tmt plan does not allow multiple filter attributes while I think commands line `--filter does).

All in all, this is not a bug. It is rather documentation bug. It says:

Help on function filter in fmf:

fmf.filter = filter(filter, data, sensitive=True, regexp=False)
    Return true if provided filter matches given dictionary of values

    Filter supports disjunctive normal form with '|' used for OR, '&'
    for AND and '-' for negation...

and it should perhaps be more explicit about use of parentheses:

    Filter requires disjunctive normal form with '|' used for OR, '&'
    for AND and '-' for negation. Although parentheses cannot be 
    used the operator precedence is respected (first -, then & and | last)...

Perhaps some examples can be added too.

The-Mule commented 6 months ago

Ah wait, you are talking about regexp, hm, yes, then it is really a bug. Sorry :).

psss commented 6 months ago

I believe the reported problem is caused by the | character used in the regular expression. The pipe is used as the OR operator in the fmf.utils.filter() function and thus the expression is split into two resulting into an invalid regexp. There is a pull request proposing support for escaping the pipe character:

@kkaarreell, could you please confirm the proposed escaping would work for and cover your use case? Can we move the issue to the fmf project?

kkaarreell commented 6 months ago

@kkaarreell, could you please confirm the proposed escaping would work for and cover your use case? Can we move the issue to the fmf project?

I guess it will work. I do not like it TBH because that would make such a regular expression to look differently that one would expect from a regular expression, but I expect this is a price to pay due to fmf.filter design.

psss commented 5 months ago

Support for escaping the | character merged in #220.