expected python in-toto to consume 'REQUIRE' layout item.

cokieffebah commented 3 years ago

https://github.com/in-toto/docs/blob/master/in-toto-spec.md#4331-rule-processing

Artifact rules reside in the "expected_products" and "expected_materials" fields of a step and are applied sequentially on a queue of "materials" or "products" from the step's corresponding link metadata. 
They operate in a similar fashion as firewall rules do. This means if an artifact is successfully consumed by a rule, it is removed from the queue and cannot be consumed by subsequent rules.

is inconsistent with the behavior of the reference implementation: https://github.com/in-toto/in-toto/blob/develop/in_toto/verifylib.py#L1053

# Initialize empty consumed set as fallback for rules that do not consume
# artifacts. All rules except "disallow" and "require" consume artifacts.

In an 'expected_materials', I had a REQUIRE followed by a DISALLOW * and was confused that the disallow kept triggering. We eventually decided to implement: ["REQUIRE", "product/index.html"], ["ALLOW", "product/index.html"], because ALLOW consumes the item. The above code seems awkward I think REQUIRE should consume the item. Let me know if I should open an issue with python in-toto ... or if this documentation is incorrect.

Thanks

lukpueh commented 3 years ago

Thanks for reporting this inconsistency! I agree that the specification is a bit vague in this regard and that your example makes a good argument for having the REQUIRE rule also consume artifacts.

IIRC we chose to not do this in the reference implementation (nor in the golang implementation for that matter), in order to better draw the line between artifact rules that "only" consume artifacts in the queue, and rules that can make the verification fail [see in_toto/verifylib.py#L972-L976]:

    The consumption of artifacts by itself has no effects on the verification.
    Only through a subsequent "DISALLOW" rule, that finds unconsumed artifacts,
    is an exception raised. Similarly does the "REQUIRE" rule raise exception,
    if it does not find the artifact it requires, because it has falsely been
    consumed or was not there from the beginning.

But yes, your example shows that the behavior can be awkward in practice. Until we fix this, I'll add it to our list of artifact rule related considerations (#4).

On a related side-note: We added REQUIRE rule support relatively late to the implementation, because we weren't sure about the expected behavior with regards to the rule's file pattern (see in-toto/in-toto#193). In the current implementation, the REQUIRE rule is the only rule that does not accept a file pattern, which is also not reflected by the specification.

cc @SantiagoTorres

cokieffebah commented 3 years ago

Thanks for the explanation, maybe the 'REQUIRE' behavior should become two different rules. I think

'... because it has falsely been consumed or was not there from the beginning'

are actually two different use cases and I was using 'REQUIRE' for the latter. Basically I expected 'REQUIRE' to be a stronger 'ALLOW' with error. I am curious about the use case for the former case of 'falsely consumed'

sudo-bmitch commented 3 years ago

Include me in the list of people confused by this behavior since it didn't match the documentation. The behavior I expect:

["ALLOW", "X"]: consumes X, no error if X is missing ["REQUIRE", "X"]: consumes X, error if X is missing

In both cases, I'd expect wildcards to be valid, and if X contains a wildcard that matches nothing, REQUIRE would error.

SantiagoTorres commented 3 years ago

Hi! This is something we've been trying to nail down for a little while, and I'd be more than happy to help it be more intuitive/powerful. Consider the following rule:

REQUIRE *

Put in prose, it means "require everything", yet if in practice "everything " is "nothing", then the result is counter intuitive (i.e., the previous link reported nothing, but the requirement wasn't fulfilled because everything can't be nothing). Part of me feels that we could extend the semantic to make this slightly more expresive (e.g., by changing the * to a +, regex style). This way we could perhaps say "everything can be nothing" or "everything should be something".

This of course becomes a little more nuanced when the wildcard is not just * but foo* or so...

I wonder if this, in a sense, could allow us to settle on whether require should consume as well as the other rules (i.e., then it would become a stricter match)

cokieffebah commented 3 years ago

I think I know understand my confusion ... @SantiagoTorres or @lukpueh correct me if I am wrong ... the REQUIRE and DISALLOW were intended to be the 'final' handlers and not intended as intermediate processing filters that consume artifacts. Also, only one of REQUIRE|DISALLOW was intended, basically making you choose between (roughly) blacklisting and whitelisting. When I used REQUIRE, in my case that generated this ticket, I ended with a DISALLOW while expecting the REQUIRE to throw an error or consume the artifact during REQUIRE's intermediate processing.

SantiagoTorres commented 3 years ago

I think I know understand my confusion ... @SantiagoTorres or @lukpueh correct me if I am wrong ... the REQUIRE and DISALLOW were intended to be the 'final' handlers and not intended as intermediate processing filters that consume artifacts. Also, only one of REQUIRE|DISALLOW was intended, basically making you choose between (roughly) blacklisting and whitelisting. When I used REQUIRE, in my case that generated this ticket, I ended with a DISALLOW while expecting the REQUIRE to throw an error or consume the artifact during REQUIRE's intermediate processing.

Yup, exactly! I think we definitely need to make a better job at communicating this though...

cokieffebah commented 3 years ago

I am wondering if my use case is valid by in-toto standards. Should there be an intermediate, consuming, error-throwing clause? (asking, not requesting) My case/thought was 'If any of these things are missing, throw an error, and if there is anything else throw an error'

pseudo-code, its been a while since I wrote an inspection...
["REQUIRE scan.log", "from static_code_analysis"],
["REQUIRE artifact.tar.gz", "from package"],
["REQUIRE unit_test_results/*", "from unit_test"],
["DISALLOW", "*"]

sudo-bmitch commented 3 years ago

REQUIRE *

Put in prose, it means "require everything", yet if in practice "everything " is "nothing", then the result is counter intuitive (i.e., the previous link reported nothing, but the requirement wasn't fulfilled because everything can't be nothing).

To make sure I understand the non-intuitive part, what do you think this should do, error or silently succeed? My own take is this should error, and if you didn't want an error, then ALLOW would have been a better option.

Regardless of how the issue is solved, the documentation does not match the implementation, so one of these should be fixed.

in-toto / specification

expected python in-toto to consume 'REQUIRE' layout item. #45