Actually, data-types 1 and 2 can be grouped together as "data inside of ByzCoin" and data-types 3 and 4 are "data outside of ByzCoin". TL;DR: verification using data inside of ByzCoin is easy; otherwise it's hard and we need an oracle or a collective oracle (coracle?) like authprox.
@nkcr could you envision something like this being used in your project? Or is it best to wait for more info on the exact additional data that's going to be available...
Various scattered thoughts:
I like the classification of the types of contextual data around a verification decision. I especially like the idea that this is potentially the moment to build in the "plug in spot" for externally validated info (i.e. oracles). But is Protean a better way to do that? Is it possible ByzCoin should stay out of that entirely?
It is not clear how this would work in offline DARCs; I found it surprising that you propose to add this in ByzCoin contract verification instead of in the DARC implementation directly.
The Swiss Re project is going to require good feedback, i.e. your request to validate project X was refused because data set Y requires that you are an employee of subsidiary Z, but you are not, and data set Q can only be used for line of business R, but your project is in line of business S. Feedback for why a transaction is refused has been a problem for a long time, but we need to make sure that darc support for additional data doesn't make it harder to solve.
> I like the classification of the types of contextual data around a verification decision. I especially like the idea that this is potentially the moment to build in the "plug in spot" for externally validated info (i.e. oracles). But is Protean a better way to do that? Is it possible ByzCoin should stay out of that entirely?
It's certainly possible that the Swiss Re project can be implemented on Protean. But to solve our immediate problems I don't see how Protean will help us significantly more than just using one or more (decentralized) oracles plus byzcoin. I also don't know too much about the access control story in Protean.
> It is not clear how this would work in offline DARCs; I found it surprising that you propose to add this in ByzCoin contract verification instead of in the DARC implementation directly.
This is a good question and maybe we can do some of it offline. The reason I only thought about doing it online is that the offline part of our current DARC verification only has the evaluation of the expression. All the other verification that depends on the byzcoin state, like signer counters as well as delegation, can only be done online. To do it offline, we would need to write a function similar to `darc.EvalExpr` which takes a callback to access the oracle or byzcoin state.
> The Swiss Re project is going to require good feedback, i.e. your request to validate project X was refused because data set Y requires that you are an employee of subsidiary Z, but you are not, and data set Q can only be used for line of business R, but your project is in line of business S. Feedback for why a transaction is refused has been a problem for a long time, but we need to make sure that darc support for additional data doesn't make it harder to solve.
Good point, maybe we should fix the current error message situation first...
One thought that we have from C4DT is to add a new expression-type of the following form:

`evm:bevmID-addressID-viewMethod`

Where the different elements are as follows:

* `bevmID` refers to the InstanceID of the ByzCoinEVM instance
* `addressID` is the Ethereum address in this EVM instance
* `viewMethod` is called on the contract at `addressID` and must return a boolean; if it is true, this part of the expression resolves successfully

This would allow any kind of action that is expressible in smart contracts on Ethereum, with the added benefit of being able to write the contracts using the Stainless verification framework. That way, instead of having to compile new code and restart all conodes, you could update the smart contract and reference it from the Darcs.
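To make this concrete, here is a minimal sketch in Go of how such an identity string could be split into its parts before dispatching the call to the EVM. All names are illustrative assumptions; none of this is existing cothority code.

```go
package main

import (
	"fmt"
	"strings"
)

// evmIdentity captures the three parts of the proposed
// "evm:bevmID-addressID-viewMethod" expression-type.
type evmIdentity struct {
	BEvmID     string // InstanceID of the ByzCoinEVM instance
	AddressID  string // Ethereum address inside that EVM instance
	ViewMethod string // view method that must return a boolean
}

// parseEvmIdentity splits the expression into its components;
// error handling for malformed input is kept deliberately simple.
func parseEvmIdentity(expr string) (*evmIdentity, error) {
	const prefix = "evm:"
	if !strings.HasPrefix(expr, prefix) {
		return nil, fmt.Errorf("not an evm identity: %s", expr)
	}
	parts := strings.SplitN(strings.TrimPrefix(expr, prefix), "-", 3)
	if len(parts) != 3 {
		return nil, fmt.Errorf("expected bevmID-addressID-viewMethod, got %s", expr)
	}
	return &evmIdentity{BEvmID: parts[0], AddressID: parts[1], ViewMethod: parts[2]}, nil
}

func main() {
	id, err := parseEvmIdentity("evm:deadbeef-0x1234-isAllowed")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", id)
	// A verifier would now call id.ViewMethod on the contract at
	// id.AddressID inside the EVM instance id.BEvmID.
}
```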
An alternative that we should explore: instead of using an extra field like `AdditionalRules`, we could introduce new "identities" that represent additional requirements. For example: `invoke:modify_record => ed25519:doctor1 & time_interval:9AM;5PM`. In fact, those are not identities anymore, they are "atoms" which can be evaluated to either true or false. There needs to be a new callback that checks for the non-identity atoms in `darc.EvalExpr`.
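A minimal sketch of the dispatch idea behind such a callback, with hypothetical names (the real `darc.EvalExpr` works on parsed expressions, not raw tokens):

```go
package main

import (
	"fmt"
	"strings"
)

// evalAtom is a hypothetical callback: it decides whether a
// non-identity atom such as "time_interval:9AM;5PM" holds.
type evalAtom func(atom string) (bool, error)

// evalToken resolves one token of a rule: identities are matched
// against the known-good identities, everything else is handed to
// the atom callback.
func evalToken(token string, goodIdentities []string, eval evalAtom) (bool, error) {
	if strings.HasPrefix(token, "ed25519:") || strings.HasPrefix(token, "x509:") {
		for _, id := range goodIdentities {
			if id == token {
				return true, nil
			}
		}
		return false, nil
	}
	// Non-identity atom, e.g. time_interval:9AM;5PM.
	return eval(token)
}

func main() {
	ok, err := evalToken("time_interval:9AM;5PM", nil, func(atom string) (bool, error) {
		// A real implementation would check against the block time here.
		return true, nil
	})
	fmt.Println(ok, err)
}
```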
Yes, I called the identities `expression-type` instead, but it's the same idea. And instead of adding new identity-types over time, only add one identity-type that points to EVM contracts.
This would also be the missing link between the (b)evm implementation we're currently doing and the rest of OmniLedger.
> An alternative that we should explore: instead of using an extra field like `AdditionalRules`, we could introduce new "identities" that represent additional requirements. For example: `invoke:modify_record => ed25519:doctor1 & time_interval:9AM;5PM`. In fact, those are not identities anymore, they are "atoms" which can be evaluated to either true or false. There needs to be a new callback that checks for the non-identity atoms in `darc.EvalExpr`.
I like the idea and I think this is going in the right direction.
It seems that we are discussing a rather large change, like migrating the authorization model from Discretionary Access Control (DAC) to Attribute-Based Access Control (ABAC) (nice description here).
Maybe we should take some inspiration from Next Generation Access Control (NGAC)? => "NGAC is a flexible and expressive approach to specifying and enforcing a wide variety of policies over distributed systems.[1]".
> RBAC / ABAC
Very nice link, thanks. Now I learnt that Darcs are Role Based Access Control ;) And the next step should be ABAC, so far we're on the same page.
I looked quickly at NGAC, and there seem to be multiple elements that will be difficult to implement in a consensus system like ByzCoin, because when you want to replay a block, you might not get the same answers. Also, when doing consensus, each server will have to access each of the blocks in NGAC, which might not be possible due to firewalls. You could solve this by using something like https://www.town-crier.org/what-is-tc.html, but that seems even more complex.
XACML also looks very interesting, but I don't see how you would link the expressions to instances in ByzCoin. One example they give is this:

> Allow access to resource MedicalJournal with attribute patientID=x

How do you define the resource `MedicalJournal`, and more importantly, how do you fetch the attribute `patientID`?
Also, whenever you want to add a new type of resource, you need to update all nodes to do so.
Which leaves me with the proposition I made about pointing to Ethereum smart contracts via the `evm:bevmID-addressID-viewMethod` identity:

* easy replay of blocks, as all elements/attributes are stored in the global state
* easy updating to new attributes/rules through new smart contracts
* one-time change to darcs by adding the `evm:bevmID-addressID-viewMethod` identity
I think this is a good direction, i.e., allowing new types of rules in the DARCs.
The point on replaying blocks is an important one for this discussion. Do we want to allow access control policies in the DARCs that cannot be replayed? Policies based on time, for example, cannot be replayed. Is that something we want to support or is it better to offload it somewhere else? For example, authprox authentication cannot be replayed because it may evolve over time. But the conodes collectively sign the authentication request and we trust the identities when we replay it.
My intuition so far tells me DARCs should only deal with access control policies in the context of byzcoin, which means we only deal with requests that can be replayed. If there are requests/policies that cannot be replayed, then they need to be offloaded to a (collective) third party which byzcoin trusts. DARCs should not understand how the third-party verification is done; they only care about the signatures in the transactions. Depending on the policy, we can also use sysadmin techniques like firewalls, etc.
> Do we want to allow access control policies in the DARCs that cannot be replayed?
No, we don't want to do that. And we never should. A blockchain must be self-contained and be able to hold all its proofs within the blocks it's made of.
If you want to allow for environmental proofs (I don't know the correct name) you're opening a big Pandora's box. You will have to make sure that all nodes see the same thing, and you will not be able to prove afterwards that what you did was correct.
> Policies based on time, for example, cannot be replayed.
If you ask an EVM contract about its time, it won't take it from the RTC clock of the computer, but from the block itself. And we trust the block because 2/3+1 of the nodes checked the time is correct by verifying that it is greater than the last block's time and not too far in the future. So time-based policies CAN be replayed.
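To illustrate the point, a replayable time check reads the timestamp from the block itself instead of the local clock. A toy sketch (the nanosecond timestamp convention is an assumption, not the actual skipchain API):

```go
package main

import (
	"fmt"
	"time"
)

// withinInterval checks a "9AM-5PM"-style policy against the
// timestamp stored in the block, NOT against time.Now(), so that
// replaying the block yields the same answer on every node.
func withinInterval(blockTimestamp int64, fromHour, toHour int) bool {
	t := time.Unix(0, blockTimestamp).UTC()
	return t.Hour() >= fromHour && t.Hour() < toHour
}

func main() {
	// Illustrative timestamp taken from a (hypothetical) block header.
	blockTS := time.Date(2019, 7, 1, 10, 30, 0, 0, time.UTC).UnixNano()
	fmt.Println(withinInterval(blockTS, 9, 17)) // true: 10:30 lies within 9AM-5PM
}
```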
> authprox authentication cannot be replayed because it may evolve over time.
Authprox's goal is to create a signature from the outside that is stored in the blockchain and that will be used by the system to prove what happened. If you replay, the smart contract will not ask authprox to create another signature, but will trust the stored signature. In that way it's similar to TownCrier, but it doesn't use SGX; it uses the cothority's multi-signature scheme. So it's OK to create a signature and then trust this signature, because you then 'bind' the expression to this signature.
> DARCs should only deal with access control policies in the context of byzcoin
Definitely yes.
Thanks for the feedback Linus. To summarize, we think it's a good idea that DARCs only deal with policies that can be verified (and also replayed) in byzcoin. As such, any access control policies that depend on the environment will not be addressed by DARCs directly. The approach of authprox could be used in such cases.
To allow more forms of access control policies, we plan to introduce a new type called `xattr` (inspired by extended attributes in file systems, which are also used to implement additional access control policies that the typical file system attributes cannot support), in addition to `ed25519`, `x509` and `proxy`. What goes after `xattr` is its name, e.g., `xattr:evm`. What comes next is flexible, but with a few restrictions. ~We do not support characters such as `|`, `&`, `(` or `)` because they may introduce parsing ambiguities~ so the regex will look something like this: `xattr:[0-9a-zA-Z]+:[^ \t\n]+`. Note that due to the flexibility, it is also possible to write a JSON string, for example `xattr:evm:{"bevmID": "DEADBEEF"}`.
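For illustration, that format can be checked with Go's regexp package (a sketch using exactly the pattern given above):

```go
package main

import (
	"fmt"
	"regexp"
)

// xattrRe is the format proposed above: an alphanumeric name,
// followed by a flexible, whitespace-free value.
var xattrRe = regexp.MustCompile(`^xattr:[0-9a-zA-Z]+:[^ \t\n]+$`)

func main() {
	fmt.Println(xattrRe.MatchString(`xattr:evm:{"bevmID":"DEADBEEF"}`)) // true
	fmt.Println(xattrRe.MatchString(`xattr:bad name:value`))            // false
}
```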
The verification function used in the contract needs to parse and verify the extended attributes. Thus we introduce an evaluation function:
`func EvalExprXattr(expr expression.Expr, getDarc GetDarc, evalXattr func(xattr string, ids []string) error, ids ...string) error`
The `evalXattr` callback is the new element. The caller needs to provide it if the expression contains an extended attribute. The first argument is the extended attribute itself; the second argument is the list of identities that signed the transaction, which the `evalXattr` implementation may choose to use. Typically we implement `evalXattr` in the contract verification section.
EDIT: the ambiguity problem is not a problem
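Here is a self-contained sketch of how a contract might provide the callback, with stub types standing in for the real darc package; the attribute name `swissre-purpose` and the stand-in `evalExprXattr` are illustrative only:

```go
package main

import (
	"fmt"
	"strings"
)

// Stubs standing in for the darc package types so the sketch
// compiles on its own; the real types live in the darc package.
type Expr []byte
type GetDarc func(s string, latest bool) interface{}

// evalExprXattr stands in for the proposed darc.EvalExprXattr. The
// real implementation walks the expression; here we just forward a
// single attribute to show the callback contract.
func evalExprXattr(expr Expr, getDarc GetDarc,
	evalXattr func(xattr string, ids []string) error, ids ...string) error {
	return evalXattr(string(expr), ids)
}

func main() {
	declaredPurpose := "legal" // would come from the instruction's arguments
	evalXattr := func(xattr string, ids []string) error {
		allowed := strings.TrimPrefix(xattr, "xattr:swissre-purpose:")
		if !strings.Contains(allowed, declaredPurpose) {
			return fmt.Errorf("purpose %q not allowed by %q", declaredPurpose, allowed)
		}
		return nil
	}
	err := evalExprXattr(Expr("xattr:swissre-purpose:financial,legal"), nil, evalXattr)
	fmt.Println(err) // <nil>: "legal" is among the allowed purposes
}
```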
Let's try to implement a few use cases. Say we check a read request on a Project contract that specifies multiple datasets to query, `invoke:project.read(dataset1, dataset2, ...)`:
Dataset X can only be accessed by people from group G and not for financial or risk assessment purposes. DARC rule:

    invoke:project.read - ed25519:<identity of group G> &\
                          !xattr:for_purposes:["financial", "risk_assessment"]
Dataset X cannot be used in conjunction with dataset Y if outside the EU:

    invoke:project.read - !xattr:has:[<dataset id X>, <dataset id Y>] |\
                          xattr:ip_address:{approved_region: EU, trusted_key: <3rd party key>}
Dataset X can only be used with dataset Y and before 2020. If used outside the EU, it must not be for financial purposes. After 2020, dataset X can only be used for legal obligations:

    invoke:project.read - ( ( xattr:has:[<dataset id X>, <dataset id Y>] &\
                              xattr:time:{before: 2020} ) &\
                            ( !xattr:for_purposes:["financial"] |\
                              xattr:ip_address:{approved_region: EU, trusted_key: <3rd party key>} ) ) |\
                          ( xattr:time:{after: 2020} &\
                            xattr:for_purposes:["legal_obligation"] )
I think this works; the only question I have is how the purpose of the query is verified. Can you derive the purpose directly from the query/transaction? Or must the query/transaction declare the purpose, with us trusting that the declaration is correct?
> the query/transaction must declare the purpose and we trust that the declaration is correct
Yes, that's it. In the end, nothing prevents a user who gained access to some datasets under some restrictions from using them for something else.
I am a bit concerned about the complexity of the rules, which I imagine can be much more expressive than my examples. In the previous example I defined the rule at the query level, but it would be better if each dataset defined its own rules (i.e. each dataset is an instance of a contract and has its own DARC); then the only rule for the project's query would be:

    invoke:project.read - xattr:check_dataset_darcs

where `check_dataset_darcs` ensures that each rule of each selected dataset (each dataset's DARC) is followed. But then, the dataset's DARC must somehow have access to the selected datasets when checking the rule (for rules like "I must not be used with dataset Z").
The `xattr` rules are only in the DARCs, so they would be associated with the datasets. The format of the properties on the query (e.g., whether the purpose is financial or legal) is not specified and it's up to the contract author to specify it. So I'm imagining something like this:

Dataset x has rule `invoke:project.read => xattr:swissre-purpose:financial,legal` and dataset y has rule `invoke:project.read => xattr:swissre-usewith:dataset_x | xattr:swissre-purpose:legal`.

The client sends a transaction:

    invoke:project.read
    purpose: financial
    dataset: x and y

Here the arguments `purpose` and `dataset` are going to be used in the verification function.
byzcoin sees the transaction and tries to verify it in the following way:

* `xattr:swissre-usewith:dataset_x` succeeds because the transaction is using dataset x
* `xattr:swissre-purpose:legal` doesn't need to be evaluated because the expression is an OR

If you want something concrete, here's the implementation for the DARC part: https://github.com/dedis/cothority/commit/53d688eb61775e064fe9357a7bde15c26c2f9164
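As an illustration of what such a verification function might do with the `purpose` and `dataset` arguments, here is a toy sketch; the attribute names follow the example above, everything else is invented:

```go
package main

import (
	"fmt"
	"strings"
)

// txArgs stands in for the arguments of the hypothetical
// invoke:project.read transaction above.
type txArgs struct {
	Purpose  string
	Datasets []string
}

// checkPurpose evaluates "xattr:swissre-purpose:<p1>,<p2>,..."
// against the purpose declared in the transaction.
func checkPurpose(xattr string, args txArgs) error {
	allowed := strings.Split(strings.TrimPrefix(xattr, "xattr:swissre-purpose:"), ",")
	for _, p := range allowed {
		if p == args.Purpose {
			return nil
		}
	}
	return fmt.Errorf("purpose %q not in %v", args.Purpose, allowed)
}

// checkUseWith evaluates "xattr:swissre-usewith:<dataset>" against
// the datasets the transaction actually reads.
func checkUseWith(xattr string, args txArgs) error {
	want := strings.TrimPrefix(xattr, "xattr:swissre-usewith:")
	for _, d := range args.Datasets {
		if d == want {
			return nil
		}
	}
	return fmt.Errorf("dataset %q not part of the request", want)
}

func main() {
	args := txArgs{Purpose: "financial", Datasets: []string{"dataset_x", "dataset_y"}}
	fmt.Println(checkUseWith("xattr:swissre-usewith:dataset_x", args)) // <nil>
	fmt.Println(checkPurpose("xattr:swissre-purpose:legal", args))     // error
}
```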
Following on Linus' comments, I believe this could be a great application of the EVM integration within ByzCoin. I am working on documenting this integration, as well as how it could be used for DARC extensions. Stay tuned... ;)
I created a PR https://github.com/dedis/cothority/pull/2017 that implements the idea. There is a test that shows how to use xattr in contracts. Let me know if you have any comments.
Discussing it with @cgrigis, this looks great. There might be some additional information, like the last successful block, that should be passed to the verification callback method. Perhaps that can go in the `ReadOnlyStateTrie`?
In general the skipchain-related information is not easily available in the verification function, except the index. I can imagine it being useful for more sophisticated verification functions. OTOH I feel `ReadOnlyStateTrie` should only deal with information in the trie. Perhaps we need a `ReadOnlySkipchain` interface and change `VerifyInstruction(ReadOnlyStateTrie, Instruction, []byte) error` to `VerifyInstruction(ReadOnlyStateTrie, ReadOnlySkipchain, Instruction, []byte) error`, which would be an incompatible change... maybe a sign to move to v4 @jeffallen?
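For concreteness, such an interface might look like the following sketch; the method set is hypothetical, only `skipchain.SkipBlock` is an existing type:

```go
package byzcoin // sketch only

import "go.dedis.ch/cothority/v3/skipchain"

// ReadOnlySkipchain sketches the proposed interface; the exact
// method set is an open question, not a settled API.
type ReadOnlySkipchain interface {
	// GetLastBlock would give verification functions access to the
	// latest block, e.g. for index- or time-based attribute checks.
	GetLastBlock() (*skipchain.SkipBlock, error)
}
```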
OK, I see your point. Yes, a `ReadOnlySkipchain`, or `SkipchainState`, or `GlobalState`... might effectively be a nicer name than adding a `GetLastBlock` to `ReadOnlyStateTrie`.
The current version of the EVM integration documentation is available here (it's still WIP...).
At Friday's meeting we agreed that the functionality will be essentially the same but the implementation will change significantly.
The extended attributes continue to live in the DARCs but the verification procedure will be different. Let's explain it by stating which part of the existing code will be changed.
At the moment the instruction verification function computes a list of "good identities" and then passes that into `darc.EvalExpr`. Instead of calling the extended attribute verification callback in `darc.EvalExpr` as in the old design, we'll produce a list of good identities that may include extended attributes. The advantage of this approach is that there will be minimal change in DARCs other than supporting the xattr expression.

Consequently, we need a way to figure out which extended attributes to verify and put into the "good identities" list. For example, suppose a rule looks like `pk1 & xattr1 | xattr2 | ... | xattrn`; it's not a good idea to evaluate all the xattrs because many of them won't matter for the final result of the expression. It's also difficult to intelligently figure out which xattr to verify such that the expression will be positive. In the case above we would need an intelligent way to do "try xattr1, if it fails try xattr2, etc.". In fact, if we traverse the expression to find the right xattr to evaluate, it would be very similar to the old solution. The most reasonable option I can see is to ask the client to provide the right xattr in the transaction. That is, right now we have `[]Identity` and `[]Signature` in the instruction; we could add a new type of "identity" that is an xattr with a corresponding empty signature.
This morning it was decided that this feature will go into v4 due to its incompatible API changes. Concretely, we're imagining that there will be a change in the `VerifyInstruction` function of the `Contract` interface to support a more general structure, e.g., `GlobalState`, which contains the trie as well as the skipblocks.
> Consequently, we need a way to figure out which extended attributes to verify and put into the "good identities" list.
The way I see it, it's very simple: every `Signature` the user gives must resolve one identity. It's either an identity that can be verified with a public key, or it's an identity that can be resolved with an xattr.
So you don't need to do any recursive search for the xattrs. You keep the loop in https://github.com/dedis/cothority/blob/master/byzcoin/transaction.go#L350 and add all good identities.
This means that only xattrs with a corresponding signature will be resolved. So if you have an `xattr:time:8am<t<5pm`, then you need a corresponding 'signature' proposing a time for this 'identity'. The `Verify` method, instead of checking the signature against the public key in the darc-expression, will now have to check whether the xattr-signature is valid (using block time and not RTC time), and if it is, add the xattr-identity to the "good identities".
I started with the plan you described (if I understood it correctly), i.e., unifying the attributes with the identities/signers. It was easy to create an attribute-identity, but as I started writing more code it became unclear whether creating an attribute-signer is a good idea. It's also not clear what an attribute signature would look like. Is it just a dummy signature?
The client needs to create a transaction, so he/she needs to give `[]darc.Identity` and `[][]byte`, which are the signatures. These are typically created using `FillSignerAndSignWith(signers ...darc.Signer)`. So the first uncertainty I had was whether there should be attribute signers and what the signatures would look like. I started by implementing dummy signatures, but that didn't feel right because the only thing you need to evaluate an attribute is the attribute-identity; you don't need the signatures. Further, we don't have a `SignerDarc` but we do have an `IdentityDarc`. I think that's also because DARCs can't sign transactions. The current implementation doesn't have a `SignerAttr` but it has an `IdentityAttr`, similar to the DARC situation. But these attributes are kept in a separate field in `Instruction` (not sure if that's good or bad yet, it's just the nicest approach I found at the time of writing).
Oops, closed by accident.
One more thing I find not ideal is that the client needs to know the attribute to use so that he/she can put it in the transaction, which means clients need to keep themselves up to date with the DARC.
Sorry for the late reply - 1st of August and all.
I did a short writeup here: https://docs.google.com/document/d/13p2_kE8nmWfBmsP9XzvjPr9LuMrhBHLitabJVcMIzdY/edit?usp=sharing
It should answer the following questions:

* `IdentityAttr` - not needed

However, while writing this I became aware that there is a nasty security bug that allows a node to change the attribute-'signature'!
Thanks for the write-up @ineiti.
Last week I was investigating the pros and cons of the approach that you wrote down. I still have a few doubts about the dummy signatures and the need for attribute signers.
The reason to use dummy signatures and attribute signers is that we want to include the attribute in the instruction. After trying a few approaches to implementing this feature, I think it is not necessary and causes extra overhead on the client. The reason is the following: for any attribute verification, the input of the verification is the global state and the output is an `error` type. So we don't need a dummy signature, because that would be an additional input. If the verification needs additional input, e.g., an existence proof like in calypso, then this input is a payload and it should go into the payload part of the transaction. The signature shouldn't contain payloads.

Further, for many attributes a dummy signature wouldn't make sense. For example, if the attribute says a certain action can only be performed after block index x, then the client cannot "sign" the transaction with the right block index because we're in an asynchronous network. The conodes will determine the block index used for the verification. In other words, I think the example from the document, `Signatures: { ..., []byte("9:14") }`, does not apply to many attributes.
Jeff and I decided to take a step back and explore and extend the original design that uses callbacks. The goal is to make sure it fits the needs of bevm, while making sure we can still use it in v3. The need of bevm, if I recall, is to have a few default implementations for attribute verification. We can do that by extending `BasicContract` (there is a test that shows how a child contract uses the attribute verification logic of the parent contract). The current code is not backwards compatible because the `Contract` interface is expanded. We can fix it by doing the casting trick on a small interface.
An issue raised during the Friday meeting on the callback approach is that the evaluation of the identities/signatures (which form `goodIdentities`) is not performed at the same place as the evaluation of attributes, which is done in `EvalExpr`. I argue this approach is OK and it fits the current convention. We don't need to do the attribute evaluation beforehand because clients are not telling us what to evaluate (see above for why it's better this way). It is also consistent with how the DARC identities work: DARC identities do not have signers and they are also evaluated in `EvalExpr` on demand.
In summary, I'm in favor of the callback approach because clients do not need to worry about sending the right attributes along with the transaction, we are able to implement it in a way that is backwards compatible, and it supports default attribute verification. Finally, this approach does not have the signature problem because there are no dummy signatures.
What's your opinion @ineiti @cgrigis ?
The implementer's always right ;) At least it solves the issue with respect to securing the attributes passed.
You will still need to pass some attributes, like purpose, where would those go?
As I wrote on slack: can you please modify the document I linked here to reflect your proposed changes? The goal is to have a document that reflects what is actually implemented, so feel free to remove whatever you think should not apply.
Sure, I'll update the document.
To answer the question on purpose: if some invoke command needs to be authorized differently depending on the purpose, then it's best to create multiple invoke commands. When that's not possible, the purpose goes into the transaction payload (e.g., one of the arguments of invoke), similar to what was suggested a few weeks ago.
The document and the implementation are updated.
It is backwards compatible, except that the `VerifyWithOptions` function is extended so that the caller may provide a list of callbacks to do the attribute verification. The casting trick is needed to convert a `ReadOnlyStateTrie` to a `ReadOnlySkipChain`, i.e., `roSC, ok := rst.(byzcoin.ReadOnlySkipChain)`. I imagine this will go away in v4. The test contains examples of using the attributes. It shows how to verify the attribute using the skipchain and additional data (i.e., the purpose scenario), as well as how to combine attribute interpreters from the parent class.
FWIW, I like it, particularly the fact that there are no longer "dummy signatures", as I was a bit uncomfortable with it ;-) I'll post more comments if I have them after I am done reading your updated doc and implementation.
Some of our partners may need to do additional verification using DARCs beyond relying only on public keys. For example, our DARCs cannot verify a statement like "requests for medical records must be made between 9 AM and 5 PM".
The type of additional data is not clearly specified at this point, so our implementation needs to be extensible. Nevertheless, from looking at a few examples in the XACML spec, we are able to classify the types of additional data into the following categories.
For the first two data-types, we can implement it in the following way. Add `AdditionalRules map[Action]Rule` to the DARC struct, where the key is an `Action` which must be an existing rule in `Rules`. We also add a `func (instr Instruction) VerifyAdditionalRules(st ReadOnlyStateTrie) error` which has the ability to verify some default `AdditionalRules`. Suppose a doctor is only allowed to modify patient data between 9 AM and 5 PM. The standard DARC rule would look like `invoke:modify_record => ed25519:doctor1 | ed25519:doctor2`. Then the additional rule would look like `time_interval:9AM;5PM`, keyed on `invoke:modify_record`. If `VerifyAdditionalRules` implements `time_interval` verification and is used by default in `BasicContract`, then the rule will be enforced automatically (a toy sketch of this mechanism follows below).

For the third data-type, we need an oracle to do the verification. I believe the rule can still be encoded the same way as above in `AdditionalRules`, but the contract author needs to write his/her custom verification function that contacts the oracle.

Finally, the fourth data-type cannot be easily verified collectively, because the information needed for verification only exists on one node. If it must be done, we need to offload the information to the oracle. For example, the oracle independently checks the IP addresses of clients, and the conodes will trust the oracle to correctly record the IP address.
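To make the mechanism for the first two data-types concrete, here is the toy sketch referenced above; the types are simplified stand-ins for the real darc ones, and the `time_interval` parsing is invented for illustration:

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Simplified stand-ins for the darc types.
type Action string
type Rule string

type darcWithExtras struct {
	Rules           map[Action]Rule // existing rules (simplified)
	AdditionalRules map[Action]Rule // proposed extra constraints per action
}

// verifyTimeInterval implements a default "time_interval:9AM;5PM"
// check against a given (block-derived) time.
func verifyTimeInterval(rule Rule, now time.Time) error {
	spec := strings.TrimPrefix(string(rule), "time_interval:")
	var from, to int
	if _, err := fmt.Sscanf(spec, "%dAM;%dPM", &from, &to); err != nil {
		return fmt.Errorf("bad time_interval %q: %v", spec, err)
	}
	if h := now.Hour(); h < from || h >= to+12 {
		return fmt.Errorf("hour %d outside %dAM-%dPM", h, from, to)
	}
	return nil
}

func main() {
	d := darcWithExtras{
		Rules:           map[Action]Rule{"invoke:modify_record": "ed25519:doctor1 | ed25519:doctor2"},
		AdditionalRules: map[Action]Rule{"invoke:modify_record": "time_interval:9AM;5PM"},
	}
	// The time would come from the block, not from the RTC clock.
	now := time.Date(2019, 7, 1, 10, 0, 0, 0, time.UTC)
	fmt.Println(verifyTimeInterval(d.AdditionalRules["invoke:modify_record"], now))
}
```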
Here's the proposed change, and it should be backwards compatible (only the new types and functions, no implementation).