algorand / go-algorand

Algorand's official implementation in Go.
https://developer.algorand.org/
Other
1.34k stars 468 forks source link

Move eval constants into consensus and normalize #3746

Open jeapostrophe opened 2 years ago

jeapostrophe commented 2 years ago

I had an unwelcome discovery today that there are some constants in the TEAL evaluator that aren't in consensus.go and others that have different values...

For example MaxLogSize isn't in it: https://github.com/algorand/go-algorand/blob/master/data/transactions/logic/eval.go#L59

And MaxLogCalls is 32 in eval.go but in consensus.go is something really really big --- https://github.com/algorand/go-algorand/blob/master/config/consensus.go#L536

I know it takes a long time to add stuff to consensus parameters, but please put this on your list

Also, I'd like MaxLogSize to be a bit bigger, like bigger than MaxAppTotalArgLen

jannotti commented 2 years ago

There are three kinds of constants being discussed here. In config/consensus.go there is a ConsensusParams type, and its fields are generally considered "consensus parameters. There are also some package level variables, often with similar names -- compare, for example the field ConsensusParams.MaxTxGroupSize and the variable config.MaxTxGroupSize. Then there are constants defined for the AVM, like logic.MaxLogSize.

Let's dispatch with the simplest one first. The package level variables like config.MaxTxGroupSIze, or the one you've mentioned config.MaxLogCalls are only used as "allocbounds" - values that lie outside the consensus protocol and simply need to be big enough to allow the encoding of any of Algorand's msgpacked structs, regardless of protocol level. Notice that most of them are set to the largest value from any protocol level. You don't want these exposed to the protocol (as globals in AVM for example), as they are an internal implementation detail that helps prevent a malicious node from sending eating memory during message decoding.

Now, distinguishing a consensus parameter from an AVM constant is trickier. For example, the max size of an avm byte value is currently 4k. I don't think it would be appropriate to consider that a consensus value. Shrinking it that way would cause havoc, because AVM programs that have already been written and deployed might start to fail where they previously would have suceeded. But it would NOT be as disruptive for the max byte size to shrink for future AVM versions (though admittedly it might not be a great idea, it wouldn't be absolutely crazy, as shrinking the size for all programs would be). This means that these constants need to be gated on the TEAL version of the running program, not the consensus protocol version.

It's unclear which of those MaxLogCalls should be! Currently, in some sense, it is neither, because it has not been changed. If it were raised, as you're asking for, I suppose maybe it would be considered ok to apply it to previous teal versions (though an argument could be made that even that could create a problem for programs that want to fail if they log 33 times). If it were shrunk, previous teal versions would surely need to be grandfathered in, and it would be an AVM local thing, gated by program version, not protocol version. If that happened, I'd be inclined to say that the assembler should simply provide a constant, based on the version being assembled, not a global. I don't think we have any interesting constants like that. The version gated things I can think of are things like enabling backjumps. I don't think that's a flag you need access to in a program. If OpcodeBudget gets changed (raised, I hope!) that'll be an interesting example.

All of this is to say that I'm not sure how to best address this issue. I don't really think these are consensus parameters, they are likely locked forever in the teal versions published so far. I'm sure you would be loathe to waste precious bytecode space and opcode budget issuing global MaxLogCalls if, instead, the assembler could simply allow MaxLogCalls to turned into a hard coded integer (depending on version). It could then even be used as an immediate.

jeapostrophe commented 2 years ago

This is very enlightening, thank you. I agree with your assessment of MaxLogCalls/etc and the value of TEAL version determined constants.

To be clear about my needs, I don't care so much about TEAL programs being able to observe the values. My main need is to be able to easily find out what the limit was for my code analyzer. I was initially using the alloc bounds for the calls rather than the number defined in eval.go.

It's unclear which of those MaxLogCalls should be! Currently, in some sense, it is neither, because it has not been changed. If it were raised, as you're asking for

Another point of clarity... I want MaxLogSize to be more. I use Logs to recreate what are called Events in the Ethereum world. It is very typical for a program to emit an event for each function that is called. For example, in the ERC721 spec, when an NFT is transferred, you call transferFrom(from, to, which) and an event Transfer(from, to, which) is emitted. You might think that you could observe which functions are called and thus you don't need the events, but the same action could be caused by multiple function calls; for example there's also safeTransferFrom. In older versions of Reach, before log, I did that by having clients observe what functions were being called, but I've switched now to using events (logs). But, the problem with this is that MaxLogSize is 1024 and MaxAppTotalArgLen is 2048, so you can't actually log every function call. Even if you made the size of logs equal to MaxAppTotalArgLen, then you couldn't emit any more logs. My ideal would be that MaxLogSize was 2x MaxAppTotalArgLen.

jannotti commented 2 years ago

In a sense, with the way Algorand's inner transactions work, that might all be duplicate work. If there's an inner call to another contract, the entire call is automatically going to end up in the EvalDelta of the caller (and so on, recursively), on the chain. I don't think you gain anything by logging it explicitly, assuming you can read those on-chain breadcrumbs.

I just reread you comment about the same event from multiple calls, and don't quite understand it, but it doesn't seem like a great idea to try to duplicate every call for a corner case like that. Maybe log something only in those special cases?

jeapostrophe commented 2 years ago

When I say "the same action could be caused by multiple function calls", I'm referring to pseudo-C like this:

external transferFrom(from, to, which) {
 unsafeTransferFrom(from, to, which);
}
external safeTransferFrom(from, to, which) {
 require(nft[which] == from);
 unsafeTransferFrom(from, to, which);
}
internal unsafeTransferFrom(from, to, which) { 
 emit Transfer(from, to, which);
 nft[which] = to;
}

There are two entry-points into the code... transferFrom and safeTransferFrom... and they both have the same effect and events are logs of effects so that external observers can know that the effect happened.

An external observer wants to know "Was the NFT transferred?".

They can do this by knowing that the transfer happens by knowing the internal implementation details, that it happens when transferFrom or safeTransferFrom is called and thus look for explicit function calls happening. But by looking at function calls, they are not robust to when we add a new function:

external transferEverythingIOwn(from, to) {
 for ( which in mine[from] ) {
  unsafeTransferFrom(from, to, which);
 }
}

Instead, that external observation that they care about ("Was the NFT transferred?") is part of the external interface of the program... through an Event... rather than having the external observers need to know about the implementation details.

But, this relies on events being able to have as much information as the calls, because it is often the case that calls are the "source" of the event information.

I am just describing the "status quo" on smart contract platforms as exemplified by Ethereum.

barnjamin commented 2 years ago

If its the effect you're looking for, and the analog effect in Algorand is a global state change to represent a transfer, wouldn't that be in the EvalDelta?

algoanne commented 1 year ago

recapping the above, it seems there's one request here:

I want MaxLogSize to be more

and is this still important to you, @jeapostrophe ?

jeapostrophe commented 1 year ago

Yup