Closed sskeirik closed 3 years ago
I think our design intention in only setting a maximum on `concat` is, as you say, that everything else is implicitly limited by its source. `concat` is the one place where memory usage of a bad program could have run away, so we put an explicit check there. We don't enforce explicit limits elsewhere for several reasons: run speed is slowed by too many run-time checks; code complexity grows from managing those checks and the unit tests for that extra code; and what doesn't need to be defined now leaves flexibility for the future. If someday we allow larger programs, we'd have to evaluate at that time what the appropriate RAM limit would be. Maybe the implicit bound of the program's constants would still be enough; maybe we'd raise the cap on the 4k `concat` output. If we made tighter bounds now, we'd also have to have version-aware checks for all those bounds:

```
if ((version < 3 && size > 4096) || (version >= 3 && size > 65536)) { error }
```

This starts to get unwieldy, especially if it needs to happen in several places in the code.
That's a fair point about unwieldy version/size checks. If future code complexity is more of an issue than the runtime cost, I think the implementation could be simplified slightly: for version `n` or greater, assign some constant, e.g. `MaxByteArrayLen`, to the desired value, depending on the version, and then check:

```
if (version >= n && size > MaxByteArrayLen) { error }
```
As far as I can tell, this would keep the check code consistent and short at the cost of loading a global constant.
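To make the idea concrete, here is a minimal sketch in Go. The names `maxByteArrayLenVersion` and `checkByteArrayLen` (and the version/size values) are hypothetical, chosen for illustration, not identifiers from the actual interpreter source:

```go
package main

import "fmt"

// Hypothetical constants for illustration only -- not names from the
// actual TEAL interpreter source.
const (
	maxByteArrayLenVersion = 3    // first program version that enforces the cap
	MaxByteArrayLen        = 4096 // maximum byte array size, in bytes
)

// checkByteArrayLen rejects oversized byte arrays, but only for program
// versions that enforce the limit, keeping every call site identical.
func checkByteArrayLen(version uint64, size int) error {
	if version >= maxByteArrayLenVersion && size > MaxByteArrayLen {
		return fmt.Errorf("byte array length %d exceeds maximum %d", size, MaxByteArrayLen)
	}
	return nil
}

func main() {
	fmt.Println(checkByteArrayLen(2, 8192)) // older version: no error
	fmt.Println(checkByteArrayLen(3, 8192)) // newer version: error
}
```

The version gate lives in one place, so adding a new version only means updating the constants rather than editing every check site.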
Regardless of whether this kind of check is valuable for the codebase, I am in favor of documenting that a maximum byte array size exists.
I think this piece contributes to the bigger picture that TEAL is a language explicitly designed to be bounded in both the time and space dimensions --- an important selling point of the language, in my opinion.
In the same direction, one could note that TEAL programs have a maximum stack size, either stating the value explicitly or just noting that a maximum exists.
And I believe, with these two pieces in place, one can calculate the maximum possible memory usage by any TEAL program by doing:
`maximum byte array size * (stack size + scratch space size)`
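As a rough illustration of that formula: assuming the 4096-byte cap from this thread, plus a maximum stack depth of 1000 entries and 256 scratch slots (the stack and scratch figures are my assumptions here and should be verified against the spec), the worst case works out to about 4.9 MiB:

```go
package main

import "fmt"

// memoryUpperBound computes the worst case where every stack entry and
// every scratch slot holds a distinct maximum-size byte array.
func memoryUpperBound(maxByteArrayLen, maxStackDepth, scratchSlots int) int {
	return maxByteArrayLen * (maxStackDepth + scratchSlots)
}

func main() {
	// Assumed limits -- check these against the current TEAL specification.
	bound := memoryUpperBound(4096, 1000, 256)
	fmt.Printf("upper bound: %d bytes (~%.1f MiB)\n", bound, float64(bound)/(1<<20))
	// prints: upper bound: 5144576 bytes (~4.9 MiB)
}
```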
As far as I can tell, this information is not documented in the TEAL overview [1] or specification [2] pages:
[1] https://developer.algorand.org/docs/features/asc1/teal/ [2] https://developer.algorand.org/docs/reference/teal/specification/
But I may just have missed something. In any case, what are your thoughts about documenting this if it isn't already documented?
I'm protective of the specification, but I think it would be fair game to note this in 'guide' type documentation. Somewhere in the description introducing the TEAL execution environment and its uint64 and []byte types, we could note that byte strings are currently limited to at most 4096 bytes. I think non-specification documentation is a good place to talk about the implications of the specification. E.g., all inbound constants are limited to < 1k and `concat` is limited to 4k, therefore the largest value is 4k. Or: the largest value is 4k and it takes $foo instructions to build it, so the largest memory footprint possible is $bar.
Writing the text in guide form seems reasonable to me. Anything that helps build better mental models of system behavior is a win!
Also, as you say, there are tighter upper bounds on memory usage because of program size restrictions, which I haven't carefully worked through, but presumably one could prove the tightest upper bound possible. An idea for that: a program which loads a byte array of, say, 1024 bytes, uses `dup` and `concat` to get it to the maximum size, and then `dup`s for the rest of the program, or (if successful termination is required) follows that with a sequence of `drop`s plus pushing a final integer constant.
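As a sanity check on that construction, doubling via `dup` + `concat` from a 1024-byte constant reaches the 4096-byte cap in just two rounds. A sketch of the growth model (the sizes are the ones from this discussion, not derived from the spec):

```go
package main

import "fmt"

// doublingSteps counts how many dup+concat rounds it takes to grow a
// byte array from `initial` bytes up to (but not past) `limit` bytes,
// where each round concatenates the value with a copy of itself.
func doublingSteps(initial, limit int) (steps, final int) {
	size := initial
	for size*2 <= limit {
		size *= 2
		steps++
	}
	return steps, size
}

func main() {
	steps, final := doublingSteps(1024, 4096)
	fmt.Printf("%d rounds to reach %d bytes\n", steps, final)
	// prints: 2 rounds to reach 4096 bytes
}
```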
In the current implementation, copies of a byte string are by reference, so they take up no additional space. Assigning the same 4k string to many storage positions doesn't cost anything. A bad program might construct a 4088-byte string and `concat` that with an incremented 8 bytes from `itob`, thus making many different 4k strings. If someone managed to build 400 of those in a 1000-byte program, that would still only be about 1.6 MB.
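Checking the arithmetic behind that figure (a shared 4088-byte base plus a distinct 8-byte `itob` suffix, repeated 400 times):

```go
package main

import "fmt"

func main() {
	const baseLen = 4088 // shared base string, stored once by reference
	const itobLen = 8    // itob always produces 8 bytes
	const count = 400    // distinct 4k strings the bad program builds

	// Each distinct result string needs its own 4096-byte backing array.
	total := count * (baseLen + itobLen)
	fmt.Printf("%d bytes (~%.1f MB)\n", total, float64(total)/1e6)
	// prints: 1638400 bytes (~1.6 MB)
}
```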
OK, that is good to know. If the details of this upper bound have been worked out carefully, then providing a tighter upper bound on memory usage could be a fun fact that could be added to the guide text.
In any case, I don't think I have any more comments on this issue. Thanks for your time.
@JasonWeathersby this looks like the end result here would be documentation changes, thoughts?
Closing this because Brian has already taken care of it. Nothing left to deliver here.
Issue
From my reading of the TEAL specification/implementation, the intent is that TEAL byte array values on the stack should have a maximum size. This intent is currently enforced in a very implicit manner --- and the spec is officially silent on this issue. I think having a clearly specified maximum size in the specification, and enforcing it strictly, would be beneficial.
Current Situation
Currently, the TEAL specification implicitly enforces a maximum size on byte array values in the following manner:

1. Input opcodes, whose output size is bounded by their source:
   - `txn`/`gtxn`/`global` opcode families: scalar transaction/global fields are known to have a fixed size or maximum size
   - `txna`/`gtxna ApplicationArgs`: the total size of all application arguments is bounded to a maximum size
   - `txna`/`gtxna Accounts`: accounts have a fixed-size representation
   - `arg_n`/`argc` opcode families: argument size is limited implicitly by the maximum length of a logic signature
   - `load` opcode: can only load what has been stored with `store`, so not a true form of input
   - `byte_n`/`bytec` opcodes: byte constant size is limited implicitly by the maximum length of a logic signature or application program/clear state program
2. `concat`: this is the only opcode which can generate a byte array of a larger size --- the size of the resulting byte array is explicitly constrained by the TEAL interpreter to 4096 bytes or less
3. `substring` opcodes: only generate byte arrays of the same size or smaller --- so the maximum resulting byte array size is still bounded by the size of type (1) opcodes
4. `itob`: strictly does not operate on a byte array on the stack, but always produces one of a fixed size, so it is unproblematic

Potential Problems
In the future, if any of these implicit limits change (for example, if larger programs are allowed), then the current intent of the implementation (as best I understand it) to bound byte arrays to 4096 bytes or less will be violated.
Potential Responses
If my reading of the issue is right, then I think explicitly checking and documenting the maximum byte array size could be helpful.
Obviously, adding checks means contracts might take slightly longer to execute, so it depends on whether this change is worth that cost plus the cost in developer time to implement it.