So far, @Arachnid has commented on this EIP a bit in the PR. The discussion is reproduced here for convenience and expanded upon:
Can you provide example use-cases? What sort of oracles is this intended to support? Who would benefit from standardising such an interface?
The use case I had in mind originally was for answering questions about "real-world events", where each ID can be correlated with a specification of a question and its answers (so most likely for prediction markets, basically).
Both the ID and the results are intentionally unstructured so that things like time series data (via splitting the ID) and different sorts of results (like one of a few, any subset of up to 256, or some value in a range with up to 256 bits of granularity) can be represented.
Another use case could be for decision-making processes, where the results given by the oracle represent decisions made by the oracle (e.g. futarchies).
Can you expand on this in the EIP? And maybe make the title of the EIP more specific?
This seems to assume one particular type of oracle - one that returns exactly 32 bytes of data, and is a trusted party. There are many other types of oracle; what about them?
Regarding the trusted party factor: I've intentionally decided to start drafting the spec in as strict a manner as possible. With that said, there isn't a clear mandate about the authorization model, so it's not necessarily a single account which is authorized to make the report. Also, mechanisms like multisignature wallets, side/child chains, or something else may be used to distribute the trust if it was mandated to be a single account.
Regarding the 32 bytes of data: I am still debating and open to making the result an arbitrary-size blob of bytes. My contention is twofold:
Services like Oraclize would seem to demonstrate uses for more than 32 bytes of onchain data, however.
Yes, it is true that Oraclize does support more than 32 bytes of onchain data, and this is something which I personally am not settled on as well, but I would also be interested in hearing from the community whether or not they've got any use cases for more than 32 bytes.
Agree with @cag on the bytes32 thing. I know that's how Oraclize serves the data (as a string, which it's then up to you to parse), but I reckon what contract authors are usually doing as soon as they get that data is to squidge whatever they get from there back into 32 bytes so they can actually use it...
We've assumed everything is a bytes32 for Reality Check. On our current scheme (implemented in our dapp, the contract only knows somebody sent it a bytes32, it doesn't understand what's in it) this is intended to map as follows:
One hairy thing about this is that you often end up wanting to express "this question is invalid" or "I couldn't answer this question". In Augur they call this "-1". (This is a slightly different thing to isOutcomeSet(), which I would interpret as "have you reached a conclusion about this question" - which we handle separately.) The natural thing is to encode it as 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff, but that clashes with an actual -1 as a signed number...
On getOutcome() and isOutcomeSet(): Reality Check calls revert() in its equivalent to getOutcome() - ours is currently called getFinalAnswer() - if the outcome isn't set. This is intended to avoid the need to call the contract twice in a transaction, although you still want to be able to call isOutcomeSet() to find out what's going on, most likely for UI purposes but possibly for use in a contract.
I disagree with this being restricted to just bytes32, as that is a limitation of what current Oracle solutions report back with, and will have the side-effect of forcing consumers to convert bytes32 values to uint, int, bool, etc. In my opinion, it should be the responsibility of the Oracle reporting back to send the data in the right type to begin with, as in int, uint, etc. This would result in this interface having methods for each data type, but then there's no incurred cost to either the Oracle or the end user who needs that data. That's critical with decentralised Oracle projects, as on-chain value aggregation and reputation mechanisms will increase the gas cost to any user before that.
With @cag mentioning ChainLink, writing back to on-chain contracts with different value types is something it already supports with the following: int256, uint256 and bytes32. Again, it's not limited to those; it's simply what's already supported in its pre-release state.
To sum up, I don't think we should be implementing strict Oracle standards before we've really seen any established decentralised Oracle projects functioning yet. I feel it's too limiting before we've really seen what Oracles will be used for and how.
@jleeh I'm all for saving gas and we jumped through all kinds of hoops to make Reality Check economical but the gas cost of these conversions is extremely small - here's a demo contract that just logs an event, one with a bytes32->uint conversion and one without:
solc --gas gastest.sol

======= contract.sol:GasDemo =======
Gas estimation:
construction: 117 + 69000 = 69117
external:
  convUint(bytes32): 1313
  noconv(uint256): 1258
So in that case we're literally talking 55 gas, less with optimization. A simple send is 21,000 - the difference really isn't worth bothering with. The code to do this is also trivial.
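The demo contract being described is not reproduced in the thread; a minimal sketch of what it plausibly looks like, based on the function names in the solc output above (the event name is assumed), is:

```solidity
pragma solidity ^0.4.24;

// Sketch of the GasDemo contract discussed above: both functions log the
// same event, and the only difference is the bytes32 -> uint256 conversion,
// which accounts for the tens of gas seen in the estimates.
contract GasDemo {
    event Logged(uint256 value);

    function convUint(bytes32 data) external {
        emit Logged(uint256(data)); // explicit conversion, ~55 extra gas
    }

    function noconv(uint256 value) external {
        emit Logged(value); // no conversion needed
    }
}
```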
In Augur they call this "-1". (This is a slightly different thing to isOutcomeSet() which I would interpret as "have you reached a conclusion about this question" - which we handle separately).
This isn't accurate, Augur doesn't represent results as a single value, it represents them as an array of values. This is because the result is a distribution of assets to token holders, and that distribution may not be 100%/0%.
Invalid is represented by an equal distribution to all parties (along with a special flag to differentiate it from a valid exactly-middle resolution).
@edmundedgar Good point, just tested the same for int256 with a two's-complement hex and got the same result: a 55 gas difference.
@edmundedgar one scenario we consider is something like parsing bytes32 "425.53" into a uint256. How would you handle that? You can cheaply change the type, but the content isn't preserved. Based on what you describe above, it seems like a type is specified in the request and then converted by the Oracle Handler. Chainlink does something similar and assumes the response is a single EVM word (the function sig takes bytes32 but all that matters is that it's an EVM word), and then the conversion pretty much comes for free when invoking the callback function.
I'll confirm your suspicion that people are immediately parsing bytes32 into other types as soon as they arrive. We started with only bytes32 and got a lot of feedback parsing was a pain point. In our view, some conversions are cheaper than others, but they're all practically free for the Oracle Operator to compute, so it's better to be handled there. Also, contracts are complicated enough as it is, reducing the code they require to parse a response is beneficial for security and cost.
For the Oracle Operator to do this, being explicit about the type is important. We put the expected type in the query to make it clear for all parties involved what will be reported. For example, if a Consumer checks whether reportedEtherPrice > 100000, and one party is expecting a uint256 but it arrives formatted as bytes32, they will always be disappointed, because even a right-padded "1" would evaluate to greater than a left-padded 100000.
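To make the padding pitfall concrete, here is a small sketch (the contract and function names are illustrative, not part of any proposal): a short string literal is right-padded in a bytes32, so reinterpreting it as a uint256 yields an enormous number.

```solidity
pragma solidity ^0.4.24;

// Demonstrates why a right-padded "1" compares greater than 100000:
// string literals fill a bytes32 from the left, leaving trailing zeros.
contract PaddingPitfall {
    function misread() public pure returns (bool) {
        bytes32 asString = "1";                 // 0x31 followed by 31 zero bytes
        uint256 misinterpreted = uint256(asString);
        // 0x3100...00 is astronomically larger than 100000
        return misinterpreted > 100000;         // always true
    }
}
```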
@cag some thoughts on specification and how we talk about this stuff at Chainlink: We're all for ubiquitous language and so try to be pretty explicit about naming things. Because there's two parts to an Oracle, the on-chain and off-chain part, we specifically refer to the on-chain part as the "Oracle Contract", and have historically called the "Oracle Handler" an "Oracle Node" or "Oracle Operator". Those names might not be appropriate for a more general spec, but I think "Oracle" alone can be vague when discussing interactions.
We also refer to "Consumer" for the contract that receives the Result, and the "Requester" for the initiator in a Pull style interaction. They're often the same contract, but could be different, like if a request comes in from an Externally Owned Account. We also specify two interactions for the Pull based model, "Requesting" and "Fulfillment."
I'm with @jleeh that more concrete use cases would be helpful before standardizing, but this seems like the best place to get the conversation rolling.
@se3000 We're doing what seems to be the standard Ethereum way to handle decimals which is to specify a number of decimals and deliver data multiplied by that like that - ie if you expect a USD price to a precision of 0.1 cents, you would ask the oracle for a number in milli-dollars, do everything in the contract in milli-dollars and only do the conversion to USD in the UI. I think this is also what you're advocating. As you're suggesting, it feels icky to do anything involving parsing stuff in the contract.
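A minimal sketch of that decimals convention (all names here are illustrative, not part of the proposed standard): the oracle reports a USD price in milli-dollars, the contract does all its arithmetic in milli-dollars, and only the UI converts back to USD.

```solidity
pragma solidity ^0.4.24;

// Hypothetical consumer using the fixed-decimals convention described
// above: a report of 1234567 means $1,234.567 (3 decimals).
contract MilliDollarConsumer {
    uint256 public priceInMilliDollars;

    function receiveResult(bytes32, bytes32 result) external {
        priceInMilliDollars = uint256(result);
    }

    function costOfTen() external view returns (uint256) {
        return 10 * priceInMilliDollars; // still in milli-dollars
    }
}
```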
But the upshot of this is that when you interact with a contract, you need to know what the question specifies in terms of how it will interpret the data. If you've asked for a uint256 expecting 4 decimals, and the question actually asks the oracle to supply a uint256 using 13 decimals, you're going to have a bad day. I think this means that you can't usefully protect users of a contract by supplying data with different function signatures to distinguish times they want a bool from times they want a uint256 from times they want an int256, because they still need to look to the specific question asked to find out what kind of uint256 it is.

In other words, you need to distinguish different types of data, and it might be useful to have a common understanding of what they are, but that doesn't map cleanly to Solidity types, and often the consumer contract (as opposed to its user / UI code) won't need to know what the data type is either, so it seems simpler to deliver everything as a bytes32.
There shouldn't be a conversion gas cost, as the EVM doesn't have a conception of "type" in its memory; the Solidity compiler enforces type semantics, and there are EVM opcodes which assume a piece of memory is typed in some way and do an operation accordingly. I think the gas difference is an accident of function selector order and maybe implementation details regarding stack memory use. For example, Remix reports that in the following version of the GasDemo, funcA is actually the more expensive function by 2 gas (probably because funcA is the function selector being checked after funcB, and the temporary uint u was removed):
Fixed-point is a popular way of dealing with numbers that may have fractional parts, and whether it is binary or decimal, and how many fractional bits/decimal places get encoded often depends on the use case (or maybe on a whim ¯\_(ツ)_/¯). Still, I would say no matter what the details of the encoding, it is more efficient than using a string to represent numerical values on the blockchain.
Gnosis' use case for oracles is limited to the "one-of-many possibilities" and "signed integer (possibly with fixed-point)" result representations listed by @edmundedgar (I am considering stealing that list for the proposal). In this use case, the ID in the proposal corresponds to an IPFS document which specifies what the oracle reports on and how the result should be interpreted.
@MicahZoltu I know that I mentioned potentially cutting out a byte from the ID to support reporting up to 256 words, but does it make sense for Augur to report values via something like this proposal?
I'm wondering if it would make sense to standardize both push and pull type oracles in this EIP. If so, the terminology for oracle should be refined, as @se3000 notes.
Still, I am shy of using the term "oracle contract", as the oracle may just be somebody with an ordinary Ethereum account. Maybe it should be "push-type oracles" and "pull-type oracles"?
I am using "oracle" in the spirit of the definition "a priest or priestess acting as a medium through whom advice or prophecy was sought from the gods in classical antiquity."
@cag One thing we haven't really discussed here is how the question is formatted, except that it has a question, a type and may specify decimals - in the Gnosis context, what the content of that IPFS file looks like.
I don't know if that's too specific for the EIP which currently mainly talks about how the data is delivered, but I'd at least like to make sure that Reality Check supports something that Gnosis supports, or vice versa.
Augur's native output is an array of numbers where the length of the array is the number of possible outcomes (not including invalid, though in hindsight I think invalid should have been its own outcome) that sum up to a number the market creator chooses, associated with the market. This results in each outcome receiving a fraction of the winnings proportionate to the reported number for that outcome divided by the number associated with the market. E.g., [7500, 2500] means that holders of share 0 receive 7500/10000 while holders of share 1 receive 2500/10000, or 75% and 25%.
That being said, if someone wanted to create an adapter contract that converts between Augur native output and this it wouldn't be particularly hard, assuming the denominator (10,000 in the example above) times the number of outcomes is less than 2^256. Depending on how close you get to 2^256, you may have to resort to some clever bit packing, but in theory you could compress the result down to a single 256-bit value.
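A sketch of what such an adapter's packing step could look like (entirely hypothetical names; assumes each numerator fits in a fixed-width slot of 256 / numOutcomes bits, i.e. the simple case before any "clever bit packing" is needed):

```solidity
pragma solidity ^0.4.24;

// Hypothetical helper that packs an Augur-style payout array into a
// single bytes32 by giving each outcome an equal fixed-width slot.
contract PayoutPacker {
    function pack(uint256[] numerators) public pure returns (bytes32) {
        uint256 bitsPerOutcome = 256 / numerators.length;
        uint256 packed = 0;
        for (uint256 i = 0; i < numerators.length; i++) {
            // each numerator must fit in its slot
            require(numerators[i] < 2 ** bitsPerOutcome);
            packed = (packed << bitsPerOutcome) | numerators[i];
        }
        return bytes32(packed);
    }
}
```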
I'm sorry for the late response! Life's been... kinda crazy for me lately. Anyway...
@MicahZoltu Congrats on the Augur launch!
Correct me if I'm wrong, but I'm guessing that Augur results really only make sense in the context of this EIP if we're talking about a single universe right?
Additionally, in order for results to be interpreted in the way that was suggested by @edmundedgar, there would have to be an adapter which converted stuff like [0, 1] -> 1, [0, 0, 0, 1, 0, 0] -> 3, and [2500, 7500] -> 2500 (for Y/N, categorical, and scalar with four digits of granularity respectively)? I'm asking to see if this is possible, i.e. whether adding that list of suggested interpretations of the EVM word would still accommodate Augur's eligibility (outside of the scenario where a single word cannot describe the final state of the result).
@se3000 I've reread your comment and realized that there might have been a miscommunication! So in this spec, the oracle handler might map more correctly to what you are referring to as a consumer. I'm wondering if this is a fault of the terminology not being as readily apparent.
I'd like people's opinion on whether OracleHandler may be better called an OracleConsumer or something along those lines.
@edmundedgar About the IPFS file format for the description of what the oracle reports on, I personally think that it's out of scope for this EIP. Also, certain oracles (purely on-chain oracles, for example), may use a completely different ID strategy. For example, an oracle contract which reports on, say, blockchain difficulty for a certain block, may use the block number as the ID for the report.
I'd like to incorporate your list of result interpretations as suggestions in the EIP: I think there are too many ways to structure the result for that list to be considered exhaustive - e.g. consider a case where you have 4 uint64s in order to describe a value in 4D space. Maybe that's a bit far-fetched though.
One little task:
I like "OracleConsumer"
@edmundedgar About the IPFS file format for the description of what the oracle reports on, I personally think that it's out of scope for this EIP. Also, certain oracles (purely on-chain oracles, for example), may use a completely different ID strategy. For example, an oracle contract which reports on, say, blockchain difficulty for a certain block, may use the block number as the ID for the report.
Yes, that's something we saw at the workshop. Basically nobody else is involved in structuring information (as opposed to structuring where information comes from), so in practice even if we "standardize" it nobody except us will be using the "standard", so probably better to keep it out of a process for now.
I'd like to incorporate your list of result interpretations as suggestions in the EIP: I think there are too many ways to structure the result for that list to be considered exhaustive - e.g. consider a case where you have 4 uint64s in order to describe a value in 4D space. Maybe that's a bit far-fetched though
Yes, using them as suggestions makes sense. It certainly doesn't describe all possible cases, and our system is also designed to be extensible so you're not constrained by that list. BTW we've dropped the (signed) "int" case for now because describing its "null" case is hairy, maybe just leave that out.
Additionally, in order for results to be interpreted in the way that was suggested by @edmundedgar, there would have to be an adapter which converted stuff like [0, 1] -> 1, [0, 0, 0, 1, 0, 0] -> 3, and [2500, 7500] -> 2500 (for Y/N, categorical, and scalar with four digits of granularity respectively)? I'm asking to see if this is possible, i.e. whether adding that list of suggested interpretations of the EVM word would still accommodate Augur's eligibility (outside of the scenario where a single word cannot describe the final state of the result).
According to the contracts, the following is a valid reporting array: [0, 1500, 2500, 5000, 1000]
The reporting array is simply how to divide shares up among shareholders after the market ends. An example (not yet supported in the UI) for where the above may make sense is a market for "percentage of votes by presidential candidate". Users can go long or short on any candidate at any price, and make money based on how far they were in the right direction.
I got one more candidate: OracleReceiver. Here's my reasoning behind the proposal:

- OracleHandler is too ambiguous: to handle does not evoke the right sense of what implementations of this interface should do.
- OracleConsumer is closer, but I had a silly image of a creature eating the oracle in my mind. Joking aside, this also may suggest the producer-consumer problem. I don't believe this captures the entire space of possible implementations, as the producer-consumer problem assumes the existence of a message queue.
- OracleReceiver may be bootstrapped off of the concept of receivers in information theory. This is the closest sense of what this interface should accomplish.

@edmundedgar I don't quite understand what you mean by a null case for a two's-complement representation. It's my understanding that the integers in [-2^255, 2^255-1] map one-to-one and onto the space of possible EVM words.
@MicahZoltu Duly noted; I've not even considered the possibility of distributions being an output! With that said, the full general case I believe should still be supported, either with clever bit-packing or with indexing the ID space with the market address and outcome index if the numerators are too large for a single word. So yeah, that "value in 4D space" remark would make sense in this case.
This may also bolster changing the result type to an arbitrary-size bytes.
@cag The issue with the null case is simply that the answers to a lot of questions are either a number or "We couldn't decide" or "This question didn't make sense". We hack this with boolean, uint or multiple-selection types by denoting that the final value, 0xffff...ffff, represents "invalid", marginally shrinking the range of numbers we can represent. But if we want to handle negative numbers, that value comes back as "-1", which is an important part of the range you'd normally want to use. So for int types we'd have to either have a different representation for "invalid" depending on the type (e.g. we could use the number right in the middle of the range, representing the smallest possible number in the range) or come up with a different scheme, and there are some arguments for different schemes, like the ability to set a custom range.
Alternatively we could have a different value for "invalid" separately from the result like Augur does, which is probably the correct way to do it, but this creates more complexity.
For now we just decided to drop representation of negative numbers, since probably nobody needs it, and worry about it later.
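The sentinel scheme being discussed can be sketched in a few lines (names assumed; this is the unsigned case where the scheme works, before the signed -1 collision):

```solidity
pragma solidity ^0.4.24;

// Sketch of the "invalid" sentinel described above: the maximal bytes32
// value is reserved to mean "invalid / could not answer". This is fine
// for unsigned interpretations, but collides with -1 for signed ones.
contract InvalidSentinel {
    bytes32 constant INVALID = bytes32(uint256(-1)); // 0xffff...ffff

    function isInvalid(bytes32 answer) public pure returns (bool) {
        return answer == INVALID;
    }
}
```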
I believe there are some cases where the id will not be used. For example, if an oracle contract is created for a single event, it may have no need for an id for that event. I think the design decision to include id makes sense to cover both oracle contracts that cover single events and those that cover multiple events. Would it make sense to standardize what should be passed as the id when it is not used and a contract simply has a result? It could be as simple as "0 is passed in for the id when the id is not applicable, otherwise the function reverts."
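That convention would look roughly like the following sketch (contract name and storage layout are illustrative only):

```solidity
pragma solidity ^0.4.24;

// Sketch of the suggested single-event convention: only id 0 is
// accepted; any other id reverts because it is not applicable here.
contract SingleEventConsumer {
    bytes32 public result;

    function receiveResult(bytes32 id, bytes32 _result) external {
        require(id == bytes32(0)); // id not applicable for single-use oracle
        result = _result;
    }
}
```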
After a few discussions with people, OracleConsumer seems to be the rough consensus terminology for what was previously known as an OracleHandler.
@edmundedgar There is the possibility of just saying that the range of the results given by the oracle is [0, result_granularity), and the oracle consumer or user-facing application could just map that to whatever range those values should represent.
@cwhinfrey I wonder if this clause from the draft covers single-use oracles:
receiveResult MAY revert if the id or result cannot be handled by the handler.
Also, thanks for this implementation! I'll include it into the EIP draft at some point...
Ping @josojo to talk about bytes32 vs bytes as the result type, and to link in the people who want to see this draft incorporate extra data in some way.
Also, I was talking with @InfiniteStyles, and he pointed out to me that bytes currently does not have a native Solidity deserialization method (https://github.com/ethereum/solidity/issues/3876). However, this is actively being addressed: https://github.com/ethereum/solidity/pull/4390
Web3 oracle workshop summary - 23/24 of July
During the oracle workshop organized by the Web3 Foundation in London, we also discussed this EIP. Participants of the discussion were representatives from oraclize.it, Reality Keys, Chainlink, Applied Blockchain, Thomson Reuters, ConsenSys and the Web3 Foundation.
We agreed that such a standard would be beneficial and we should introduce one. However, it seems unlikely that we can find one standard that fits all use cases.
We agreed that the proposed standard
interface OracleHandler {
function receiveResult(bytes32 id, bytes32 result) external;
}
is sufficient in most cases and a very efficient standard. However, if larger data needs to be sent to the OracleHandler, it might be more convenient (and gas efficient) to provide the result as a bytes variable and not as a bytes32 variable. Also, metadata might be required for some oracle solutions; e.g. oraclize.it also provides authenticity proofs. For these cases, we are proposing a second function:
interface OracleHandler {
function receiveResult(bytes32 id, bytes result, bytes metadata) external;
}
The metadata could be handed in as part of the result, but it will be cheaper gas-wise to get it via a second variable than to parse it out of the result every time. However, if metadata is not required, the costs for calling this function are higher with this additional metadata parameter.
This second proposal has the benefit that this interface is more flexible and overcomes many restrictions of the first proposal. Hence, the second proposal is more inclusive and forward-compatible than the bytes32 solution. The additional gas cost for the second proposal should be on the order of only a few hundred gas.
The consensus of the workshop was that the standard should support both methods. The bytes32 result solution was appreciated for its leanness. Other additional data as authenticity proofs could be checked in this setup by other contracts preprocessing the oracle data and only calling the oracleHandler after successful preprocessing. The 2nd proposed solution was appreciated as it is the most inclusive solution. It shines with flexibility and future-compatibility.
The participants also agreed that we need both push and pull oracle interface standards, as both methods have unique selling points. The definitions above are obviously only push interfaces. For the pull interface, we agreed on the proposed standard. Additionally, returning bytes instead of bytes32 might be helpful as well:
interface Oracle {
function resultFor(bytes32 id) external view returns (bytes32 result);
}
interface Oracle {
function resultFor(bytes32 id) external view returns (bytes result);
}
We also mentioned that declaring these functions as view might be a restriction, which is not valid for all use cases.
However, if larger data needs to be sent to the OracleHandler, it might be more convenient (and gas efficient) to provide the result as a bytes variable and not as a bytes32 variable.
I am fine with changing the type of the result to bytes contingent on decode getting implemented in Solidity. I don't expect developers to roll their own encoding/decoding functionality every time they want to implement this standard though, so last call for this EIP is postponed until at least then.
[It] will be cheaper gas wise to get [the metadata] with a second variable than parsing the result every time. However, if metadata is not required, the costs for calling this function are higher with this additional metadata parameter.
This is probably the main trade-off for adding the additional metadata parameter. In this case, is the savings from having to do something like data, metadata = decode(result, (bytes, bytes)) worth defining the extra parameter?

As noted earlier, this is already a tradeoff from just directly interpreting a bytes32 as a uint, but while I can see the use case for that tradeoff, adding the additional metadata parameter seems like it is adding another parameter which most people will ignore, and as is noted:

The metadata could be handed in as a part of the results

Also, pull oracles would have to be modified to either have an additional function metadataFor(bytes32 id) external view returns (bytes metadata) or to have resultFor return a pair of bytes, if both they and this proposal get incorporated into this standard.
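The decode-it-yourself alternative being weighed here could look like the following sketch (assumes abi.decode, which was expected to land around Solidity 0.5; contract and variable names are hypothetical):

```solidity
pragma solidity ^0.5.0;

// Sketch of a consumer unpacking metadata from a single bytes result,
// rather than receiving it as a separate parameter.
contract DecodingConsumer {
    bytes public data;
    bytes public metadata;

    function receiveResult(bytes32, bytes memory result) public {
        // the oracle is assumed to have used abi.encode(data, metadata)
        (data, metadata) = abi.decode(result, (bytes, bytes));
    }
}
```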
Ping @D-Nice to discuss this more.
Also, I want to note that single-use oracles, like what @cwhinfrey proposed earlier in this thread, would already ignore the id parameter, so taking the previous argument further might mean that oracle consumers should just expect to receiveReport(bytes report) from an oracle, and then the consumer would just decide whether or not to accept the report and what to do with it. This may complicate matters for pull oracles though (like how would a specific report be selected?).
It seems like there is a lot of demand for pull oracles.
One of my concerns in potentially adding pull oracles to the standard is the possibility of fragmenting the ecosystem for standard oracle consumers. What this means, practically speaking, is that oracle consumers may have to support both waiting to hear the result from an oracle and reaching out to an oracle and pulling the result from the oracle.
If we move the burden of implementation onto the oracle, then we'd have to pick a single interface to standardize around. If the push interface is chosen, then all pull oracles would also have to have some sort of function reportTo(OracleConsumer recipient, bytes32 id), but if a pull interface is chosen, then the oracle has to be a contract (no more ordinary accounts being oracles).
In any case, it's possible to specify both to some level of mandatory-ness, but then, there are a few scenarios:
In some of the pros and cons listed above, standard is emphasized because I anticipate the use of this standard would very much include adapters for existing oracles, and for those oracles, another contract containing code to adapt the existing oracle to either a mandatory pull or push interface would have to be deployed anyway.
Anyway, if there is enough demand for also including the pull interface into the standard, my vote is for scenario (3).
We also mentioned that declaring these functions as view might be a restriction, which is not valid for all use cases.
Curious as to what those use cases may be...
@josojo thanks for summarizing. I agree with @cag on the metadata point. I actually thought that the majority at the meeting were in favor of leaving it off, although there were certainly a couple in favor of it. We discussed bytes being a flexible type that could be decoded manually for now, and possibly automatically if abi.decode lands in Solidity v0.5. Given its flexibility, metadata could easily be included in the result parameter.
As discussed, I put together some numbers around gas costs for including the extra parameter as opposed to parsing it. Full disclosure: I've seen some fuzziness in truffle gas estimations in the past, but it has been off on the order of 10s, not 100s. (PRs welcome!) Based on those examples, it looks like including an extra bytes type metadata parameter is actually roughly +600 gas per request. Alternatively, parsing a single bytes array into two bytes arrays is ~300 gas.
Not only is parsing bytes cheaper, but I think it is more fair to the users of the standard. A required metadata parameter would raise the cost of using the standard for all that do not need it, as opposed to putting the (lesser) cost on those that choose to use it.
Regarding the workshop, I do not think it is fair to assume the majority were against it. I only noted one party against it, two for it (Oraclize being one of those), and the other two parties either indifferent or abstaining from it, albeit one of the abstaining has a definite usecase for the metadata as well, at least if I followed their presentation correctly.
@se3000 thanks for working on the gas cost tests and bringing them forward. I have expanded on your work with some more comprehensive test cases, and slightly more realistic results/metadata, in most cases I think we can agree that the elements won't be a single byte for both. The link to them is here: https://github.com/D-Nice/gas-tests
To give a quick overview, it replaces the single-byte result + metadata assumption with two identical IPFS multihashes (most Oraclize proof bytes are IPFS multihashes, and it was convenient to use one as the result for something greater than a single word size). From our expanded tests, even with the most ideal and unrealistic parsing solution, where we expect a consistently formatted array of 2 elements of 2 words each, the parsing costs ~300 more gas than just using result + metadata (proper non-naive parsing would of course amplify this much more). The increased cost for non-metadata-utilizing Oracles is more consistent at 600 gas; however, they also have the option of using the bytes32 single-element variant as an alternative, while metadata Oracles have no alternative, hence us requesting two bytes parameters.
You can check the tests for more specificity, but at no point do metadata Oracles gain any competitive advantage over their non-metadata counterparts. In fact, we'd be putting ourselves at a disadvantage by going with the single bytes solution, as it would cost ~3000 more gas for us than for non-metadata Oracles (this doesn't even account for the additional computations that may be needed to handle or store certain proof types), whilst the result + metadata solution saves us at least ~300 gas, and non-metadata Oracles will still be more efficient by ~2000 gas even with this implementation. If a compromise can't be made over this, then maybe our groups are not ready for a standard.
I will look to list the Pros and Cons I see of the result + metadata solution.

PROS:

- more efficient for metadata Oracles utilizing it (metadata Oracles as is will have higher on-chain costs to use due to this feature, thereby non-metadata Oracles still retain an on-chain efficiency advantage).
- allows for results to be consistent across any Oracle type (if every Oracle provides their result completely differently, there is no point in a formal standard, and someone might as well just come up with a wrapper to try uniting production Oracles when they need them).
- allows for extensibility and future proofing for ABI decoding and other features (non-metadata Oracles may eventually find some useful on-chain verification mechanism to include in the metadata, encoded ABI bytes could be defined in the metadata parameter, signatures for the included data can be included).

CONS:

- sunken cost for non-metadata Oracles (they do have an alternative with utilizing the even more efficient bytes32 alternative if that is a worry).
- awkward unutilized field for certain non-metadata Oracles
Of course, please do assume I am being biased, and feel free to populate the PROS and CONS as you may see them.
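For reference, here is a sketch of the two interface shapes being compared in this thread (the interface names are illustrative only and not part of the draft):

```solidity
pragma solidity ^0.5.0;

// Variant A: separate result and metadata parameters.
// Metadata-using oracles pass their proof in `metadata`;
// non-metadata oracles must pass an empty bytes array on every call.
interface OracleConsumerWithMetadata {
    function receiveResult(bytes32 id, bytes calldata result, bytes calldata metadata) external;
}

// Variant B: a single opaque parameter. Oracles that need metadata
// must pack it into `result` themselves (e.g. as an ABI-encoded pair)
// and consumers must parse it back out.
interface OracleConsumerSingleParam {
    function receiveResult(bytes32 id, bytes calldata result) external;
}
```

The gas numbers above are about the cost of that packing and parsing in Variant B versus the extra empty parameter in Variant A.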
I tried golf-gas-testing the proposed metadata addition. I wrote the following test contracts:
https://gist.github.com/cag/ca2b2046c75bd1b001e45c16e3890226
I also used the following test data:
Input type | Test contents |
---|---|
Single word result | 0x6c65726e65726372616d626f73616e646c61726b6f6662616279626f6f6d6572 |
Dynamic result (3 words/96B) | 0x0000000000000000000000000000000000000000000000000000000000000060 64656361676f6e73756e617573706963696f7573636f7272656374696f6e636f77796f7574736d617274696e67726569737375696e6762726f6f6d6c696b65646164726f636b646976696e6973656861737479686561726b656e656468696e74 |
Single word meta | 0xc9f5e4290844e94101ba068b4858d916e0f10cb7cf1fbbe4f3a1f286fed0a9da |
Dynamic metadata (rsv-length/65B) | 0x0000000000000000000000000000000000000000000000000000000000000041 ebb1ee12c4c189b3ea9cc3c6ef6e0a07772e20518045ba5a1c61c280897e51e71f2cd48ec806af44f509872db983b6d4f59af61ca55534ee155ee5b246af6d37aa |
This led to the following results:
Scenario | Single param TX cost | Double param TX cost | Winner | Delta |
---|---|---|---|---|
Single word result, no metadata | 25165 | 25555 | Single | 390 |
Single word result, single word metadata | 27654 | 28108 | Single | 454 |
Single word result, dynamic metadata | 31902 | 31309 | Double | 593 |
Dynamic result, no metadata | 30598 | 30988 | Single | 390 |
Dynamic result, single word metadata | 33785 | 33550 | Double | 235 |
Dynamic result, dynamic metadata | 37870 | 36742 | Double | 1128 |
Some of the contracts I've written can probably be golfed more, but yeah, check it out.
@D-Nice it looks like the main difference you're seeing when comparing the price with metadata to the price without metadata is due to the gas pricing of Ethereum, not the difference in proposed interfaces. Typically in Ethereum more usage costs more, and that's always the case when sending data in a transaction. Changing that is probably outside the scope of this PR, so I think it'd be more insightful to stick to 1:1 comparisons.
> [Pro: metadata] allows for results to be consistent across any Oracle type
Metadata is inconsistent and oracle specific. I think this would make responses less consistent than if metadata were handled before reporting the result to the oracle. More on this later.
> [Pro: metadata] allows for extensibility and future proofing for ABI decoding and other features
How so? There would be another field. What about that is more or less future proof for ABI decoding? My example already decodes 2 `bytes` arrays, and additional optional fields could easily be added.
> [Con: metadata] is a sunken cost for non-metadata Oracles
This is not a sunk cost. It is an ongoing cost that is placed on non-metadata oracles on every request they send. If anything it seems more like an externalized cost.
> they do have an alternative with utilizing the even more efficient bytes32 alternative
The method we're discussing is being added to handle requests that need more than 32 bytes of data. This alternative will not work.
@cag thanks for checking the numbers, very insightful. I'd previously assumed that it was a fixed cost because I'm only assigning pointers to preallocated data, but it looks like there's a gas cost in there that I'm missing, and sending more data raises the parsing cost. I dug into the "Dynamic result, dynamic metadata" example and couldn't help but golf, because who doesn't love a good optimization? I wrote an alternate version that moves pointers; the parsing cost is about half of your example. Notably, as more data is passed in, the cost grows more quickly when allocating new memory than when moving pointers. Doubling the length of both inputs consumes ~1200 more gas with new memory, as opposed to ~300 more with pointers.
Gas details aside, I continue to believe that the cost of optional fields should be put on the people using the fields, not externalized on the non-users.
Taking a step back, there was a question only briefly touched on at the workshop, on which I think it'd be helpful to hear from the wider community: should an Oracle send data that it can see is false? It seems to me that it should be the Oracle's responsibility to ensure the data they are sending is correct, so if there is a proof to run on-chain, they should verify it before passing along data.
If the Oracle doesn't process the metadata, and pushes that work on to the consumer, then the Consumer loses the interoperability that this standard aims to achieve. Hypothetically, Chainlink decides to use metadata and sends the m-of-n ratio of oracles reported over requested. A Consumer wants to use both Chainlink and Oraclize interchangeably. They now have to deploy both Oraclize metadata checking logic and Chainlink metadata checking logic. Apart from the inefficient duplication of contract code in every Consumer Contract, what happens if a third metadata oracle shows up? The Consumer Contract has to be redeployed, or they can't use the new oracle. Same thing if one oracle updates their metadata format/verification. Obviously redeployment is non-trivial for many dapps. Also, how does the consumer differentiate which type of metadata they're receiving? Maybe we should add a metametadata field? 😉
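The duplication described above could look roughly like this (a purely hypothetical consumer; the oracle addresses and verification functions are placeholders, not real Chainlink or Oraclize code):

```solidity
pragma solidity ^0.5.0;

// Hypothetical consumer that accepts reports from two metadata oracles.
// Each oracle needs its own verification logic, and the contract must be
// redeployed if either format changes or a third oracle appears.
contract MultiOracleConsumer {
    address constant ORACLE_A = 0x1111111111111111111111111111111111111111;
    address constant ORACLE_B = 0x2222222222222222222222222222222222222222;

    function receiveResult(bytes32 id, bytes calldata result, bytes calldata metadata) external {
        if (msg.sender == ORACLE_A) {
            require(verifyProofA(metadata), "bad oracle A proof");
        } else if (msg.sender == ORACLE_B) {
            require(verifyProofB(metadata), "bad oracle B proof");
        } else {
            revert("unknown oracle");
        }
        // ... handle result ...
    }

    // Oracle-specific verification routines; placeholders here.
    function verifyProofA(bytes memory metadata) internal pure returns (bool) {
        return metadata.length > 0; // placeholder check
    }

    function verifyProofB(bytes memory metadata) internal pure returns (bool) {
        return metadata.length > 0; // placeholder check
    }
}
```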
Metadata seems intrinsically proprietary, or at the very least non-standardized. For this reason, it seems like metadata should be handled by the oracle before reaching the consumer, or by some higher level on-chain proxy. Consumers should only have to concern themselves with receiving the data and handling it, otherwise they lose interoperability and become dependent on a limited number of parties.
@se3000 Thanks for the optimization! I see that it uses the same implementation strategy as what you've originally posted (my derp). I'll update the table accordingly...
With that said, it seems the dominant gas cost comes from the size of the data itself, and not from the parsing of said data.
> If the Oracle doesn't process the metadata, and pushes that work on to the consumer, then the Consumer loses the interoperability that this standard aims to achieve.
Arguably, this interoperability does not exist in the first place, as the result format isn't specified.
> should an Oracle send data that it can see is false?
Right now, the spec indicates:
> receiveResult MUST revert if receiveResult has been called with the same id before.
In line with this requirement, I vote that any adapters for systems using proof metadata should verify the proof before making a report.
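Such an adapter might look roughly like the following sketch (the contract name, `report` entry point, and verification hook are all illustrative; nothing here is mandated by the draft):

```solidity
pragma solidity ^0.5.0;

interface OracleConsumer {
    function receiveResult(bytes32 id, bytes calldata result) external;
}

// Hypothetical adapter: verifies the oracle-specific proof metadata
// on-chain, then forwards only the bare result to the consumer, so the
// consumer never has to know about this oracle's metadata format.
contract ProofCheckingAdapter {
    address public oracle;
    OracleConsumer public consumer;

    constructor(address _oracle, OracleConsumer _consumer) public {
        oracle = _oracle;
        consumer = _consumer;
    }

    function report(bytes32 id, bytes calldata result, bytes calldata metadata) external {
        require(msg.sender == oracle, "not the oracle");
        require(verifyProof(result, metadata), "invalid proof");
        consumer.receiveResult(id, result);
    }

    // Oracle-specific verification; a placeholder here.
    function verifyProof(bytes memory result, bytes memory metadata) internal pure returns (bool) {
        return result.length > 0 && metadata.length > 0; // placeholder check
    }
}
```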
That said, I do see a potential use case for removing the aforementioned requirement from the spec. What if the consumer needs to punish entities making wrong reports about a subject? That's a case in which the consumer would have to receive the proof metadata as well as report on the same id multiple times.
Still, I also don't see mechanisms for ensuring data quality (e.g. punitive measures) as the oracle consumer's responsibility, but rather a topic which the oracle should address.
In general, any oracle system which requires additional on-chain processing of metadata should justify the inclusion of that cost by simply producing better data. Oracle consumers will want the best data, and the gas cost differences between parsing one or two `bytes` parameters will pale in comparison with the gas cost of processing proof metadata and its value proposition of delivering better data.
Anyway, I'm still not sold on adding the `metadata` parameter.
@se3000 A small note about your optimization:
```solidity
contract DynamicDynMetaConsumer {
    event LogStuff(bytes b, bytes m);

    function receiveResult(bytes res) external {
        bytes memory resCopy = res; // <- See this line
        bytes memory b;
        bytes memory m;
        assembly {
            b := add(resCopy, 0x20)
            m := add(add(b, 0x20), mload(b))
        }
        emit LogStuff(b, m);
    }
}
```
The latest version of Solidity does not automatically copy dynamic types from calldata to memory for a given symbol, instead leaving this up to the contract writer. Still, pointer moving after the data has been copied is the most efficient way I've seen of handling this case.
I'd also like to remark that the single word cases for both results and metadata readily generalize to fixed format cases, in that their efficiency only relies on the fact that the memory layout in calldata is completely known up front.
> I'd also like to remark that the single word cases for both results and metadata readily generalize to fixed format cases, in that their efficiency only relies on the fact that the memory layout in calldata is completely known up front.
Great point. In the workshop, I believe that we pretty much unanimously agreed that the format is determined by the request, and that format is fixed for the response after the request. This pretty much lines up with fixed format, and I'd propose we stick to fixed format for the discussion at the moment. I'm open to non-fixed format, but don't see a burning need yet. And, it may be easier to sort out once we've nailed down some of the other moving pieces.
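Under that assumption, fixed-format parsing is trivial on the consumer side. For example (a hypothetical consumer that requested a price/timestamp pair; not part of the draft):

```solidity
pragma solidity ^0.5.0;

// Sketch: the request fixed the response format as two 32-byte words
// (a price and a timestamp), so the consumer decodes at known offsets
// with the standard `abi.decode` builtin and no metadata is involved.
contract FixedFormatConsumer {
    uint256 public price;
    uint256 public updatedAt;

    function receiveResult(bytes32 /* id */, bytes memory result) public {
        require(result.length == 64, "unexpected format");
        (price, updatedAt) = abi.decode(result, (uint256, uint256));
    }
}
```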
Hello, I'm a bit late, but I recently discovered this EIP and got interested in it. I'm basically building an Oracle system which aims at providing the results of offchain computation to the blockchain. There is a big protocol to ensure correctness of the results, and I want to make it EIP 1154 compliant :)
At the end of the protocol, a `finalize` function is called. This is the point where the result becomes available (through `resultFor`). It is also the point where we can do a callback (if the user requested it) using the `receiveResult` API. To avoid a user being able to deny a `finalize`, I have to make sure that the call to `receiveResult` doesn't revert the `finalize` under any circumstances.
My code is as follows:
```solidity
function finalize(...) {
    // ...
    if (callbackTarget != address(0))
    {
        /**
         * Call does not revert if the target smart contract is incompatible or reverts
         *
         * ATTENTION!
         * This call is dangerous and target smart contract can charge the stack.
         * Assume invalid state after the call.
         * See: https://solidity.readthedocs.io/en/develop/types.html#members-of-addresses
         *
         * TODO: gas provided?
         */
        require(gasleft() > 100000);
        callbackTarget.call.gas(100000)(abi.encodeWithSignature(
            "receiveResult(bytes32,bytes)",
            _taskid,
            _results
        ));
    }
}
```
The 100000 gas value has been set arbitrarily. I was wondering if we should enforce an upper bound on the gas consumed by this function. Without such a bound, my application would really struggle :/
@Amxx I would say that an upper gas bound might not be a bad idea.
Regarding this EIP draft, I've been increasingly questioning whether there is fundamentally any value in making this EIP. The issue is that any value gained from the standard would have to come through some form of interoperability, yet this EIP doesn't actually provide said interoperability.
The only thing this EIP actually specifies is that there is a function (`resultFor`) which may be used to receive a result for something identified by an ID word. I'll reiterate what @edmundedgar said earlier in this thread about standardizing the structure of `result`:
> ...Basically nobody else is involved in structuring information (as opposed to structuring where information comes from), so in practice even if we "standardize" it nobody except us will be using the "standard", so probably better to keep it out of a process for now.
This opaque result parameter whose structure we can't specify, but whose value comes in part from knowing its structure... if we continue further along trying to implement this, we may reinvent something like the Ethereum ABI codec. This, paired with some bytes used to identify this information...
If the consequence of this EIP would be that adapters between systems which implement this have to be created anyway, then we might as well go to plain function calls. It seems wasteful to specify something only to force proprietary interpretations of data on oracles anyway, and have silly things like putting ABI encoded stuff in data only to unwrap it and reinterpret it before putting it in a different format. At that point, just have the proprietary endpoint and use plain function calls.
Anyway, I feel now that opening this was a mistake, so if there are no objections, let's close this EIP. I dunno if there is a sort of last call procedure for something like this though.
As an implementer of this EIP I really don't think this was a mistake. I might be the only one providing this interface but I'll continue doing it.
It's the end user's job to know which data type they asked for, and to decode the resulting bytes accordingly!
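In other words, the decoding side can be as simple as the following sketch (a hypothetical consumer that requested a string answer; `abi.decode` is the standard Solidity builtin):

```solidity
pragma solidity ^0.5.0;

// Hypothetical consumer: it requested a string answer under some id,
// so it decodes the opaque bytes as a string when the report arrives.
// The requester chose the type at request time; no metadata is needed.
contract TypedConsumer {
    string public latestAnswer;

    function receiveResult(bytes32 /* id */, bytes memory result) public {
        latestAnswer = abi.decode(result, (string));
    }
}
```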
This is the official discussions thread for EIP #1154. The draft can be read here.