Open lealife opened 6 years ago
Considerations:
Just so I understand this task well, I am supposed to implement a way that the op_return text can be decoded for the user on the frontend? Similar to this BTC implementation... Right?
Can someone help me narrow down the possible text encodings I should look into? So far I've got hex, but it seems like other encodings are being used too.
That is why this is a bit of a research project. There's UTF-8, ASCII, ANSI, etc.
To start, my suggestion is to look into how sites like https://cryptograffiti.info/ approach this (source: https://github.com/1Hyena/cryptograffiti).
Of course, the OP_RETURN data can be anything, so there's no way to know if you're decoding it right or if there is even a proper decoding. My suggestion is to look into golang libraries that aim to solve this problem for us. I feel strongly that this is not a problem we should attempt to solve ourselves. Surely there are libraries that scan binary files for recognizable data, which may be text or binary files like images and archives.
I'm thinking along the lines of:
Then there are the considerations of doing everything efficiently (memory and CPU).
Also, I suggest approaching this task in pieces. Perhaps look at the content detection / decoding issue first, then worry about the front end stuff later.
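As a rough illustration of the content-detection piece, here is a minimal sketch using only the Go standard library. The classify helper and its three labels are made up for this example and are not anything in dcrdata. http.DetectContentType sniffs at most the first 512 bytes, so the cost stays small for OP_RETURN-sized data.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"unicode/utf8"
)

// classify is a hypothetical helper (not part of dcrdata) that makes a
// best-effort guess at what a nulldata push contains.
func classify(data []byte) string {
	if len(data) == 0 {
		return "empty"
	}
	// DetectContentType looks for known magic numbers (JPEG, PNG, zip, ...)
	// and falls back to a text/binary heuristic.
	ct := http.DetectContentType(data)
	switch {
	case strings.HasPrefix(ct, "image/"):
		return "image"
	case strings.HasPrefix(ct, "text/") && utf8.Valid(data):
		return "text"
	default:
		return "binary"
	}
}

func main() {
	fmt.Println(classify([]byte("Hello, Decred.")))
	fmt.Println(classify([]byte{0xFF, 0xD8, 0xFF, 0xE0})) // JPEG magic bytes
	fmt.Println(classify([]byte{0x00, 0x01, 0x02}))
}
```

This only says what the bytes look like. It cannot tell whether text was actually intended, which is the caveat raised above.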
Taking into consideration that OP_RETURN does not allow huge data sizes to be stored, I am giving top priority to decoding all of its data into sensible UTF-8 text. I will look into file decoding next.
I think there is no need to detect text encodings. DCRData should define one encoding (UTF-8) as the standard, and all users must follow it. Otherwise it will get complicated and messy! It should be a simple problem.
Well, how should we enforce this rule? OP_RETURN data can be anything at all. Could be jpeg data. This entire issue is purely amusement.
Hey everyone,
My two cents: make it a byte limit of 260 bytes or something like that. If someone can make a multi-input/output or multisig with 500 bytes, a simpler tx in terms of outputs with a 260 byte op_code is ok (making the whole tx about ~500 bytes).
Litecoin has 40 bytes, a holdover from 2014. We made TradeLayer tx codes fit into that. Bitcoin moved to 80 bytes. Might as well be a little bit experimental for other apps that don't have the same rigor applied, idk. I like that we had to limbo in terms of design, but some things are counter-productive. For example, we add outputs for "reference addresses" that are then the destination for whatever is encoded in the OP_Return payload. It doesn't have to encode the address itself. The output itself adds about 30 bytes or so, so comparable to jamming the text into a big OP_Return payload. However long-term the savings are net-negative as each output must be redeemed with a ~120 byte signature. Segwit helps there. These little savings trade-offs may not matter all that much in the long-run. Let people try bigger OP_Codes why not.
@chappjc, is this still required?
I don't care, personally. But it has never been completed. It was started in https://github.com/decred/dcrdata/pull/700 with a more open-ended decoding, which would probably end up being a DoS vector, but abandoned. Then in https://github.com/decred/dcrdata/pull/934 with an ultra-simplistic approach that treats the data as utf-8 bytes.
You're free to tackle this, but:
explorer/types (not api/types or db/dbtype), and ideally the code just to the internal/explorer package. API consumers don't get this guesswork result, only web page views, since it's almost always meaningless, redundant, and always trivia. Also, it has nothing to do with DB.
Just see what happens with a dumb utf8 interpretation of the nulldata push.
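For reference, a "dumb utf8 interpretation" could look roughly like the sketch below. The decodeNullData name and the printable-only rule are assumptions made for illustration, not the approach taken in either PR. It takes the raw pushed bytes and returns a display string only when they are valid UTF-8 and every rune is printable.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// decodeNullData is a hypothetical helper: it reports whether the pushed
// bytes can be shown as text on a web page, and returns that text if so.
func decodeNullData(push []byte) (string, bool) {
	if len(push) == 0 || !utf8.Valid(push) {
		return "", false
	}
	s := string(push)
	for _, r := range s {
		// IsPrint rejects control characters, including newlines and tabs;
		// the only space character it accepts is the ASCII space.
		if !unicode.IsPrint(r) {
			return "", false
		}
	}
	return s, true
}

func main() {
	if s, ok := decodeNullData([]byte("Hello, Decred.")); ok {
		fmt.Println(s)
	}
	_, ok := decodeNullData([]byte{0xFF, 0xD8, 0xFF}) // not valid UTF-8
	fmt.Println(ok)
}
```

Rejecting anything non-printable keeps arbitrary binary from being rendered as garbage, at the cost of hiding multi-line text.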
I have created a tx and added an OP_RETURN output at https://testnet.dcrdata.org/tx/92e6e6e7d2877e105435787fd73ec572cc85e3e620332cbfd65478c6c20aa0e2
Decoding the hex data 48656c6c6f2c204465637265642e to a string (UTF-8) gives "Hello, Decred."
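The decoding can be reproduced with a few lines of Go. This is only a sketch: hexToText is a name made up for this example, and it assumes the input is just the pushed data in hex, without the OP_RETURN opcode or push-length bytes.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// hexToText is a hypothetical helper that decodes a hex string and
// interprets the resulting bytes as text without further validation.
func hexToText(h string) (string, error) {
	b, err := hex.DecodeString(h)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := hexToText("48656c6c6f2c204465637265642e")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(s) // prints "Hello, Decred."
}
```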
Maybe the decoded data can be shown on this page
reference: https://www.blocktrail.com/tBCC/tx/3bd425901bb4ddf2684e1bb85a5b65714a5835d78addf7b01c3aa674bacd8a4c