Closed donnlee closed 1 month ago
Thank you for reporting this. There was an endianness issue with parsing the input count in a tx, which reared its ugly head with a very large set of inputs.
Released version 1.4.5 which fixes this issue. Let me know if you run into any further problems.
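For anyone landing here later, a minimal sketch of how a byte-order slip can produce this kind of failure. It assumes the input count is stored in Bitcoin's CompactSize form (a 0xfd marker followed by a 2-byte little-endian integer once the count exceeds 252). The byte values are made up for illustration, and this is not the library's actual parsing code.

```javascript
// Hypothetical illustration only; not the library's parsing code.
// A tx with 300 inputs serializes its input count as: fd 2c 01
const countBytes = Uint8Array.from([0x2c, 0x01]) // the two bytes after the 0xfd marker
const view = new DataView(countBytes.buffer)

const littleEndian = view.getUint16(0, true)  // correct byte order: 0x012c
const bigEndian = view.getUint16(0, false)    // wrong byte order:   0x2c01

console.log(littleEndian) // 300
console.log(bigEndian)    // 11265
```

Counts of 252 or fewer fit in a single byte, so byte order cannot matter there, which would fit the bug only showing up on transactions with many inputs.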
You rule. All is good now with 1.4.5. Thank you for the awesome lib!
Oh, I might have found another one. Error report coming.
Getting similar error when i try to decode a script with:
const script = Script.decode(scriptAsHexString)
txid: 9b273c9880fcfa8da4cf5d202710b546f9c2f67b1a203d58aefd97964100ba72
Error:
file:///home/donn/workspace/gitlab.com/proj/inscribe/node_modules/@cmdcode/tapscript/dist/module.mjs:1069
throw new Error(`Size greater than stream: ${size} > ${this.size}`);
^
Error: Size greater than stream: 4113302305 > 74
at Stream.peek (file:///home/donn/workspace/gitlab.com/proj/inscribe/node_modules/@cmdcode/tapscript/dist/module.mjs:1069:19)
at Stream.read (file:///home/donn/workspace/gitlab.com/proj/inscribe/node_modules/@cmdcode/tapscript/dist/module.mjs:1075:28)
at decodeWords (file:///home/donn/workspace/gitlab.com/proj/inscribe/node_modules/@cmdcode/tapscript/dist/module.mjs:1409:35)
at Object.decodeScript [as decode] (file:///home/donn/workspace/gitlab.com/proj/inscribe/node_modules/@cmdcode/tapscript/dist/module.mjs:1378:12)
at main (file:///home/donn/workspace/gitlab.com/proj/inscribe/hello_repro_decode_txn_error_for_publishing_to_issue_decode_script.js:34:25)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Node.js v19.9.0
In that tx, scriptAsHexString is:
4e21032cf515e2ae74b6d639c4e91a0b2a7047a0178e628d167989af46d1da474a6951ad21022d6b369d9a95568203b2a51eac49cc8b20ab81930e2d2565df5f0bc8e3bf59d5ac73640380ca00b268
this is the raw tx:
020000000001044a09bb8a75db2e360b8ef45e67b00e4e81864682c86d68b432f72123341b1146000000002322002042ac202bdb3601126e3325c2cabb142274da06209f47eb595a4c3e587fe8fe5bfdffffff2c1abe84242293c61441fa7cb5dd10917764b6bd00e0c5b0c97a03eee019f931180000002322002003cfb25e10b99f9583e16b826038cc0626abc09317fb161d74a54e39bea8b0d7fdffffff09c2ebf90f94e3638de23ef07d2130df43a1301c44d4aed7fd572f46857f1da2000000002322002042ac202bdb3601126e3325c2cabb142274da06209f47eb595a4c3e587fe8fe5bfdffffff2c1abe84242293c61441fa7cb5dd10917764b6bd00e0c5b0c97a03eee019f9310d0000002322002003cfb25e10b99f9583e16b826038cc0626abc09317fb161d74a54e39bea8b0d7fdffffff01bfdd94000000000017a91416d4ccd3475f6754c8e143762da83bdaaa8afb80870347304402201259511741fc8262c509475f7746ab607f3d11bbf33f22b87beec527e159b8130220370f0e3487cc72ce7387d255d40eda26d498a7079a4afe32ec749b4ae102027201473044022034aa7d0524c8fee7228a75880489b13239dbb9af4854aa30ba4e66fe2ebc648a02206f7bec4d914ddf6f03159de8253f380b07a98b09b4e0ed5f1d9f1ffca771d02b014e21032cf515e2ae74b6d639c4e91a0b2a7047a0178e628d167989af46d1da474a6951ad21022d6b369d9a95568203b2a51eac49cc8b20ab81930e2d2565df5f0bc8e3bf59d5ac73640380ca00b26803473044022042584b95f101f8e09081ec8a4ce103b6a160ee144621cc1ec44300e199bf53d702206ca84eda2b54ece36274852aa8a221b1a7e109af0c85e853576134ac5b19707001473044022052d312808d7c0f79353f8f06b4bde16356033db8a29fbc422a4270ef7b6bca1f022037776cb91a73f9f4e82b8758331d04b116b2d156e9cba8b13032ad2c28c0ea08014e2102c58d36a062a688338cce2cd768e8158d13f589a0cfe76cff772ef545133f7635ad210313f87616d06f8345a2ae67e865824229f0ea2808dd48fc5359d75154e3116829ac73640380ca00b2680347304402202f1995ef920e717b5e58c1f01e1397645e2be3d77d5d87d235147539848a230e0220562b7419f770c051dbe6e9aa8b67cbe1c67571e7ade8514bc5290990687a4cce014730440220346bc4f853c2de23b8d224440e87def9b95c99500c88c2d45715428128d9555402203fe5d447e4031c9ded575a82a25606ca57a176445e3decfb0fe082c1e446bbe8014e21032cf515e2ae74b6d639c4e91a0b2a7047a0178e628d167989af46d1da474a6951ad21022d6b369d9a95568203b2a51eac49cc8b20ab81930e2d2565df
5f0bc8e3bf59d5ac73640380ca00b26803473044022001556d2f42d3d03a3f155470f5696b8533ff5ad7c937d0b8dfc0c36fba74a1520220329af99245479c3bb60d547b254f4df8b96bf1e7c0658ac8789543379eb51f870147304402202d05e07133f7fb7def6011677588e4ca0917d180b6c80036cac22111c29eef6e02203af8fa8a4b12a380c8e42e51dece0867c1aec1aa613a731ecb7b2116d8356ef3014e2102c58d36a062a688338cce2cd768e8158d13f589a0cfe76cff772ef545133f7635ad210313f87616d06f8345a2ae67e865824229f0ea2808dd48fc5359d75154e3116829ac73640380ca00b26843af0c00
This is my repro script. I don't know how to isolate the witness script in fewer steps. So pls tell me if this is wrong.
import { Script, Tx } from "@cmdcode/tapscript";
import * as lib_bitcoin_rpc from './lib_bitcoin_rpc.js'
import * as lib_hexutils from './lib_hexutils.js'

async function main() {
  let tx

  // The following txid causes a decode script crash.
  const txid = '9b273c9880fcfa8da4cf5d202710b546f9c2f67b1a203d58aefd97964100ba72'
  // The following txid does not cause a crash.
  //const txid = '314517d30f170fe74e39aeea8f85246330184900009ba5ea6b3a8d2c080746fe'

  const { result: rawTxn } = await lib_bitcoin_rpc.getRawTransaction(txid) // Get as hexstring.
  tx = Tx.decode(rawTxn)
  console.log('Tx decode done.');
  console.log(rawTxn);

  const inputs = tx.vin
  const input = inputs[0]
  console.log(input);

  const witWithNamedValues = Tx.util.readWitness(input.witness)
  console.log('witWithNamedValues:', witWithNamedValues);

  let scriptAsHexString
  if (witWithNamedValues.script) {
    const scriptUint8arr = witWithNamedValues.script
    scriptAsHexString = lib_hexutils.uint8arrayToHexstring(scriptUint8arr)
    console.log('scriptAsHexString:', scriptAsHexString);
  }

  const script = Script.decode(scriptAsHexString)
  console.log('Done.');
}

main()
The 4e in front of the script is a size byte. If you drop that byte, the script will parse correctly.
I did update Script.decode() in v1.4.6 so that if you pass true as a second parameter, it should parse the script with the 4e size byte. You can use this boolean to turn the size byte parsing on or off.
Let me know if this works. :-)
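A short sketch of why that leading 4e seems to produce the huge number in the error above. It assumes the decoder treats 0x4e as OP_PUSHDATA4, as standard Bitcoin script does, and reads the next 4 bytes as a little-endian push length. The bytes are the first five of the script posted in this thread.

```javascript
// First five bytes of the witness script from this thread, size byte included.
const head = Uint8Array.from([0x4e, 0x21, 0x03, 0x2c, 0xf5])

// Assumption: a leading 0x4e is decoded as OP_PUSHDATA4, whose push
// length is the next 4 bytes in little-endian order.
const OP_PUSHDATA4 = 0x4e
const pushLength = new DataView(head.buffer).getUint32(1, true)

console.log(head[0] === OP_PUSHDATA4) // true
console.log(pushLength)               // 4113302305
```

That matches "Size greater than stream: 4113302305 > 74". The size byte 0x4e is 78 in decimal, which is the length of the script that follows it, so the hex string is 79 bytes in total, and 74 bytes remain after the opcode and the 4 length bytes are consumed.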
Ah ha! mempool.space doesn't show the size byte, so now I know I need to compare to the raw script in hex. Works great with Script.decode(foo, true). Thank you @cmdruid!
Hmmm, I'm getting a lot of these errors now:
--- blk 831442: txid cb3a8aa2ea68dd284c4902e502a2e0bf72c721c7dc23c7c4658b84591d1e3682
Error: script decode error: Error: Varint does not match stream size: 3380 !== 3382
Wondering if https://github.com/cmdruid/tapscript/commit/ebba2f5c86b22cca12d61efb530c4b67da0a0cd7 is causing this. It may be that sometimes the script hexstring contains the size byte and other times not. Will investigate.
ChatGPT said the size byte is encoded a different way if the size is >252
The general format for encoding the length of the witness script in hex is as follows:
If the length is 0 to 252 (0xfc in hex), it is represented directly as a single byte.
Example: If the witness script length is 42, it is encoded as 0x2a.
If the length is 253 to 65,535 (0xfd to 0xffff), it is represented as 0xfd followed by a 2-byte little-endian integer.
Example: If the witness script length is 500, it is encoded as 0xfd, 0xf4, 0x01 (little-endian representation of 500).
I see this in this txid, which is throwing Varint does not match stream size: 501 !== 503:
b5cadf0f746b6a899de64e21e938a519015b6f12b25d5ce9d0b37931c22da5ce
And yes, this script begins with fd:
fdf50120cf2f6edeef046f8ae6c2a6d4306dffef00b5250108d0dfda3d383fcd2c638d6cac0063036f7264010117746578742f68746d6c3b636861727365743d7574662d38004dae013c21444f43545950452068746d6c3e0a3c68746d6c206c616e673d22656e223e0a3c686561643e0a20203c6d65746120636861727365743d225554462d3822202f3e0a20203c6d657461206e616d653d2276696577706f72742220636f6e74656e743d2277696474683d6465766963652d77696474682c20696e697469616c2d7363616c653d312e3022202f3e0a20203c7469746c653e416273747261637420417274202d2041746f6d6963204d6f64656c206279206f726442616e6b73793c2f7469746c653e0a3c2f686561643e0a3c626f6479207374796c653d226d617267696e3a20307078223e0a20203c6469763e0a202020203c696672616d65207374796c653d2277696474683a313030253b206865696768743a31303076683b206d617267696e3a3070783b20626f726465723a6e6f6e653b22207372633d222f636f6e74656e742f303036636332306462356363633433353536323164623336353766653063326631303534353266313762326136323863376634613166653133373233323634356930223e3c2f696672616d653e0a20203c2f6469763e0a3c2f626f64793e0a3c2f68746d6c3e68
Does it make sense for this lib to handle this case?
If not, I can check for fd in my code and slice away the first 3 bytes.
I guess Tx.util.readWitness(input.witness).script always includes the size bytes (plural now). So at least it's deterministic and I can handle that.
And this is what ChatGPT said about even larger sizes:
If the length is 65,536 to 4,294,967,295 (0x10000 to 0xffffffff), it is represented as 0xfe followed by a 4-byte little-endian integer.
Example: If the witness script length is 70,000, it is encoded as 0xfe, 0x70, 0x11, 0x01, 0x00 (little-endian representation of 70,000, which is 0x00011170).
If the length is greater than or equal to 4,294,967,296 (0x100000000), it is represented as 0xff followed by an 8-byte little-endian integer.
Example: If the witness script length is 5,000,000,000, it is encoded as 0xff, 0x00, 0xf2, 0x05, 0x2a, 0x01, 0x00, 0x00, 0x00 (little-endian representation of 5,000,000,000, which is 0x000000012a05f200).
These encoding rules allow for a compact representation of the witness script length, adapting to the specific requirements of the length value. The encoded length is then followed by the actual witness script data in the transaction's witness field.
Ref: https://bitcoin.stackexchange.com/questions/110808/reference-to-segwit-raw-transation-format
CompactSize: serialization of an unsigned integer in a variable number of bytes:
0 ≤ n ≤ 0xFC: serialized as [n] directly (one byte).
0xFD ≤ n ≤ 0xFFFF: serialized as [0xFD] + LE16(n) (3 bytes).
0x10000 ≤ n ≤ 0xFFFFFFFF: serialized as [0xFE] + LE32(n) (5 bytes).
0x100000000 ≤ n ≤ 0xFFFFFFFFFFFFFFFF: serialized as [0xFF] + LE64(n) (9 bytes). This isn't actually used, as no structure this big would fit in a block.
This is more for my future reference, to this github issue.
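To double-check the examples against the CompactSize rules, here is a small encoder sketch. The function name encodeCompactSize is made up for this example and is not part of @cmdcode/tapscript. It uses BigInt so the 8-byte case works without precision loss.

```javascript
// Sketch: encode a length as a Bitcoin CompactSize prefix, returned as hex.
// encodeCompactSize is a hypothetical name, not a library function.
function encodeCompactSize(n) {
  const value = BigInt(n)
  if (value < 0n || value > 0xffffffffffffffffn) {
    throw new RangeError('CompactSize must fit in 64 bits')
  }
  // Little-endian hex of `value`, padded to `byteCount` bytes.
  const toLE = (byteCount) => {
    let hex = ''
    for (let i = 0n; i < BigInt(byteCount); i++) {
      const byte = (value >> (8n * i)) & 0xffn
      hex += byte.toString(16).padStart(2, '0')
    }
    return hex
  }
  if (value <= 0xfcn) return toLE(1)
  if (value <= 0xffffn) return 'fd' + toLE(2)
  if (value <= 0xffffffffn) return 'fe' + toLE(4)
  return 'ff' + toLE(8)
}

console.log(encodeCompactSize(42))         // 2a
console.log(encodeCompactSize(500))        // fdf401
console.log(encodeCompactSize(501))        // fdf501
console.log(encodeCompactSize(70000))      // fe70110100
console.log(encodeCompactSize(5000000000)) // ff00f2052a01000000
```

The 501 case gives fdf501, the same three bytes that open the script in txid b5cadf0f... above.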
Are you still having parsing issues?
Thank you for asking. I'm good after I wrote a function to remove the size bytes based on the spec (all cases of VarInt). My func is:
function removeSizeBytes(scriptAsHexString) {
  // Removes the leading size bytes from Tx.util.readWitness(input.witness).script
  // We must remove the size bytes before we .decode(script)
  // Size of the script (in bytes) is in "VarInt" format.
  // scriptAsHexString: Script in hexstring WITH LEADING SIZE BYTES.
  // https://github.com/cmdruid/tapscript/issues/33
  if (!scriptAsHexString) return
  // Examine the 1st byte (ea byte is 2 chars of a string):
  const firstByte = scriptAsHexString.slice(0,2).toLowerCase()
  // Rm first 3 bytes with: s.slice(6)
  if (firstByte === 'fd') return scriptAsHexString.slice(6)
  // Rm first 5 bytes: s.slice(10)
  if (firstByte === 'fe') return scriptAsHexString.slice(10)
  // Rm first 9 bytes: s.slice(18)
  if (firstByte === 'ff') return scriptAsHexString.slice(18)
  // If not 'fd', 'fe', or 'ff', then the 1st byte is the length of the script. Remove it.
  // Remove the 1st byte with: s.slice(2)
  return scriptAsHexString.slice(2)
}
Any feedback on this code? I'm not a fan of slice()'ing unsafely, but the input data should be clean because it came from Tx.util.readWitness()
I apologize for the late reply. I am not a fan of slicing unsafely either. I see that you are looking at the first byte, which should always be a varint if you are handling hex data from the witness. I think you are safe with this assumption.
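If the unchecked slice is still a worry, one option is to verify that the declared length matches the bytes that follow before returning them. This is only a sketch: stripSizePrefix is a hypothetical helper, not part of @cmdcode/tapscript, and it assumes the input is a hex string that starts with a CompactSize prefix.

```javascript
// Sketch: strip the CompactSize prefix and check it against the payload length.
// stripSizePrefix is a hypothetical helper, not a library function.
function stripSizePrefix(hex) {
  if (typeof hex !== 'string' || hex.length < 2 ||
      hex.length % 2 !== 0 || /[^0-9a-f]/i.test(hex)) {
    throw new TypeError('expected a non-empty, even-length hex string')
  }
  const marker = parseInt(hex.slice(0, 2), 16)
  // Number of length bytes that follow the marker byte.
  const extra = marker === 0xfd ? 2 : marker === 0xfe ? 4 : marker === 0xff ? 8 : 0
  const prefixChars = 2 + extra * 2
  if (hex.length < prefixChars) throw new RangeError('truncated size prefix')

  let declared = BigInt(marker)
  if (extra > 0) {
    // Length bytes are little-endian, so reverse them before parsing.
    const le = hex.slice(2, prefixChars).match(/../g)
    declared = BigInt('0x' + le.reverse().join(''))
  }
  const body = hex.slice(prefixChars)
  if (BigInt(body.length / 2) !== declared) {
    throw new RangeError(`size prefix says ${declared} bytes, found ${body.length / 2}`)
  }
  return body
}

console.log(stripSizePrefix('02abcd')) // abcd
```

A mismatch throws instead of silently returning a wrong slice, so a hex string that never had a size prefix gets caught rather than losing its first opcode.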
I like this library a lot, but I have hit a blocker.
Tx.decode() fails with this error for some transactions that have already been confirmed. The error message is always the same: 38133322 > 56518 for the txn in my repro, below. I notice this tends to happen with txns with many inputs, or many outputs.
Here's my repro script (node 19.9.0):
Output:
I hope this can be investigated because I would hate to port all my code to another lib. Please let me know if there's more info I can provide. Thank you