Closed koivunej closed 3 years ago
Build failed with the latest checkpoint-y commit but that is ok as work continues. I've removed the btp "differential fuzzing" in 68260fe. The timestamp parsing was found quite lenient while fuzzing the btp, so that'll be the next thing but I'm ready to call this good to go with the ccp huge preallocations fixed. I'll probably complete this work by creating a few issues on:
Vec::new()
Hardening doesn't seem to great with test-md failures. Lets see if I can decipher these.
It would appear that the "xrp-settlement" failed. From the artifact it'd appear that everything went smooth for alice but in node-bob-settlement-engine.log
:
$ less node-bob-settlement-engine.log
2021-01-29T09:58:02.213Z settlement-xrp Generated new XRP testnet account: address=rnGBkZRibgCZ6p3rVHHNVj2VmudXe3nedk secret=snscPxz9TQoDXMSN3fBekxaMk1Xt8
2021-01-29T09:58:02.723Z settlement-core Started settlement engine server
2021-01-29T09:58:12.944Z settlement-xrp Received incoming XRP payment: xrp=0.0005 account=5f3024b6-38d7-46aa-b32a-79cda329e30a txHash=19A0316F6C859266EDE0E2E4981A2AFEEB3DEE30622FF75EBF8F5AB4E1464182
2021-01-29T09:58:12.947Z settlement-core Notifying connector to credit settlement: amountToCredit=0.0005 account=5f3024b6-38d7-46aa-b32a-79cda329e30a settlementId=19A0316F6C859266EDE0E2E4981A2AFEEB3DEE30622FF75EBF8F5AB4E1464182
2021-01-29T09:58:12.955Z settlement-core Error: Connector failed to process settlement: amountToCredit=0.0005 account=5f3024b6-38d7-46aa-b32a-79cda329e30a settlementId=19A0316F6C859266EDE0E2E4981A2AFEEB3DEE30622FF75EBF8F5AB4E1464182
2021-01-29T09:58:12.955Z settlement-core Connector credited incoming settlement: leftover=0.0005 credited=0 amountToCredit=0.0005 account=5f3024b6-38d7-46aa-b32a-79cda329e30a settlementId=19A0316F6C859266EDE0E2E4981A2AFEEB3DEE30622FF75EBF8F5AB4E1464182
[ioredis] Unhandled error event: Error: connect ECONNREFUSED 127.0.0.1:6382
at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
[ioredis] Unhandled error event: Error: connect ECONNREFUSED 127.0.0.1:6382
at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
[ioredis] Unhandled error event: Error: connect ECONNREFUSED 127.0.0.1:6382
at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
The whole execution log from the log failure:
Testing "xrp-settlement" on source mode. [4/4]
rm: cannot remove '/home/runner/.interledger/bin/*': No such file or directory
Stopping Interledger nodes...
Building interledger.rs... (This may take a couple of minutes)
Finished dev [unoptimized + debuginfo] target(s) in 0.18s
Starting Redis instances...done
OK
OK
OK
OK
Starting settlement engines...
Starting Interledger nodes...
Failed to spin up settlement engine. Check out your configuration and log files.
Error running markdown file: ./README.md (parsed bash script /tmp/tmp.IqLUsVs1xo)
Waiting for Interledger.rs nodes to start up..............Some tests failed. [3/4]
I do wonder how come the line preceding Waiting for Interledger.rs nodes to start up
is after Failed to spin up settlement engine
. Also the failed to spin up settlement engine
comes right before exit 1
. Cannot see any subshell magic in the lib or run-md-test.sh
. Maybe the parser? Nope. The parser seems to just leave the necessary parts in and it is executed with bash -O expand_aliases "$TMP_SCRIPT"
where $TMP_SCRIPT
is tempfile.
Could be that the formatting issues are because of the strange output where \n
is used to prefix messages? Itching to get rid of these and just echo and variable substition these.
Need to get these running locally. I guess I have to spin up a vm. The scripts don't really look like they cleanup after themselves... So this could even be interference from previous test but probably not as these have been executing well.
Trying to bisect this with another branch.
Ok I guess the test is just flaky? Good three times (2 here, 1 in first step of bisection) in row, next one will be the third. Separating the CI stuff to a new PR.
Did the hopefully the final rebasing, made sure that the strict
is not enabled for the default cargo build -p ilp-node --all-features
builds -- enabling it would be in the spriti of OER notes (rfc 0030) but if someone is trying it against other implementations those might not be compatible.
test
failure which #701 would fix. Lets see what's the test-md
failure. It was the usual (see https://github.com/interledger-rs/interledger-rs/pull/693#issuecomment-808343941), now we should have packet capture. But the failure doesn't repeat today...
I'll merge this today evening or tomorrow after re-running it to see if there's some time-specific issue which could reproduce.
Dropped the Cargo.lock
files in the latest push, I cannot see any point in keeping them.
Test failure is "normal flaky", lets hope for a test-md
failure instead. "Normal flaky" error was:
thread 'three_nodes::three_nodes' panicked at 'error binding to 127.0.0.1:40675: error creating server listener: Address already in use (os error 98)', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/warp-0.2.5/src/server.rs:213:27
This is because the tests do not use :0
but instead hope racing to listen and close a port. This failure happens surprisingly rarely.
Now two passes after the "normal flaky." Lets try a third.
Issue has been created, seems that we'll need to find another PR to trigger the occasional test-md
failure.
This builds upon #690. Some found fixes have already landed with #695. It adds on top of any fixes in earlier already:
strict
to interledger-packet for accepting only varuints which have no leading zeroesinterledger-packet/strict
in btp, stream at leastvar_octet_bytes
can now only be usize wide, which should probably be specced lower to match realityVec::new
which does have a chance of perf hitinterledger-packet/strict
is useful for roundtrip fuzzing where input is first parsed and then converted back into bytes. Wheninterledger-packet/strict
is not enabled any additional leading zeros are accepted for varuints. Accepting them but not outputting them when converting the packet back to bytes creates a lot of false panics when fuzzing.strict
shouldn't be enabled in any production usage because it did look like the java implementation for example didn't have such strict checks on parsedBigInteger
s.Most of this work came out of #680 where at updating
interledger-stream
I ran into the bad length check. Fixing the length check right away failed a lot of existing test cases, which got me to do differential fuzzing between two versions. Differential fuzzing of course runs into all of the previously found issues with roundtrip fuzzing. ~I will probably drop the differential fuzzing before making this ready for review.~ Dropped already.Per crate fuzzing targets and status:
#[ignore]
tests#[ignore]
tests~85% of added lines are caused by 4 added Cargo.lock files.~