hyperledger / firefly

Hyperledger FireFly is the first open source Supernode: a complete stack for enterprises to build and scale secure Web3 applications. The FireFly API for digital assets, data flows, and blockchain transactions makes it radically faster to build production-ready apps on popular chains and protocols.
https://hyperledger.github.io/firefly
Apache License 2.0

Intermittent e2e test failures when adding a new namespace #1545

Open EnriqueL8 opened 2 months ago

EnriqueL8 commented 2 months ago

An e2e test fails intermittently, for example in this run: https://github.com/hyperledger/firefly/actions/runs/10045043332/job/27761545128?pr=1544

After investigating, I found a race condition in the test between Docker and test execution. As part of the test, it updates the FF config to add a new namespace; that file is mounted into the FireFly container in Docker. After adding the namespace, the test calls the /spi/v1/reset API so that FF restarts with the new config. If the volume mount hasn't picked up the new configuration by then, FF does not start the new namespace, and the test fails when checking that namespace's status, as shown here:

    restclient.go:109: 2024-07-22T17:01:05.89518895Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:10.89563318Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:15.896077617Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:20.900465937Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:25.904224656Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:30.905116092Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:35.909481946Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:40.910588275Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:45.911013176Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:50.911431512Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:01:55.913719974Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    restclient.go:109: 2024-07-22T17:02:00.918162459Z: ==> GET /namespaces/e2e_65737b8a1b-C/status map[]: null
    e2e.go:71: 
            Error Trace:    /home/runner/work/firefly/firefly/test/e2e/e2e.go:71
                                        /home/runner/work/firefly/firefly/test/e2e/multiparty/multi_tenancy.go:117
            Error:          Received unexpected error:
                            Get "http://127.0.0.1:5001/api/v1/namespaces/e2e_65737b8a1b-C/status": dial tcp 127.0.0.1:5001: connect: connection refused

EnriqueL8 commented 3 weeks ago

I have hit this again multiple times. I think the ideal fix would be for the test to rely on the config reload watcher instead of calling this reset API, and to wait for things to spin back up. We deprecated the reset API a while back.