JoinColony / colonyNetwork

Colony Network smart contracts
https://colony.io/
GNU General Public License v3.0

Introduce tracing for front-end #1274

Closed kronosapiens closed 3 weeks ago

kronosapiens commented 1 month ago

Add support for transaction tracing, mostly for the front-end development environment.

Companion to colonyCDapp#2691

area commented 1 month ago

Added initialDate config option myself because I want to try and use this today :smile:

area commented 1 month ago

Hmmmm, any thoughts as to what's going on with the tests in this PR?

kronosapiens commented 1 month ago

@area I really don't have any idea. My first thought would be that adding the chains argument to hardhat.config is somehow causing ECONNRESET, but I'm not sure what the mechanism would be there.

kronosapiens commented 1 month ago

@area the culprit seems to be here: https://github.com/JoinColony/colonyNetwork/pull/1274/files#diff-cf4ef7c51dc9f81cad1d504da0d1c3a3437ac7b7d1374ee7127886cf1d1a5092R17

Haven't gotten to the bottom of why yet -- something about how the plugin gets initialised that affects port behavior? Perhaps you may have some insight.

area commented 1 month ago

So the underlying issue is some sort of timeout somewhere in the stack, of the order of 20-30 seconds. The main call that's causing it to trigger is a confirmNewHash call, which does quite a lot (deploys new contracts, a lot of chatter between contracts), and from looking at top while trying to make such a call the CPU and memory usage balloon when it's happening.

I think by including hardhat-tracer, the node ends up doing a lot of extra work that we never use, tracing the call as it's made. If I make multiple estimateGas calls for the call at the same time, several will time out after 20-30 seconds (and do so simultaneously, within milliseconds of each other, yet between batches there's a much larger variance). I've added a ternary (not an if statement, much to my despair, because of eslint) such that, if the node command is being used, we won't load hardhat-tracer, and so performance returns to where it was.
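A minimal sketch of the kind of ternary described above, assuming the decision is made from the command-line arguments. The helper name `shouldLoadTracer` and the `pluginsToLoad` list are hypothetical and are not taken from the PR; the actual config may detect the node command differently.

```javascript
// Hypothetical helper (not from the PR): true unless the Hardhat task being run is `node`.
function shouldLoadTracer(argv) {
  // argv[0] is the Node binary and argv[1] the script path, so only inspect the rest.
  return !argv.slice(2).includes("node");
}

// A ternary rather than an if statement; evaluates to the list of plugins to require.
const pluginsToLoad = shouldLoadTracer(process.argv) ? ["hardhat-tracer"] : [];

// In a real hardhat.config.js one would then do something like:
//   pluginsToLoad.forEach((name) => require(name));
console.log(pluginsToLoad);
```

With this shape, `npx hardhat node` skips the plugin, while other invocations still load it.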

I do wonder if our remaining spottiness is as a result of this (to some extent), and we're on-the-edge, so occasionally we are hit by the timeout if we get a particularly poorly performing instance on Circle. If that's the case, I would expect using RetryProvider everywhere we can to improve reliability of the tests (and I am intending to do this).
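To illustrate the general idea of retrying a call that occasionally times out, here is a minimal sketch. It is not the project's RetryProvider, whose behaviour is not shown in this thread; the function name `withRetry`, the attempt count, and the delay are all assumptions.

```javascript
// Hypothetical retry wrapper; the real RetryProvider may behave differently.
async function withRetry(fn, attempts = 3, delayMs = 100) {
  let lastError;
  for (let i = 0; i < attempts; i += 1) {
    try {
      // Return as soon as one attempt succeeds.
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A caller would wrap the flaky operation, for example `await withRetry(() => someFlakyCall())`, where `someFlakyCall` stands in for whatever request is timing out.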

From my testing, the node does not need to have the plugin for tracing to work, so there are no knock-on effects in terms of https://github.com/JoinColony/colonyCDapp/pull/2691, which should continue to work as is. If we started to want to use programmatic access to traces in our tests, I could see that being a problem, but for now, I'm hopeful this is sufficient.

area commented 4 weeks ago

Okay, now I don't know what's going on. This branch and my relayer branch are now failing in the same way but develop is fine.

Am I cursed? I can't really see any common changes between the two. I have a vague solution but I'd really rather understand what's going on.

area commented 4 weeks ago

I'm fairly sure it's running out of available memory, but I'm stuck as to why. I don't believe it's a memory leak, and it doesn't seem to be due to leftover data in the reputationStates SQLite DB.

kronosapiens commented 4 weeks ago

Going back to the import style, would it be possible to do something along the lines of

task("trace", "Run trace").setAction(async (taskArgs, hre, runSuper) => {
  require("hardhat-tracer"); // eslint-disable-line global-require

  await runSuper();
});

Not sure if it'll solve the issue but more explicitly isolating the import might help

area commented 4 weeks ago

I can't see how that would change things. Feel free to try, but I don't think it will solve it given the remaining test failure we are also seeing on an unrelated branch?

area commented 3 weeks ago

Hmmmmm. So that change means that it's not including tracer when we're doing hardhat test or hardhat coverage, and if we do include it we end up with what is still, at a guess, an out-of-memory issue.