Closed dwalintukan closed 5 years ago
@cgewecke: Some more analysis and an exact pinpoint of the problem (a solution if you will):
I added a counter to test whether the number of open HTTP requests grows along with the number of waiting ports.
I increment this counter after `request.send(JSON.stringify(payload))` and decrement it in `request.onreadystatechange` (upon `request.readyState === 4`).
The counter is zero when the failure occurs, which means that there are no open requests at that point (in contrast with the number of waiting ports).
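For future readers, the instrumentation described above can be sketched as follows (the wrapper name and the counter variable are my own; in the experiment the two statements were patched directly into web3's provider code):

```javascript
// Hypothetical sketch: track the number of in-flight HTTP requests by
// incrementing on send() and decrementing when readyState reaches 4 (DONE).
let openRequests = 0;

function sendTracked(request, payload) {
  const userHandler = request.onreadystatechange;
  request.onreadystatechange = function () {
    if (request.readyState === 4) openRequests--; // response fully received
    if (userHandler) userHandler.call(request);
  };
  openRequests++; // one more request in flight
  request.send(JSON.stringify(payload));
}
```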
I found a post suggesting that the `TIME_WAIT` period is configurable in the OS, but that would be an OS-dependent solution, which I'd really hate.
I have done some reading on the `XMLHttpRequest` object, to see if I could somehow use it to signal the system that the request is done and that the port can be closed.
I haven't found any such option (in fact, I don't think "closing an HTTP connection" is really defined as such in the HTTP standard).
I did notice, however, that for asynchronous requests you are using `XHR2` instead of `XMLHttpRequest`.
I'm not sure about the difference between the two; I only understand that the former is a Node.js wrapping of the latter (which is a "JavaScript native type").
Nevertheless, when I change the code to use `XMLHttpRequest` instead of `XHR2`, the test runs to completion!!!
Oddly enough, when the test is done, there are still some 16,000 ports in `TIME_WAIT` state.
However, this time, in addition to (something like) this:
TCP 127.0.0.1:49155 127.0.0.1:8545 TIME_WAIT 0
TCP 127.0.0.1:49157 127.0.0.1:8545 TIME_WAIT 0
TCP 127.0.0.1:49165 127.0.0.1:8545 TIME_WAIT 0
...
TCP 127.0.0.1:65532 127.0.0.1:8545 TIME_WAIT 0
TCP 127.0.0.1:65533 127.0.0.1:8545 TIME_WAIT 0
TCP 127.0.0.1:65534 127.0.0.1:8545 TIME_WAIT 0
I also see (something like) this:
TCP 127.0.0.1:8545 127.0.0.1:49152 TIME_WAIT 0
TCP 127.0.0.1:8545 127.0.0.1:49153 TIME_WAIT 0
TCP 127.0.0.1:8545 127.0.0.1:49154 TIME_WAIT 0
...
TCP 127.0.0.1:8545 127.0.0.1:65531 TIME_WAIT 0
TCP 127.0.0.1:8545 127.0.0.1:65534 TIME_WAIT 0
TCP 127.0.0.1:8545 127.0.0.1:65535 TIME_WAIT 0
I'm not sure exactly why the test completes, or whether we can even consider replacing `XHR2` with `XMLHttpRequest` a solution (though it does seem like a good workaround at the very least).
But I think we should focus our investigation on the difference between these two.
Thanks.
@barakman Great work! So glad you got that suite running.
Was also googling around about this yesterday and saw a thread suggesting another possibility: pass a special header into the request telling it to close the connection when done, since the default behavior for HTTP is keep-alive. Example:
var options = {
    host: 'graph.facebook.com',
    port: 80,
    path: '/' + fb_id + '/picture',
    headers: { 'Connection': 'Close' }
};
The relevant web3 code is here.
Truffle invokes that constructor at `truffle-provider` here. If the problem can be addressed by adding headers there, we'd be able to fix this directly.
If not, it's quite a bit more complicated: `web3` is a library written and maintained by the Ethereum Foundation. We consume (rather than write) it, and it's non-trivial to get the code changed there (for good reason, since that code drives much of the Ethereum JS ecosystem).
If you're still investigating this and have a chance, could you see if setting the headers that way also resolves this?
@cgewecke: Thank you!
You may want to inform the `web3` authors / contributors of the `XHR2` findings; I am inclined to think that it might impact other open issues.
As for the header configuration: I don't use `web3` directly in my project (I suppose `truffle test` relies on it anyway). The `web3` class is globally available in all of my tests (not sure whether because of Mocha or because of Truffle), so I'm not quite sure how or where to add this configuration.
Is it possible to add it in the Truffle configuration file?
If not, how else can I go about applying it?
Writing it in every test seems like overkill.
Nevertheless, I will try it on the specific case at hand and let you know if it solves it.
Am I understanding you correctly, that you just want to see if it resolves the problem, so that you can fix Truffle accordingly?
I'm not entirely sure how to add it, either in my test or in Truffle's `cli.bundled.js`.
Would it be sufficient to change this:
provider = new Web3.providers.HttpProvider("http://" + options.host + ":" + options.port);
To this:
provider = new Web3.providers.HttpProvider("http://" + options.host + ":" + options.port, 0, '', '', [{name: 'Connection', value: 'Close'}]);
in file `cli.bundled.js`?
Update:
For the code fix above, I get a message from Truffle (or from the Ethereum client):
Refused to set unsafe header "Connection"
I Googled it, and found this StackOverflow answer and this Web3 GitHub thread.
Do you have another suggestion?
Thanks.
You can work around the `Refused to set unsafe header "Connection"` error as follows: in the `XMLHttpRequest.prototype._restrictedHeaders` object, remove the `connection` key or change its value from `true` to `false`.
However, the bottom-line result remains unchanged (i.e., the initial problem persists).
@barakman Ah no, sorry I don't - I guess that's a dead end. Hmmmm.
@cgewecke:
So the only option currently at hand is to hook into `package.json` a script which modifies the Truffle source code, and have that script run after `npm install` and before `npm test`?
@cgewecke:
BTW (and yet again), there seem to be several different versions of `HttpProvider.prototype.prepareRequest` "bundled together" in the same Truffle package.
One of them actually uses an `XMLHttpRequest` object for asynchronous requests, which is how we'd like it to be.
The way I see it, there are two options here:
1. Web3 introduced the use of `XHR2` some time ago.
2. Web3 revoked the use of `XHR2` some time ago.
The first case might make it easier to push towards reverting this change, which seems harmful. The second case is even better: simply move Truffle to use the newer version of Web3.
See below the various occurrences of `HttpProvider.prototype.prepareRequest` in the code.
Occurrence 1:
HttpProvider.prototype.prepareRequest = function (async) {
var request;
if (async) {
request = new XHR2();
request.timeout = this.timeout;
} else {
request = new XMLHttpRequest();
}
request.open('POST', this.host, async);
request.setRequestHeader('Content-Type','application/json');
return request;
};
Occurrence 2:
HttpProvider.prototype.prepareRequest = function (async) {
var request = new XMLHttpRequest();
request.open('POST', this.host, async);
request.setRequestHeader('Content-Type','application/json');
return request;
};
Occurrence 3:
HttpProvider.prototype.prepareRequest = function (async) {
var request;
if (async) {
request = new XHR2();
request.timeout = this.timeout;
} else {
request = new XMLHttpRequest();
}
request.open('POST', this.host, async);
request.setRequestHeader('Content-Type','application/json');
return request;
};
Occurrence 4:
HttpProvider.prototype.prepareRequest = function (async) {
var request;
if (async) {
request = new XHR2();
request.timeout = this.timeout;
} else {
request = new XMLHttpRequest();
}
request.open('POST', this.host, async);
if (this.user && this.password) {
var auth = 'Basic ' + new Buffer(this.user + ':' + this.password).toString('base64');
request.setRequestHeader('Authorization', auth);
}
request.setRequestHeader('Content-Type', 'application/json');
if (this.headers) {
this.headers.forEach(function(header) {
request.setRequestHeader(header.name, header.value);
});
}
return request;
};
Thanks
@barakman Which version of truffle are you using? I will track that down and if this can be fixed by normalizing web3 versions will do that ASAP.
@cgewecke:
At present, I am using Truffle v4.1.3, with my Solidity contracts under v0.4.18.
I am planning to move to Truffle 4.1.5 as soon as I have an idle slot, but that will force me to upgrade my Solidity contracts to v0.4.23, and due to the syntactical changes (namely `emit`, `constructor`, and the deprecation of `var`), that idle slot will have to be a little wider than what it would take to just change the Truffle version in `package.json`.
In short, I will be happy if this change (if indeed applicable) becomes available on Truffle v4.1.3, but Truffle v4.1.5 will also do just fine.
Thanks again for all your help!
@cgewecke:
Of course, it still needs to be asserted that this fix is not just some coincidental result due to the "timely nature" of the problem (i.e., we must be able to explain it based on the functional difference between `XHR2` and `XMLHttpRequest`).
@cgewecke:
A satisfactory proof:
In the `HttpProvider.prototype.sendAsync` function, I added `console.log(request.getAllResponseHeaders())` upon response (in the `onreadystatechange` callback function).
When the `HttpProvider.prototype.prepareRequest` function uses `XHR2`, the printout is of the form:
content-type: application/json
vary: Origin
date: ...
content-length: ...
When the `HttpProvider.prototype.prepareRequest` function uses `XMLHttpRequest`, the printout is of the form:
content-type: application/json
vary: Origin
date: ...
content-length: ...
connection: close
@barakman
- Web3 has introduced the use of XHR2 some time ago.
- Web3 has revoked the use of XHR2 some time ago.
Unfortunately it looks like case 1 is true. XHR2 is used in the latest web3 0.x as well as web3 1.0. I have also tried running your reproduction case using web3 1.0 over websockets, without luck.
This issue raises questions about whether web3 / truffle / ganache are really suited to running simulations with tens of thousands of calls. There might be significant value in building a tool that ran tests directly on top of ethereumjs-vm, or perhaps inside ganache, avoiding http overhead and other constraints.
@cgewecke:
I did a little reading, and it seems that connections are closed by default in HTTP 1.0 and kept alive by default in HTTP 1.1. I'm guessing that `XMLHttpRequest` supports HTTP 1.0 while `XHR2` supports HTTP 1.1, so it makes sense that Web3 switched from `XMLHttpRequest` to `XHR2` and not vice versa.
As regards the second part of your comment, please note that I experienced the same problem when using `solidity-coverage` along with `testrpc-sc`. And as far as I understand, those two are designed specifically for "running simulations with tens of thousands of calls" (how else would you achieve complete coverage of your contracts?).
For now, I have added the following workaround on my system:
1. Next to `package.json`, added file `fix-truffle.js`:

const FILE_NAME = "./node_modules/truffle/build/cli.bundled.js";
let fs = require("fs");
let oldData = fs.readFileSync(FILE_NAME, {encoding: "utf8"});
let newData = oldData.replace(/new XHR2/g, "new XMLHttpRequest");
fs.writeFileSync(FILE_NAME, newData, {encoding: "utf8"});

2. In file `package.json`, added:

"scripts": {
    "install": "node fix-truffle.js"
}
Thanks.
@cgewecke - just to finalize this issue (also for future readers):
The fix suggested above indeed seems to resolve the `Could not connect to your Ethereum client` problem discussed in this thread.
However, it exposes yet another problem:
Invalid JSON RPC response: "Error: socket hang up
at createHangUpError (_http_client.js:331:15)
at Socket.socketOnEnd (_http_client.js:423:23)
at emitNone (events.js:111:20)
at Socket.emit (events.js:208:7)
at endReadableNT (_stream_readable.js:1056:12)
at _combinedTickCallback (internal/process/next_tick.js:138:11)
at process._tickCallback (internal/process/next_tick.js:180:9)"
at ProviderError.ExtendableError (C:\Users\...\webpack:\~\truffle-error\index.js:10:1)
at new ProviderError (C:\Users\...\webpack:\~\truffle-provider\error.js:17:1)
at C:\Users\...\webpack:\~\truffle-provider\wrapper.js:71:1
at C:\Users\...\webpack:\~\truffle-provider\wrapper.js:129:1
at exports.XMLHttpRequest.request.onreadystatechange (C:\Users\...\webpack:\~\web3\lib\web3\httpprovider.js:128:1)
at exports.XMLHttpRequest.dispatchEvent (C:\Users\...\webpack:\~\xmlhttprequest\lib\XMLHttpRequest.js:591:1)
at setState (C:\Users\...\webpack:\~\xmlhttprequest\lib\XMLHttpRequest.js:610:1)
at exports.XMLHttpRequest.handleError (C:\Users\...\webpack:\~\xmlhttprequest\lib\XMLHttpRequest.js:532:1)
at ClientRequest.errorHandler (C:\Users\...\webpack:\~\xmlhttprequest\lib\XMLHttpRequest.js:459:1)
at Socket.socketOnEnd (_http_client.js:423:9)
at endReadableNT (_stream_readable.js:1056:12)
at _combinedTickCallback (internal/process/next_tick.js:138:11)
at process._tickCallback (internal/process/next_tick.js:180:9)
This problem seems to be of the following nature: a "massive" test completes successfully, but only when it takes place does the next test emit this error (immediately when it begins). That should give some hints, though I'm not sure what. It seems that the "massive" test does not release a socket that has been held for a long period (cutting the test shorter resolves the problem).
I believe that a possible fix for this problem lies in the `XMLHttpRequest` code, around the area of:
request = doRequest(options, responseHandler).on("error", errorHandler);
Perhaps there's a missing handler for this request, for its socket, for its response, or for its response's socket.
In either case, I have not been able to resolve it. Most of my attempts focused on searching the Node.js HTTP API for functions and/or events which might be applicable here.
A simple workaround for this problem is to execute `truffle test` separately for each test file.
In other words, closing and reopening Truffle solves the problem, which implies that some resource (a socket?) is not released until Truffle is closed.
Unfortunately, this workaround is insufficient for `solidity-coverage` users (myself among them), since that utility cannot be executed separately for each test file.
If someone can find a way to apply this ("close and reopen after every test file") in Truffle source code itself, then it might be a good solution.
I tried that too (in the `Test.run` function, at the line `js_tests.forEach(function(file)...`), but couldn't quite get it to work.
@cgewecke:
I have managed to fix (or, if you will, find a workaround for) the `socket hang up` issue described above, which emerged after I had resolved the original issue (by replacing `XHR2` with `XMLHttpRequest`).
As mentioned before, this `socket hang up` error seems to be pretty consistent in that it happens only at the end of a massive test (or perhaps at the beginning of the test that follows).
A deeper investigation has shown that it always happens as a result of a request consisting of `payload.method === 'evm_revert'`, to which the response is an error message (and obviously invalid JSON).
A glimpse at the Ganache source code reveals that `evm_revert` is indeed executed at the end of each test (using `afterEach`).
Though I don't have any real evidence to support this, I think it is possibly because an `evm_revert` executed after a massive test takes a very long time to complete, during which the connection times out.
By the way, the status of this response is 0. I previously bumped into some GitHub thread discussing why you've decided not to ignore status 0 in Truffle (the reason being that a test might fail silently, if I remember correctly). I can't find that thread now, but you were in it, so you might find the remainder of this comment relevant.
In any case, in order to work around the `socket hang up` error, I simply changed the Truffle source code to ignore an error in the response if the request's `payload.method` is `evm_revert`.
Since `evm_revert` is not really part of any test which I could possibly run on Truffle, I am confident that this fix cannot do any harm, for example (yet again), allow a test to fail silently.
Here is the extended workaround (for both problems), for any future readers:
1. Next to `package.json`, add file `fix-truffle.js`:

let FILE_NAME = "./node_modules/truffle/build/cli.bundled.js";
let TOKENS = [
    {prev: "request = new XHR2", next: "request = new XMLHttpRequest"},
    {prev: "error = errors.InvalidResponse", next: "error = payload.method === 'evm_revert' ? null : errors.InvalidResponse"}
];
let fs = require("fs");
let data = fs.readFileSync(FILE_NAME, {encoding: "utf8"});
for (let token of TOKENS) {
    data = data.replace(new RegExp(token.prev, "g"), token.next);
    console.log(`replaced "${token.prev}" with "${token.next}"`);
}
fs.writeFileSync(FILE_NAME, data, {encoding: "utf8"});
2. In file `package.json`, add:
"scripts": { "install": "node fix-truffle.js" }
Thanks
**UPDATE:**
It seems that even if a `socket hang up` error which occurs as a result of an `evm_revert` request at the end of a test is resolved (by ignoring it), a similar error may then occur as a result of an `evm_snapshot` request at the end of the next test.
We can slightly extend the workaround above to handle both cases, by changing this:
payload.method === 'evm_revert'
To this:
payload.method.startsWith('evm')
As `evm` requests are not something likely to be invoked directly from a testing script, I think that this extension is quite safe (i.e., will not cast away "real" errors in a given test).
However, generally speaking, I get the feeling that while Ganache takes a very long time to complete these requests in some cases (more specifically, after a massive test is conducted), the connection is simply (and abruptly) terminated.
The fact that restarting `truffle test` resolves this issue, implies that even if it is "Ganache's fault" (for taking so long to complete), it is "Truffle's fault" in handling it.
I am not very "happy" with the workaround proposed above, and I believe that a better approach would be to:
1. Investigate why Ganache takes so long to complete `evm_revert` and `evm_snapshot`.
2. Investigate why Truffle "has a problem" with the fact that Ganache takes so long to do it.
**UPDATE 2:**
For safety, extend this:
payload.method.startsWith('evm')
To this:
typeof payload.method === 'string' && payload.method.startsWith('evm')
Or even to this:
payload.method === 'evm_revert' || payload.method === 'evm_snapshot'
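The guard variants above can be expressed as a small predicate (the function name is hypothetical; in the actual workaround the check is inlined into `cli.bundled.js`). This sketch uses the safest variant, matching only the two explicit methods:

```javascript
// Decide whether a failed JSON-RPC response should be ignored, per the
// safest variant above: only explicit evm_revert / evm_snapshot requests.
function shouldIgnoreError(payload) {
  return payload.method === 'evm_revert' || payload.method === 'evm_snapshot';
}
```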
@barakman Thanks so much. The workaround you've proposed seems reasonable to me. There might be some kind of connection timeout at the HTTP layer - I've also seen this disconnection when running long Solidity loops that validate bytecode in a `call`.
@barakman Out of curiosity, would making `revert` and `snapshot` optional help with your use case?
@cgewecke:
Thank you.
I assume that the purpose of these two functions is to reset the EVM emulation back to an initial state, so that each one of the tests executed by Truffle starts under the exact same conditions, regardless of the order in which the tests are executed (and of course, the exact same conditions will continue to apply every time you invoke `truffle test`).
All of this is intended to ensure deterministic execution, I assume, so making these functions optional probably runs contrary to correct testing methodology.
That said, since it would be optional, I guess there's no harm done (i.e., Truffle users can choose it at their own risk).
Furthermore, I've already added an npm post-install script to fix the Truffle source code, so I'm not in any dire need of this feature (though I suppose I'll have to do some maintenance work on that script every time I update the Truffle version, so perhaps it WILL help me in the future).
It would help for sure if you could check with the Ganache developers what might cause the execution of `evm_revert` and `evm_snapshot` to be so lengthy.
Thank you for your help.
@barakman
It would help for sure if you could check with the Ganache developers what might cause the execution of evm_revert and evm_snapshot to be so lengthy.
I will. In your current suite, approximately how many blocks are being snapshotted / reverted?
@cgewecke:
I have a total of 27 tests, so each one of these functions is invoked 27 times if that's what you mean.
Otherwise, can you please elaborate on what you mean by "how many blocks"?
Should I use web3
in order to get the block-number at the beginning and end of my longest test, and calculate the difference?
Apologies @barakman - yes you could do that or estimate the number of transactions that occur in the suite, since ganache executes a single tx per block.
I'd just like to give the ganache engineers some guidance about what magnitude of tests triggers this.
@cgewecke: Just by looking at the code, I estimate that:
- The run in which the `evm_revert` request fails executes approximately 16,943 RPCs.
- The run in which the `evm_revert` request fails and the `evm_snapshot` request of the following test also fails executes approximately 28,954 RPCs.
I could give you more accurate figures by getting the block number before and after, but that would take a while (each of these runs lasts about 15-20 minutes or so).
Thanks
That's perfect, thanks @barakman.
Will a universal fix be available any time soon?
I get random `Error: CONNECTION ERROR: Couldn't connect to node http://127.0.0.1:7545/` errors when I run `truffle test` too (I have 39 tests, 3 of which fail for that reason).
In May everything was still okay; today it's not :(
@vicnaum Could you provide more detail about your suite or a link to project? At the moment we think this error is limited to very large suites. The principal reporter above has a battery of 50,000 tests.
Do the same 3 tests fail each time?
@cgewecke it's always different tests. Can be only one test failing, but can be at most five. Usually near three. I'm using Windows 10.
The sources are here: https://github.com/vicnaum/hourlyPay
@cgewecke: The error specified by vicnaum (connection error) does not seem to have any relation whatsoever to the issue described in this thread, which appears to be the result of limited resources (more precisely, the system runs out of HTTP connections).
@vicnaum I think @barakman is correct - I looked through the `hourlyPay` code a bit and see you're using a lot of methods to move time around on the chain. Would you like to open a separate issue so we can investigate further?
`ganache-cli` shouldn't disconnect from Truffle under any circumstances, so this is likely a bug. Could you post the entire contents of your error and stack trace as well?
@cgewecke & @barakman & others having this issue: I haven't dug into this too deeply, but my guess is that either Truffle or the tests in question are creating new instances of `provider` very frequently.
Optimal resource management would take advantage of HTTP keep-alive by reusing provider instances between tests rather than recreating them.
I can say from experience that sending `Connection: close` in the request, or explicitly closing the client socket, only kicks the can down the road: you'll still exhaust the local address space due to ports sitting in `FIN_WAIT`.
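A minimal sketch of the provider-reuse suggestion above (the function names are mine; the actual provider construction depends on your setup):

```javascript
// Memoize the provider so every test shares one instance (and therefore one
// keep-alive socket pool) instead of opening fresh connections per test.
let sharedProvider = null;

function getProvider(createProvider) {
  if (sharedProvider === null) sharedProvider = createProvider();
  return sharedProvider;
}
```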
@benjamincburns Yes, it turns out this originates in web3, and they're fixing it in `beta.36`.
(It was keep-alive; see the change.)
Closing this since it seems to have been addressed as a duplicate of the issue above. Let us know if it's still a problem. Thank you!
@gnidan:
AFAIK, this is still a problem in Truffle 4.1.15 (which still uses `XHR2` instead of `XMLHttpRequest`).
In Truffle 5.x this is possibly fixed, since this part of the code has changed, though I haven't verified that, as it requires a bit of work on both my contracts and my tests.
To my understanding, you released 4.1.15 specifically for this reason (i.e., for those who aren't rushing to upgrade their Solc and Web3 major versions).
So you might want to keep this issue open until it is fixed in the Truffle 4 branch (or at least leave a note somewhere mentioning that this problem is as alive as ever).
Thanks
Hey, FYI: I'm having this issue with Truffle v5.3.3 (core: 5.3.3); web3 in my project is ^1.2.6. It only popped up recently, with 431 tests and a good density of calls per test.
It seems like it might be connected to a recently added assert helper function: this is the first time that tests have been expected to perform any logic after the return of this (awaited, of course) function in the middle of the test functions.
async function assertReverts(promise, errorMessage = "") {
    try {
        await promise;
    } catch (error) {
        // indexOf() returns -1 when the substring is absent, so the result
        // must be compared explicitly (a bare indexOf() result is truthy
        // for every position except 0).
        assert(error.toString().indexOf("VM Exception while processing transaction: revert") !== -1, "Expected VM revert error");
        assert(error.toString().indexOf(errorMessage) !== -1, `Expected error: "${errorMessage}", actual error: "${error}"`);
        return;
    }
    assert.fail('Expected VM revert :: ' + errorMessage);
}
Is anyone aware of anything recently changed that could cause this to reappear?
Issue
On an Ubuntu Linux environment (Trusty), tests randomly fail with this `ExtendableError`:
Specifically, I have a Travis-CI (continuous integration) setup, and this is where the tests are failing. My local Mac OSX environment passes these tests with no problem. Every once in a while they will fail with the same error, but I just run the tests again and they pass.
I'd say it happens about 10-15% of the time on Mac OSX, but about 60-80% of the time on the Travis-CI Linux env.
It feels like this error occurred less often on earlier Truffle versions. I just updated to 4.0.4 and it seems much more frequent now.
Steps to Reproduce
Expected Behavior
Tests should pass like they do on Mac OSX env.
Actual Results
I test this on my local machine (Mac OSX); when all tests pass (which they do), I push up to GitHub. Then Travis-CI fires off a test run on the Linux env, and it fails pretty much every time.
Environment
Travis-CI Env (fails)
Mac OSX Env (passes)
$ gcc --version: