Closed ianw1974 closed 6 years ago
This issue has not been reproducible with the latest update. Can you expand on your environment?
I expect you didn't wait long enough. As explained, there is no fixed timescale for this occurring; it's random and can happen after a few hours, one day, or two days.
How to replicate:
/usr/local/bin/ghostnodelist.sh
#!/bin/bash
nix-cli ghostnode list
/usr/local/bin/nix-info.sh
#!/bin/bash
nix-cli -getinfo
The last script can call getnetworkinfo instead of -getinfo; it makes no difference which.
*/15 * * * * root /usr/local/bin/ghostnodelist.sh
*/60 * * * * root /usr/local/bin/nix-info.sh
Then wait for it to fail. This can be replicated on any server, so it's not specific to any environment.
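Because the failure is intermittent, it can help to record exactly when the first failed call happens. The following is a minimal sketch of a wrapper that the cron jobs above could call instead of invoking nix-cli directly. The names `probe`, `NIX_CLI` and `PROBE_LOG` and the 60-second limit are assumptions made for illustration; they are not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical helper: run one RPC call and append a timestamped result line.
# NIX_CLI and PROBE_LOG are assumed names, not part of the original scripts.
NIX_CLI="${NIX_CLI:-nix-cli}"
PROBE_LOG="${PROBE_LOG:-/tmp/nix-probe.log}"

probe() {
    # "$@" is the RPC call to make, e.g.: probe ghostnode list
    local started output status
    started=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    # timeout(1) from coreutils stops a hung call after 60 seconds (assumed limit).
    output=$(timeout 60 "$NIX_CLI" "$@" 2>&1)
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "$started OK $*" >> "$PROBE_LOG"
    else
        # Keep only the first line of the error so the log stays easy to grep.
        echo "$started FAIL($status) $*: ${output%%$'\n'*}" >> "$PROBE_LOG"
    fi
    return "$status"
}

# Example use inside ghostnodelist.sh or nix-info.sh:
#   probe ghostnode list
#   probe -getinfo
```

The first FAIL line in the probe log then gives a timestamp that can be compared against debug.log.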
Version tested and failed
nix-cli -getinfo
{
"version": 2000300,
If you need more info, please ask for exactly what you require. The problem still exists, and the links in the original post provide the fix, so it just needs to be applied to your codebase.
Not reproducible so far after running for 1 day straight with faster cron jobs. The code you linked also does not provide any fixes for NIX; what those commits address is already fixed in NIX. You need to be more specific about your environment: how you are running NIX, what your conf contains, etc.
I wrote above that it can take up to two days; testing for only 1 day confirms you ignored what I wrote. Increasing the frequency won't make it fail any earlier; it happens at random. I already explained my environment above. nix.conf only has the standard rpcuser and rpcpassword.
And the above links have the fix: Syscoin fixed it by fixing the locking issues, which is exactly this problem, but you fail to acknowledge it. Just like when I reported it via Discord. A waste of my time. You can close the issue; I'm not interested in helping fix this when you can't be bothered to read the above and test properly.
I challenge you to look at our code and find where there is a locking issue that is fixed by the commits you provided (hint: you can't). You are not giving enough info on your problems. The fact that there are hundreds of ghostnodes running with zero issues, and many providers tracking diagnostics, proves that this issue is minor and due to certain environments that you keep creating without giving enough info. You need to file an issue that we can actually attempt to help solve, for your own sake. Linking code that is irrelevant to our codebase doesn't help anyone; if you don't believe me, take a few seconds to compare the code you linked instead of just linking it, if you can understand what it all says. Closed for lack of understanding of how to properly open an issue; the problem is your environment.
Hello NIX team, I am bringing this back to life as I am seeing the same results. The issue occurs at random, anywhere from every 2 hours to every 2 days. I currently have to kill and reindex the whole wallet to keep the stats active on MasterNodes.Pro. We are looking into processing the data another way to reduce the number of API calls to the daemon, but our current approach, which works with all other wallets, has an issue with this one. I thought you would like to know.
nixd becomes unresponsive after a certain amount of time when nix-cli/RPC calls are made. The timescale is not always the same: it can happen even 20 minutes after starting nixd; in most cases it fails 1 to 3 times in a 24-hour period, and in some cases it can run for up to two days without issues. The following types of calls are made:
Every 15 minutes: nix-cli ghostnode list full
Every hour: nix-cli -getinfo (can also be substituted with one of, or a combination of, getblockchaininfo and getnetworkinfo)
The number of calls being made is neither unreasonable nor excessive. The following behaviour has been noted.
nix-cli -getinfo
error: couldn't parse reply from server
nix-cli getblockchaininfo
error: couldn't parse reply from server
nix-cli getnetworkinfo
error: couldn't parse reply from server
2018-09-14 02:20:02 socket sending timeout: 1201s
2018-09-14 02:20:02 socket sending timeout: 1201s
2018-09-14 02:20:02 socket sending timeout: 1201s
2018-09-14 02:20:02 socket sending timeout: 1201s
2018-09-14 02:20:02 socket sending timeout: 1201s
2018-09-14 02:20:02 socket sending timeout: 1201s
2018-09-14 02:20:02 socket sending timeout: 1201s
2018-09-14 02:20:02 socket sending timeout: 1201s
which then causes the following messages to appear when nix-cli/rpc calls are made:
2018-09-14 09:30:01 WARNING: request rejected because http work queue depth exceeded, it can be increased with the -rpcworkqueue= setting
2018-09-14 09:30:01 WARNING: request rejected because http work queue depth exceeded, it can be increased with the -rpcworkqueue= setting
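To see how far apart the two kinds of message are in a given debug.log, a small sketch like the one below can print when each message first appears and how often it occurs. The function names `first_seen` and `count_seen` are hypothetical, and the sketch assumes every log line starts with a date and a time field, as in the excerpts above.

```shell
#!/bin/bash
# Print the date and time of the first log line containing a fixed string.
first_seen() {
    local pattern="$1" logfile="$2"
    # -F: match literally, -m 1: stop after the first matching line.
    grep -m 1 -F -- "$pattern" "$logfile" | awk '{print $1, $2}'
}

# Print how many log lines contain a fixed string.
count_seen() {
    local pattern="$1" logfile="$2"
    grep -c -F -- "$pattern" "$logfile"
}

# Example use (the path to debug.log is an assumption):
#   first_seen 'socket sending timeout' ~/.nix/debug.log
#   first_seen 'work queue depth exceeded' ~/.nix/debug.log
```

In the excerpts quoted here, the socket timeouts appear at 02:20:02 and the work queue warnings at 09:30:01 on the same day.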
The rpcworkqueue messages in debug.log are misleading: increasing this parameter in nix.conf doesn't resolve the problem. The problem is actually a locking issue in nixd. This problem has been seen before with Fixed Trade Coin and also Syscoin. Both fixed it by addressing the locking problem. The fix can be found on Syscoin's GitHub, in commits from around May 5. The following commits contain the relevant fix (this exact same behaviour was noted with both Fixed Trade Coin and Syscoin and reported to them through their Discord/Slack channels):
https://github.com/syscoin/syscoin/commit/dbe0afd572d8a71e3333b4a9d019a9af8877d0e5
https://github.com/syscoin/syscoin/commit/6f1e10f355617ca9d5d027c038ee1e8221351e26
Attached debug.log.
Platform: Ubuntu 16.04 x86_64. Compiled as per Nix Platform instructions.
debug.log