Closed platinum4 closed 4 years ago
I didn't know anyone was still mining hodl. Thanks for reporting. I'll fix it in the next release.
Please report any other issues with hodl. If you're willing to do some verification testing maybe I can clean it up a bit.
Sure, let me know please. Thanks again for your hard work.
The "BUG" is in the way share results are reported. This may account for the missing share stats so you might want to wait for the next release to report other issues.
The invalid share is a concern, do they happen often?
Not sure. The miner closes after that first share; I had to add a pause line to the .bat file to catch it before it closed the window.
It happens every 1st share though so it doesn't mine effectively at all.
You can open a new issue for that or we can track it here. Is there any error message on exit? A segfault? or just a silent exit?
Use the command line directly to get better error info.
Silent exit. Hang on, how can I enable logging for you on this? I can't see it in the readme or --help.
I suggest when you identify a problem you go back to older releases to compare. Finding the release that broke it is 90% of solving the problem. There hasn't been much activity in the hodl code so you should go back by major releases, then focus in on the minor release.
Copy/paste is a problem with the command line but if you use the command line directly you can capture a screenshot of the output.
Just to confirm the problem, do you always get one good share then an invalid share then exit?
You can also add -D to the command line to enable debug output.
Confirm yes v3.11.9 does what you are asking
I'm trying v3.9.5.3 right now
I hadn't noticed the multiple unconfirmed submits before. That could be a server or networking issue.
You should note the older releases will have different output. The focus should be on the share results and the silent exit, not the messages. Silent exits are the worst kind because there are no clues.
Yeah I know, v3.9.5.3 does not exit, look below
cpuminer-avx2 -a hodl -o stratum+tcp://hodl.optiminer.pl:5555 -u HFX3N3DZKMjsJsRTwfAgq4iqoCSNh12Qfb -p x -D
Looks like 2 separate problems (3 with the "BUG").
Invalid shares since before v3.9.5.3. (suggest testing 3.8.8.1)
Silent exit after v3.9.5.3.
Both are show stoppers. The replies are slow but appear to arrive eventually, you just don't see them when the program crashes before they arrive.
It's an interesting pattern: one burst of 6 shares submitted with the first one being valid and the others invalid, repeat. This pattern doesn't look like anything the miner would do.
The miner just does the same thing over and over again, when it gets a result it thinks is valid it submits it to the server, the server verifies it and sends a reply.
I think the server is doing something funny. It could explain the burst of shares, the delayed replies and the result pattern. I can't think of anything the miner could do to produce these symptoms.
You could also test with hodlminer and wolf-hodleminer to confirm it's a server issue.
v3.8.8.1 works
v3.9.4 works
You can ignore much of my previous post. You don't need to test with another miner if you find a working version of cpuminer-opt.
I still don't understand how the miner could produce the pattern but when the exact release is identified I'll have somewhere to look.
v3.9.5.2 appears to not work; look at how it is receiving jobs
job -1
seems to be the issue I would be guessing.
Same job -1
on v3.9.5.1
job -1
v3.9.5
v3.9.4 is the last version that worked with -a hodl
if you get a chance to check this out that'd be great dude.
Even the miner loader has mentioned this has been a problem for a bit
I don't understand your last post, what's a miner loader?
Thanks for the good work, I now have a lead I can follow up.
Powershell script for the miner in a multiminer, anyhow, I think it happened during restructuring from v3.9.4 to v3.9.5
I've reproduced the problem exactly on Linux except for the silent exit; Linux keeps hashing. V3.9.5 was a big release with big changes to hodl and core code.
Thanks buddy, I'm sure once you get it fixed there when it's compiled to Windows it should be fine.
I think I understand the crash, the jobid is too long and causes a buffer overflow when submitting a share.
I also have a fix for the "BUG" . It now shows some share stats but not all.
But I still get the burst of submits followed by one accepted and the rest rejected. That is a very strange pattern that will require some time to think about what could cause it.
The goal is to get the shares working and live with whatever stats issue remain as they are specific to hodl and likely due to the hodl stratum not providing the data.
If I don't have a solution for the rejects soon I will release what I have (I was getting ready for a release when you reported the issue) and then look deeper into hodl. You would be able to confirm the crash was fixed on Windows, if nothing else.
The excessive messages are a problem but a low priority.
Sure, I am willing to test whatever if you need it, not sure if you have a version history between 3.9.5 and 3.9.4
Oh yes I have the version history, 4 snapshots between 3.9.4 and 3.9.5 but there's nothing obvious that changed. Hodl code didn't really change just a small administrative change. It appears hodl was a victim of changes to code that's used by all algos but only Hodl broke.
With some stats now working I can see what appears to be submitting the same share over and over again, the share diff is exactly the same for all shares in the burst.
This is going to be difficult.
I have seen the same share submitted by multiple threads and the same share submitted multiple times by the same thread. Both should be impossible.
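One way to catch this kind of duplicate during debugging is to remember the last few submitted nonces and flag any repeat before it goes out. This is only a minimal sketch, not cpuminer-opt code: `share_history`, `record_share`, `RECENT_SHARES` and the 32-bit nonce are all assumptions made for illustration, and a real miner would need a lock around the history because several threads submit shares.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical history depth, chosen only for this sketch. */
#define RECENT_SHARES 16

/* Small ring buffer of recently submitted nonces (hypothetical type). */
typedef struct {
    uint32_t nonces[RECENT_SHARES];
    size_t   count;  /* number of valid entries, at most RECENT_SHARES */
    size_t   next;   /* slot the next new nonce will overwrite */
} share_history;

/*
 * Returns true if the nonce has not been seen recently and records it.
 * Returns false for a duplicate, leaving the history unchanged.
 * Not thread safe: callers would need to hold a mutex.
 */
static bool record_share(share_history *h, uint32_t nonce)
{
    for (size_t i = 0; i < h->count; i++) {
        if (h->nonces[i] == nonce)
            return false;  /* same share submitted again */
    }
    h->nonces[h->next] = nonce;
    h->next = (h->next + 1) % RECENT_SHARES;
    if (h->count < RECENT_SHARES)
        h->count++;
    return true;
}
```

Start from a zero-initialised `share_history`, call `record_share` just before each submit, and log the thread id whenever it returns false. That would show whether the repeats come from one thread or from several.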
The only change to hodl code was an interface change that required the same code change for every other algo as well with no ill effects.
The major change in v3.9.5 was the introduction of statistics. The stats are gathered in mining functions but should not interfere with mining in any way. Again, it affects all algos and only Hodl broke.
The crash is believed to have been caused by a job id that was longer than the buffer. This was the result of tracking job ids as part of the stats feature. So it is possible for the stats code to indirectly affect mining.
At this point my only lead is to follow up with the job id to see if there are other places it could overflow a buffer. This would corrupt data and result in unexplainable behaviour and we certainly have unexplainable behaviour. It would also explain why hodl broke with no significant changes to its code. The excessively long job id may be the nexus.
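For illustration, here is a minimal sketch of a length-checked copy that would refuse an oversized job id instead of writing past the end of the buffer. It is not the actual cpuminer-opt code: `copy_job_id` and `JOB_ID_BUF_LEN` are hypothetical names, and the buffer size is an arbitrary assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer size, chosen for illustration only. */
#define JOB_ID_BUF_LEN 32

/*
 * Copy a job id into a fixed-size buffer without overflowing it.
 * Returns true if the whole id (plus terminator) fit.
 * Returns false otherwise, leaving dst as an empty string so a
 * truncated or stale id is never used by mistake.
 */
static bool copy_job_id(char *dst, size_t dst_len, const char *src)
{
    if (dst == NULL || dst_len == 0)
        return false;
    dst[0] = '\0';
    if (src == NULL)
        return false;

    size_t n = strlen(src);
    if (n >= dst_len)
        return false;          /* id too long for this buffer */

    memcpy(dst, src, n + 1);   /* n + 1 copies the terminator too */
    return true;
}
```

Checking the return value at every place a job id is stored would turn a silent memory corruption into a visible error that can be logged.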
I'm going to go ahead with the next release and pick up this issue afterwards when I can focus a little better.
In the upcoming release you should (hopefully) expect the following:
The fix for the rejects will hopefully be in the following release.
You're good dude, as you mentioned, I'm probably the only person on earth trying to mine this at the moment.
LOL. I've considered just saying use the old version, and it may eventually come to that but I'm not ready to give up yet.
No problem maybe a fresh set of eyes after the sun has moved around might help
Have you tried v3.12.0 yet?
Did it crash?
Yes you can see it exit to DOS prompt due to a crash after that first attempt at a share.
It's better if you don't use a bat file for testing, especially when debugging a crash or silent exit.
So, it still crashes. That means I fixed a different crash. This is a step backward.
I'm going to have to take a different approach. I'll ignore the crash for now and focus on the changes in 3.9.5 that broke it. I'll try removing some of that new code to see if I can fix the rejects without breaking anything else. If I can identify what code broke it I can figure out why.
Fixed it!
It was a stupid error in hodl code, I used the type instead of the variable name in a function call. I have no idea why it compiled but that bug explains why every thread was trying to submit the same hash.
The stats also work. The only remaining issue I see is the repeated job logs. That is for another day. I'll be releasing the reject fix soon.
There is also the delayed replies but that's not a miner issue.
The repeated job logs may not be fixed. The problem is unique to hodl but the solution would affect performance of all algos. The fix is to compare the job ids before displaying the log but that involves an expensive string comparison in code used by all algos.
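The comparison described above could look roughly like the sketch below. It is an assumption-laden illustration rather than the miner's code: `should_log_job`, `last_logged_job_id` and `LAST_JOB_ID_LEN` are hypothetical names, and the single static buffer is not thread safe.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical size for the remembered job id. */
#define LAST_JOB_ID_LEN 128

/* Job id of the most recently logged job (empty until the first log). */
static char last_logged_job_id[LAST_JOB_ID_LEN] = "";

/*
 * Returns true when job_id differs from the last one logged, and
 * remembers it. Returns false for a repeat, so the caller can skip
 * the log line. An id longer than the buffer is stored truncated and
 * will therefore always compare as new.
 */
static bool should_log_job(const char *job_id)
{
    if (job_id == NULL)
        return false;
    if (strcmp(last_logged_job_id, job_id) == 0)
        return false;  /* same job as last time, stay quiet */

    strncpy(last_logged_job_id, job_id, LAST_JOB_ID_LEN - 1);
    last_logged_job_id[LAST_JOB_ID_LEN - 1] = '\0';
    return true;
}
```

The `strcmp` runs once per incoming job rather than once per hash, so whether its cost matters depends on how often the pool sends jobs.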
Isn't it fun to hunt down a genuine problem every now and again, thanks so much for your effort dude!
But I'm pissed the compiler didn't catch it. The arg is supposed to be a variable of the type, not the type itself. It should have been a compile error.
I agree. Based on the coding I did in seventh grade, what you are saying makes sense: if it should not have been passable, I don't see how it compiled.
cpuminer-opt-3.12.0.1 is released. Please test and report any problems.
Looking good my dude, seems to be accepting shares as normal. Thanks again!
Hi, please see the bug below:
i7-4790K 32GB RAM