MrPig91 / PSChiaPlotter

A repo for powershell module that helps Chia Plotting
MIT License

Two queues starting in parallel, when only 1 should start. #190

Open HeyManGonzo opened 3 years ago

HeyManGonzo commented 3 years ago

Hi @MrPig91, thank you for developing PSChiaPlotter!

During plotting, it has happened more than once that two runs start at the exact same time, although this shouldn't happen with the configuration I made. I then have to kill one of those "twin processes". Sometimes this results in 3 queues of the same job running in phase 1, for example. I have attached two screenshots demonstrating the issue.

Is this a bug, or am I doing something wrong?

Thanks, HeyManGonzo

Screenshot 2021-07-22 at 10 24 30

Screenshot 2021-07-22 at 10 47 11

Screenshot 2021-07-22 at 12 28 16

MrPig91 commented 3 years ago

This could be a bug, but I am having a hard time determining exactly what is happening from your description and screenshots. I do appreciate the screenshots, as they help quite a bit when looking at these types of things. I can see that there are 3 runs in phase 1, but 2 are from one job and 1 is from another. The phase 1 limiter only applies to runs connected to that job. Are you seeing one job have 3 runs in phase 1 when you have it set to a limit of 2?
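To illustrate the per-job rule described above, here is a minimal Python sketch. It is not PSCP's actual code (PSCP is a PowerShell module), and the names `phase1_counts`, `can_start`, `JobA`, and `JobB` are hypothetical. It only shows that a limit of 2 is checked against each job's own phase 1 runs, so 3 runs in phase 1 overall can still be within the limits.

```python
from collections import Counter

def phase1_counts(runs):
    """Count phase 1 runs per job. `runs` is a list of (job_name, phase) pairs."""
    return Counter(job for job, phase in runs if phase == 1)

def can_start(runs, job, limit):
    """A job may start another run only if its OWN phase 1 count is below its limit."""
    return phase1_counts(runs)[job] < limit

# Hypothetical snapshot: 3 runs in phase 1 overall, but spread over two jobs.
runs = [("JobA", 1), ("JobA", 1), ("JobB", 1), ("JobA", 2)]

print(phase1_counts(runs))          # JobA has 2 runs in phase 1, JobB has 1
print(can_start(runs, "JobA", 2))   # False: JobA already has 2 runs in phase 1
print(can_start(runs, "JobB", 2))   # True: JobB has only 1 run in phase 1
```

With this rule, a single job showing 3 runs in phase 1 under a limit of 2 would indicate a real problem, while 3 runs spread across two jobs would not.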

HeyManGonzo commented 3 years ago

Yes, as shown in the bottom screenshot: there are 3 runs in phase 1 of the "...Kingston" job. There should only be 2, as the limiter is set to 2. When this occurs (intermittently), the runs have always started at almost the same time. Might it be that when the script starts a new run, a second run is started before the script realises that 2 runs are already in phase 1 of that job? The top screenshot shows that the "twin runs" are already in phase 2 and that afterwards other runs have started in a normal way.
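The suspected cause is a classic check-then-act race. Below is a minimal Python sketch of it, assuming a hypothetical `Job` class with a phase 1 counter; it is not taken from PSCP, and the method names are made up for illustration. The racy version makes two start decisions from the same stale count, and the second version does the check and the registration as one step under a lock.

```python
import threading

class Job:
    """Hypothetical job with a phase 1 limit (names are illustrative only)."""

    def __init__(self, phase1_limit):
        self.phase1_limit = phase1_limit
        self.phase1_runs = 0
        self._lock = threading.Lock()

    def has_free_slot(self):
        # Check only: says nothing about runs that are about to be registered.
        return self.phase1_runs < self.phase1_limit

    def register_run(self):
        self.phase1_runs += 1

    def try_start_run(self):
        # Check and register as a single step, so no second decision
        # can be made from a count that is already out of date.
        with self._lock:
            if self.phase1_runs < self.phase1_limit:
                self.phase1_runs += 1
                return True
            return False

def racy_demo():
    job = Job(phase1_limit=2)
    job.register_run()                 # one run is already in phase 1
    decision_a = job.has_free_slot()   # sees 1 of 2, so a slot looks free
    decision_b = job.has_free_slot()   # still sees 1 of 2: stale view
    if decision_a:
        job.register_run()
    if decision_b:
        job.register_run()
    return job.phase1_runs             # limit of 2 is exceeded

def atomic_demo():
    job = Job(phase1_limit=2)
    job.register_run()
    started = [job.try_start_run(), job.try_start_run()]
    return job.phase1_runs, started    # second attempt is refused

print(racy_demo())    # 3
print(atomic_demo())  # (2, [True, False])
```

The interleaving in `racy_demo` is written out sequentially so the outcome is deterministic; in a real scheduler the same ordering would only happen occasionally, which would match the intermittent "twin runs".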

Jacek-ghub commented 3 years ago

@HeyManGonzo

Maybe you could give MadMax plotter a shot? The official chia.exe plotter is rather inefficient, so we need to run several instances in parallel, which is what causes headaches like the one you are seeing. The MM plotter, on the other hand, can utilize all available cores in just one instance. It also means that the strain on your SSDs will be higher, as one plot should finish in less than one hour.

@MrPig91

Would it be possible to not specify the number of replots, and have PSCP take care of the whole folder? I would imagine that people have basically the same layout across all drives, as such maybe one could specify just a bunch of drives, and provide just one set of folders (e.g., e:, f:, ... and \og + \pool), and leave PSCP humming for days.

HeyManGonzo commented 3 years ago

(off-topic) @Jacek-ghub: I've plotted several hundred plots with MM and like the plotter, but it's not ideal for my setup as I also farm on the PC. With parallel plotting I can achieve ±23 k32's/day. With MM I can do 24-25 plots per day, but MM maxes out my 8-core/16-thread i7-10700; as a result, the proof response search times climb and I get missed challenges. Since luck is involved with farming Chia, I don't want to lower my chances of finding a block, so I am now parallel plotting. Re: NVMe, I've plotted ±1000 plots on my oldest 2TB m.2 and am currently at 71% health. If I can stretch it for another 1000 plots, I am more than happy.

Jacek-ghub commented 3 years ago

@HeyManGonzo I think that we are all missing a good writeup about how to use each plotter, etc. I saw several on YT, but none was complete, so I am trying to absorb what is there and experiment locally. Actually, I have an i9-10900 as a plotter, and with chia.exe I was getting 30-32 plots/day, but with MM it is 40-42 plots/day.

I guess, the main difference between MM and chia.exe is that since MM can take over all your cores, it means that all the disk access is heavily condensed. Therefore, if you are pushing your system to the max (e.g., giving 8+ threads on your 10700), you would need to use two temp folders (where chia.exe can still run on a single one). I guess, that would help MM to outperform your chia.exe. I tried to bunch two NVMes into RAID0 (using disk manager), but it ended up with the same speed as a single NVMe.

Also, to control MM, you can give it just 6 cores, or less. Remember, your CPU has only 8 physical cores, so anything above 8 is kind of incremental gain (picking up free time, when the other processes were waiting for disk access), so it will strain your SSDs in the same way. This way, you can leave more CPU time for your harvester.

On the other hand, with chia v1.2.0+, the harvester disk access code was degraded, so your harvester potentially cannot handle the same number of plots/disks as it did before. Having the plotter on the same machine doesn't really help. I have a mobile CPU that runs as a full node and a harvester, but I had to remove a couple of drives from that box, as timeouts were getting worse with the v1.2.0 update.

So, my take is that whatever was working before v1.2.0 is not really capable of pulling the same weight right now. Yeah, trying to control the plotter is one way to go, but the problem is really with the harvester code.

By the way, PSCP has a superb heat maps module that I would suggest you run all the time against your harvesters. It can give you a heads-up when things change or when you overload your system.

imClement commented 3 years ago

Hello,

I confirm the same behavior of PSCP in my case (4 machines), plotting with MM (no re-plotting).

MrPig91 commented 3 years ago

@HeyManGonzo Okay, I see it now! Sorry I didn't see it earlier; I think I missed the last picture the first time I was looking. Yes, you are exactly right: PSCP fails to see that a new run has started. Honestly, I should probably rewrite how PSCP starts new chia processes for a number of reasons, including this one. I am sorry that this is happening. I thought I had fixed it, but it appears it is still happening.