MrPig91 / PSChiaPlotter

A repo for a PowerShell module that helps with Chia plotting
MIT License

[feature] Conditional Jobs #123

Open imClement opened 3 years ago

imClement commented 3 years ago

Hello,

What do you think about implementing conditions for jobs to start?

E.g., JobB will start only after JobA is completed.

I think this will be useful if you plan to plot on multiple smaller HDDs that will be moved to another system when filled up.

JobB will start plotting for HDD B only after JobA is completed (and therefore HDD A is full).

That way, we can add and remove HDDs as needed.

Is there another way to accomplish this?

Thank you,

Jacek-ghub commented 3 years ago

Hi @ClementN,

At the moment, the biggest issue with plotting (not just with PSChiaPlotter) is copy collisions when plots finish. PSCP therefore distributes the final copy across several HDs to mitigate that problem. So this is a competing approach to what you describe, and it potentially leads to better performance (if you monitor your box and swap those HDs when needed).

What you are asking for (IMO) is, instead of adding destination folders, to be able to add destination folder queues. At least that is how I would look at the problem.

The reason to use that approach (destination queues) is that any user action is prone to mistakes, so scheduling a new job at 1am usually leads to discovering your mistakes at 8am and then trying to recover from them.

As such, I would rather schedule a "forever" job and give it destination queues like "x:, y:, z:" then "p:, q:, r:", etc. This way, PSCP would start that forever job using the x: and p: drives; when they are full, it would switch to y: and q:, and so on. That would give you plenty of time to swap the x:, y:, p:, and q: HDs while PSCP is on the z: and r: drives. I would assume that PSCP would still top off those drives before switching back to x: and p: to start the process over.

With that approach, you would have one forever job, and you would depend on PSCP to do its best to stack plots up and to use RAM and threads as well as it can.
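The destination-queue idea can be sketched roughly as follows. This is only an illustration in Python, not PSChiaPlotter code: the `DestinationQueue` class, the `next_destination` method, and the `PLOT_SIZE_BYTES` figure are all assumptions made up for the example.

```python
import shutil
from typing import Callable, Optional, Sequence

PLOT_SIZE_BYTES = 109 * 10**9  # assumption: rough size of one k=32 plot


def disk_free(path: str) -> int:
    """Free bytes on the volume that holds `path`."""
    return shutil.disk_usage(path).free


class DestinationQueue:
    """Hypothetical queue of drives; plots go to the current drive until it is full."""

    def __init__(self, drives: Sequence[str],
                 free_bytes: Callable[[str], int] = disk_free) -> None:
        self.drives = list(drives)
        self.free_bytes = free_bytes
        self.index = 0  # position of the drive currently being filled

    def next_destination(self, needed: int = PLOT_SIZE_BYTES) -> Optional[str]:
        """Return the drive for the next plot, or None if every drive is full."""
        for offset in range(len(self.drives)):
            i = (self.index + offset) % len(self.drives)
            if self.free_bytes(self.drives[i]) >= needed:
                self.index = i  # stay on this drive until it fills up
                return self.drives[i]
        return None
```

A forever job would hold one such queue per set of drives, e.g. `DestinationQueue(["x:", "y:", "z:"])` and `DestinationQueue(["p:", "q:", "r:"])`, and ask each for its next destination before the final copy. Because the search wraps around, a swapped-in empty x: drive is picked up again once z: is full.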

Thank you, Jacek

Jaga-Telesin commented 3 years ago

The way I avoid collisions is to run Stablebit Drivepool on the server hosting the data drives. Coalesce the drives into one volume, then share that volume on the network. When I copy >1 plot to that volume (can happen when my workstation finishes a plot and the server finishes one itself simultaneously), Drivepool spreads the load between the drives. I think I've only seen a collision once the entire time I've been plotting.

Jacek-ghub commented 3 years ago

Hi @Jaga-Telesin,

You are one of very few IT pros here, so no wonder your solutions are really well done! Love to learn from your experience.

Maybe installing DrivePool on the plotting machine would solve @ClementN's issue? Can you add/remove drives from it while it is working?

I see collisions from time to time on my box (9 plots in parallel), but only because I am still fine-tuning that box and have an RDP connection to it all the time, and my setup hit a minor PSCP bug in how dst drives are used. I posted earlier asking for an "issues" placeholder in PSCP: if problems are not actively monitored/displayed, we don't really know what has happened and cannot improve it. I guess this is the biggest omission in PSCP for me.

Although, I kind of Mickey Mouse unbundled the copy jobs from the plots, so those collisions are not really causing too much harm for me either, but that triggered other, lesser issues.

Actually, I am waiting for my box to finish plotting, and will then add one NVMe dest folder to it and, thanks to that, forgo destination folder queueing altogether. It costs ~$70 (WD Black 500G), but solves the problem rather well (it should bring the final copy down to ~1min, which on a 30 plots/day machine will give an extra 5 hours to work on real plotting). Also, even if that is a small NVMe that can hold only 5 plots, it basically provides an hour of buffering time for those finishing plots, making those collisions non-existent. How those plots are offloaded from that NVMe (to local HDs or over the network) is rather irrelevant, as it will not interfere with plotting anymore.

Thank you, Jacek

Jaga-Telesin commented 3 years ago

> Can you add/remove drives from it while it is working?

Sure, if they are already hot-swap drives. You can add/remove drives from the Pool quite easily, it has a great UI. I believe Stablebit Drivepool even has a trial period, in case anyone wants to test it out.

I too have recently set up a "staging drive" for finished plots to be homed on, so that plotting can continue immediately without them needing to be sent over the network first. Add a RoboCopy script to move them across, and it all seems like MacGyver'd bliss.
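For anyone who wants the same staging-drive mover without RoboCopy, here is a minimal sketch in Python. It is not the script described above: the function names and the `.moving` suffix are invented for the example, and it assumes finished plots end in `.plot` while in-progress files carry some other suffix.

```python
import shutil
from pathlib import Path
from typing import Iterable, List, Optional


def pick_destination(destinations: Iterable[Path], size: int) -> Optional[Path]:
    """First destination folder whose volume has room for `size` bytes."""
    for dest in destinations:
        if shutil.disk_usage(dest).free > size:
            return dest
    return None


def move_finished_plots(staging: Path, destinations: List[Path]) -> List[Path]:
    """Move every *.plot file out of `staging`; return the new paths."""
    moved = []
    for plot in sorted(staging.glob("*.plot")):
        dest = pick_destination(destinations, plot.stat().st_size)
        if dest is None:
            break  # nothing has room; leave the remaining plots staged
        target = dest / plot.name
        # Land the file under a temporary name first, then rename it, so a
        # half-copied file never shows up with the .plot extension.
        tmp = target.with_name(target.name + ".moving")
        shutil.move(str(plot), str(tmp))
        tmp.replace(target)
        moved.append(target)
    return moved
```

Run on a timer or in a loop, something like `move_finished_plots(Path("S:/staging"), [Path("X:/plots"), Path("Y:/plots")])` would drain the staging drive while plotting carries on; those paths are placeholders.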

Jacek-ghub commented 3 years ago

@ClementN - that is your solution then. A free trial to test it, and $30 per box - less expensive than the coding time put into PSCP. Plus, as you are using USB drives, those are hot-plug already (kind of).

@Jaga-Telesin, if you look at the MadMax binaries, they include a "mover" module. It just shows that chia.exe is not that well designed. Still, PSCP could/should implement such staging internally, so it would work for all.

By the way, I haven't used RoboCopy for quite some time. Great software, though.

imClement commented 3 years ago

@Jacek-ghub , @Jaga-Telesin Thank you very much!

The idea of having destination folder queues instead of conditional jobs makes sense. I was thinking that conditional jobs (one job starts automatically when another job is completed) would be easier to implement.

However, Stablebit Drivepool looks like a good solution at this stage.

imClement commented 3 years ago

Now, with the ability and (probably) the necessity of plotting for different NFTs (different pools), conditional jobs might be useful: they would let us plan different plotting sessions in advance by setting up different jobs that run one after another.
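To make the request concrete, the behaviour I have in mind could look something like the sketch below. It is written in Python only for illustration and says nothing about how PSChiaPlotter is built: `Job`, `after`, and `run_in_order` are made-up names, and each job's `run` is assumed to block until that plotting session has finished.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Job:
    name: str
    run: Callable[[], None]  # hypothetical: starts the session and blocks until done
    after: List[str] = field(default_factory=list)  # jobs that must finish first


def run_in_order(jobs: List[Job]) -> List[str]:
    """Run each job once every job named in its `after` list has completed."""
    pending: Dict[str, Job] = {job.name: job for job in jobs}
    completed: List[str] = []
    while pending:
        ready = [j for j in pending.values()
                 if all(dep in completed for dep in j.after)]
        if not ready:
            # Remaining jobs wait on each other or on a job that does not exist.
            raise ValueError(f"unsatisfiable conditions: {sorted(pending)}")
        for job in ready:
            job.run()
            completed.append(job.name)
            del pending[job.name]
    return completed
```

With `Job("JobB", plot_pool_b, after=["JobA"])` and `Job("JobA", plot_pool_a)`, JobB would start only once JobA has returned, whatever order the two were added in; `plot_pool_a` and `plot_pool_b` are placeholders for whatever starts each session.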