MrPig91 / PSChiaPlotter

A repo for a PowerShell module that helps with Chia plotting
MIT License

[Enhancement] - "Plot While Copy" Feature Added #173

Closed kenfookchoong closed 3 years ago

kenfookchoong commented 3 years ago

Let MadMax keep starting a new plot while the previous one is copying.

pasztig commented 3 years ago

Hi, first of all, awesome job, and thank you for making this app. I love PSChiaPlotter with the replot feature; it is a life saver. Yes, it seems the option "-w, --waitforcopy Wait for copy to start next plot" is applied by default for the MadMax plotter. Is there a way to use it without this -w option? I would like to start the next plot immediately without waiting for the copy to finish. It can improve the time per plot by around 10-15 minutes. If there is a way to add a tick box somewhere on the GUI for the MadMax plotter, that would be great. Thanks

MadCoderOne332 commented 3 years ago

Agreed, and thanks for the application. Will definitely donate.

MrPig91 commented 3 years ago

Hey everyone, So the -w flag is not actually being used in PSChiaPlotter, however it is creating plots 1 by 1 instead of -n $TotalPlotCount. So this is why it copies the files over before starting a new one. In order to get stats for the log file to get the progress I have to create these plots one by one. I am thinking that I will add a "Mover" option that will automatically copy final plots to their final destination. The only thing you would need to change is have your final destination as your temp location so that it just renames the plot files instead of copying them, from there the "Mover" option will take care of copying the plot to its actual final destination. This will effectively be the same thing and could be used with the OG plotter as well. I will start working on it tonight after work, but I cannot guarantee when it will be finished or if my plan in my head will work.
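The "Mover" idea described above can be sketched roughly as follows. This is a minimal illustration in Python rather than PSChiaPlotter's actual PowerShell code; the function name `move_finished_plots` and its parameters are hypothetical, and it assumes finished plots are recognisable by a `.plot` suffix in the staging (temp) folder.

```python
import shutil
from pathlib import Path


def move_finished_plots(staging_dir, final_dir, suffix=".plot"):
    """Move every finished plot file from staging_dir to final_dir.

    Hypothetical helper: returns the destination paths in sorted order.
    """
    staging = Path(staging_dir)
    final = Path(final_dir)
    final.mkdir(parents=True, exist_ok=True)

    moved = []
    # Only names ending in the suffix match, so files that are still being
    # written under another extension are left alone.
    for plot in sorted(staging.glob("*" + suffix)):
        target = final / plot.name
        # shutil.move renames when it can and falls back to copy + delete
        # when source and target are on different volumes.
        shutil.move(str(plot), str(target))
        moved.append(target)
    return moved
```

Because the plotter only has to rename the file inside the staging folder, a new plotting process can start right away while a helper like this does the slow transfer.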

MadCoderOne332 commented 3 years ago

SUPER!!!!!!!!!!!

pasztig commented 3 years ago

Great, thanks for the quick response. Will this work with the replot feature as well?

MrPig91 commented 3 years ago

Thanks for bringing that up. I will make sure it works with the replot feature, since that feature would not work with the Mover option without modification.

Jacek-ghub commented 3 years ago

Uhm, I have a problem with that "Mover" solution. Don't get me wrong, I do have my own batch file mover, but I always run MM with n>1 (without PSCP). However, as I expressed before, I am all for having PSCP run MM with n>1 (redesigning the logging). To me, that is the deal breaker right now for using or recommending PSCP.

In my case (Intel i9-10900, non-K version, also undervolted), plotting times are around 31-33 minutes per plot. My final destination is one more NVMe drive, so MM takes about a minute or two to copy the final plot there, and my mover takes over from that point. There are plenty of people who are running completely from RAM and/or have far more than the 10 physical cores my CPU has, and the problem is much more pronounced for them.

However, when running MM with n=1, that one/two minutes of copying (whether to yet another NVMe or to the same NVMe that temp1 points to), at the end of the day makes about an hour of just copying time (for me, potentially two/three hours for more powerful boxes), i.e., not plotting. Thus, you can potentially have one/two more plots per day with my setup (i.e., less wasted power).
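As a rough check of that estimate, here is a small back-of-the-envelope calculation. The 32-minute plot time and 1.5-minute copy time are midpoints of the ranges quoted above, and the function names are only illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def plots_per_day(plot_minutes, copy_minutes=0.0):
    """Plots finished in 24 hours when each plot is followed by a blocking copy."""
    return MINUTES_PER_DAY / (plot_minutes + copy_minutes)


def copy_minutes_per_day(plot_minutes, copy_minutes):
    """Total minutes per day spent copying instead of plotting."""
    return plots_per_day(plot_minutes, copy_minutes) * copy_minutes


serial = plots_per_day(32, 1.5)   # the copy blocks the next plot (n=1)
overlapped = plots_per_day(32)    # the copy runs alongside the next plot
print(round(serial, 1), round(overlapped, 1), round(copy_minutes_per_day(32, 1.5)))
# prints: 43.0 45.0 64
```

With these assumed numbers, the blocking copy costs a little over an hour a day, which is roughly two plots.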

So, I understand that this is a good temporary solution for those who point to an HD as their final destination (as that is a disaster setup), but that is not the best that PSCP can do.

Also, if this "temporary" solution is to be implemented, I think that it can be done completely behind the scenes. PSCP should be smart enough to understand that the final destination is not an SSD/NVMe and, in that case, internally set -d equal to -t and do the moving quietly. This way, no end-user interaction would be needed to take advantage of what is potentially a big speedup in how PSCP handles MM.

By the way, I don't think -w is the default value, rather to the contrary. That is only one of three options that doesn't have an argument, as such it only kicks in when it is specified. Also, I see in my logs that the final copy is running while the next plot is already crunching data, and I am not touching this param.

MrPig91 commented 3 years ago

I am a tad bit confused because it would be effectively the same thing as what MM does in the end. If the tmp directory equals the final directory then MM should just rename the file from .tmp.plot to .plot. The same second this happens, the PSChiaPlotter mover would start to move it to the destination folder you would normally use as the final destination (in your case your NVMe) and PSChiaPlotter will start a new MM process. No time is lost in comparison to using the MM method with N > 1.

You are right -w is not the default action, but creating plots 1 by 1 has the same behavior as -w.

Jacek-ghub commented 3 years ago

I am not sure whether MM is using the system "move" or implements its own move (a manual copy, byte after byte). If they use the system move, then it works as you described, and all my comments are just garbage rambling. I will run a test in a few hours to check it, and will let you know. I have to say that I would love it to work the way you said, though (one less NVMe write, about 20%, when temp2 is a RAM drive).

Yeah, when n=1, there is no second plotter starting, so in my book that -w option is moot. But, I can go with your interpretation :)

pasztig commented 3 years ago

I think PSCP is quite complicated (OG plotting, MadMax plotting, logs, queues, etc.), so I think the only viable and "easy" solution would be to implement the mover as a "separate" addition. It could then be used in whatever way the user requires. I think the mover should at least have the following features:

- replot feature (the original way you have implemented it is ok)
- use a .TEMP extension, then rename to .plot (to avoid an incorrect plot file report from the chia GUI)
- adjustable time for rechecking for new plots (like robocopy does)

I am not sure what else is needed; for me these are the most important.

I am not sure if I can queue MadMax plotter jobs; I think it's not possible. For example: start one job for hdd1, then queue another job for hdd2, and so on. Once the hdd1 job has finished, it starts the hdd2 job, etc. Do you plan to add that feature? With that, PSCP would be the ultimate set-it-and-forget-it chia plotter solution.
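Two of the requested mover features, the temporary extension and the adjustable recheck interval, could look something like the sketch below. It is only an illustration in Python, not PSChiaPlotter code; `copy_then_rename`, `watch_for_plots`, and all of their parameters are hypothetical names.

```python
import shutil
import time
from pathlib import Path


def copy_then_rename(plot, final_dir, temp_suffix=".TEMP"):
    """Copy a plot under a temporary name, then rename it to its real name."""
    plot = Path(plot)
    final_dir = Path(final_dir)
    final_dir.mkdir(parents=True, exist_ok=True)
    target = final_dir / plot.name
    partial = final_dir / (plot.name + temp_suffix)
    shutil.copyfile(plot, partial)  # slow part: not yet visible as *.plot
    partial.replace(target)         # rename within the destination folder
    plot.unlink()                   # free the staging space
    return target


def watch_for_plots(staging_dir, final_dir, poll_seconds=60, max_cycles=None):
    """Recheck staging_dir every poll_seconds and transfer any new plots.

    max_cycles=None polls forever; a number limits the passes.
    """
    moved = []
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for plot in sorted(Path(staging_dir).glob("*.plot")):
            moved.append(copy_then_rename(plot, final_dir))
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            time.sleep(poll_seconds)
    return moved
```

A harvester that scans the destination folder only ever sees complete `.plot` files, because the partial copy keeps the temporary extension until the rename.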

Jacek-ghub commented 3 years ago

@pasztig

Your point about PSCP being quite complicated is valid (IMO), and @MrPig91 is doing his best to simplify it. Therefore, it has already been asked to bury the mover in the plumbing, so the end users would not know it is there.

I guess, it may be easier to explain what you would like to do in general, than trying to address one issue at a time, as there may be already ways to accommodate your needs.

As far as queueing those MM jobs, PSCP has an option to provide multiple destinations. If you do that, it will assign each new job to a different destination (kind of). I think it addresses your problem rather well.
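The thread does not spell out how PSCP picks a destination for each job, so the following is only an assumed round-robin rotation that matches the "different destination for each new job" description; `assign_destinations` is a hypothetical name.

```python
from itertools import cycle


def assign_destinations(job_count, destinations):
    """Hand out destinations to jobs in rotation (assumed strategy)."""
    if not destinations:
        raise ValueError("at least one destination is required")
    rotation = cycle(destinations)
    return [next(rotation) for _ in range(job_count)]


print(assign_destinations(5, ["hdd1", "hdd2"]))
# prints: ['hdd1', 'hdd2', 'hdd1', 'hdd2', 'hdd1']
```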

Jacek-ghub commented 3 years ago

@MrPig91

You were right!!! I am really happy to report that MM is doing a move, not a copy, to the final drive. I guess I looked at the log output and assumed that what they wrote there (copying) is how they implemented it (also copying), and never bothered to check it out.

So, checking again what you initially wrote, I would just keep the UI the way you have it right now, and just change the plumbing to use the tmp1 value as the destination as well, and kick off the mover to finish the job to the destination folders provided through the Advanced/Basic UI. No change in the end-user behavior, but a big performance improvement.

I have to say that I cannot wait to see it implemented!

Again, sorry for my rambling up there. I am really happy that you checked me on that one.

MrPig91 commented 3 years ago

So I think I kind of met in the middle: the "plot while copying" feature is simple to use but can still be turned off. I have added a CheckBox called "Plot While Copy" which will allow a new plotting process to start while the previous one copies. I would still like to add a "Mover" feature that could be used separately for those who still move their plots around after their final destination, but given that the plot while copy feature is in very high demand I decided to focus on that and make it simple for the end user. I could have added it in the plumbing as @Jacek-ghub suggested, where PSChiaPlotter does it automatically, but it could have unintended consequences for those who are tight on space since you are leaving 100 GB on the tmp drive as it is being copied over.

I have not pushed the update out yet since I am still doing tests to make sure it works with many configurations, including having no 2nd temp drive selected, using advanced plotting vs. basic plotting, using MM and OG plotting, and with the replot feature enabled.

MrPig91 commented 3 years ago

Does MadMax copy from the tmp2 or the tmp1 directory? I think tmp2, right? I wish MM would allow you to plot different K-sizes so I could test with K25.

Jaga-Telesin commented 3 years ago

Temp2 is heavy for plotting, Temp1 is for assembly (only 25% of total writes). Final is assembled on Temp1, then copy happens to final Dir.

Edited, since I had the labels reversed. :)

MrPig91 commented 3 years ago

Perfect, just wanted to make 100% sure. I can test the OG plotter really fast with K25 plots, but MM takes around 1 hr 30 min for me to test. Thank you for verifying that information for me.

Jacek-ghub commented 3 years ago

I think that the final plot ends up in Temp1; at least that is what I see with MM. I also think that Temp2 takes about 75% of the writes (therefore it can be RAM backed). Also, Temp2 uses only 110GB, so it will not have room for some temp files plus the final plot. Although, I know that @Jaga-Telesin has corrected me a few times already, so I am not that sure about it anymore.

So, we are at 50% sure for now ;)

Jaga-Telesin commented 3 years ago

You're right, I had them reversed. Goes to show you what happens when you stop plotting for a couple of weeks. ;) But the logic is the same, just switch the labels.

MrPig91 commented 3 years ago

Haha dang so it is reversed from what the OG plotter does then. Okay, I need to update this feature when using MM vs OG plotter. Thank you guys!

Jaga-Telesin commented 3 years ago

Max should have done it in order, but for some reason the primary plot drive is #2, and the secondary (optional one) is #1. But you've got a handle on it now. :D

Jacek-ghub commented 3 years ago

@MrPig91

So, what is your target audience? Is it mostly people like those who leave comments on this forum and are a bit confused, looking for solutions (a lot of them, me included)? Or rather highly skilled pros with petabytes of plots (maybe a handful of people)? I think it is the first group that needs a plotting solution that takes just a few clicks, as the second group will easily understand what is needed and how to throw HW at some issues.

With that in mind.

I have added a CheckBox called "Plot While Copy"

That checkbox will be checked by default, is that right? (Yes, I am also from that confused group and prefer an idiot proof solution.)

I would still like to add a "Mover" feature

Whatever is not integrated into PSCP is just some icing on the cake. I would wait with that to get the PSCP process ironed out first.

unintended consequences for those who are tight on space since you are leaving 100 GB

First, MM is not chia.exe, so I think most of us are running just one instance at a time. That said, Temp1 requires ~250GB of free space, and it is not that heavily used at the beginning of a plotting process (still, moving can take a long time, though). With that in mind, PSCP should be able to check the free space and let the user know that, if more space were free on the Temp1 folder, the speed gains could be significant (so the user can back off and make adjustments). Also, that 250GB requirement makes the minimum SSD size 500GB, and that should cover the issue. Also, even though MM is requesting 250GB for Temp1 and 110GB for Temp2, the combined peak is only 250GB, so again that 500GB drive will do to hold both temp folders plus the extra plot. Actually, MM when running with -n>1 is doing the copying while starting the next plotting process, which implies that PSCP can do exactly the same with no worries about the disk space.
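The free-space check suggested here could be sketched as below. The 250 GB and 100 GB constants are the approximate figures quoted in this thread, not values read from MM or PSCP, and the function names are hypothetical.

```python
import shutil

GB = 1000 ** 3
TEMP1_PEAK_GB = 250     # Temp1 requirement quoted in the thread
FINISHED_PLOT_GB = 100  # approximate plot left on the drive while it copies


def can_plot_while_copy(free_bytes, pending_plots=1):
    """True if a new plot fits next to finished plots still waiting to copy."""
    needed = (TEMP1_PEAK_GB + pending_plots * FINISHED_PLOT_GB) * GB
    return free_bytes >= needed


def check_temp_drive(path, pending_plots=1):
    """Apply the same check to the drive that holds the given path."""
    return can_plot_while_copy(shutil.disk_usage(path).free, pending_plots)
```

Under these assumptions a drive with 500 GB free passes the check with one plot pending, while one with only 300 GB free would trigger a warning to the user.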

Yeah, I am not using chia.exe to plot as MM is really superior to it, so cannot comment on that one.

MrPig91 commented 3 years ago

My target audience was the first group you mentioned in the beginning, and for the most part still is. I wanted to create a plot manager that people could use outside of the official chia GUI, but for people not comfortable with the command line or PowerShell. My main goal was to create a plot manager that wasn't the most efficient, but where the average Windows user would be able to improve their plotting times with very little effort compared to plotting in the chia GUI. This is why I had so many "Safety" features in the beginning, so that you could mess up on settings but PSCP would "catch" those mistakes in a way by preventing another plotting process from starting unless there was enough space. After the first release I got quite a few requests to have those "Safety" features disabled, so I added checkboxes that would turn most of them off. I feel like this made PSCP more confusing for the end user since there are so many checkboxes that do not have clear definitions of what they do. I really should write up a user guide (something I will probably not have time for anytime soon).

Obviously a lot has changed since the introduction of MM which changed how everyone plotted overnight and made it where most average users could plot very efficiently without worrying about parallel plotting or other parameters like RAM. I honestly was surprised many people wanted to continue to use PSCP after MM came out, but people seemed interested so I integrated it into the PSCP. I really don't know how many people are using MM vs the OG plotter any more. I use both just for fun.

Yeah, I completely agree that the temp size calculations need to be redone. I really felt like that was a large part of PSCP when it first came out, but it was not perfected enough to work properly. I really need to rework that part of PSCP.

To be completely honest though, I know now that PSCP will never be what I fully envisioned when I first set out to make it, mainly due to me underestimating how much time it would take up and the number of features that would be requested. Also, plotting will become a dying part of chia once everyone is plotted, and netspace will discourage newcomers from joining, so expanding upon PSCP feels less rewarding since eventually fewer and fewer people will use it once they finish plotting.

Also, I start to get bored with projects and want to start new ones, and PSCP is by far the project I have worked on the most (both in time and effort). The next projects I want to work on are PowerShell modules for the FlexPool API and the SpacePool API. Those should be relatively easy though.

Jacek-ghub commented 3 years ago

Again, you did really a great job with PSCP. Being a single person on the project and working just part time, you basically KOed the official UI. And this is again, not just PSCP, but also all those supporting features (e.g., I use those heat maps all the time, and thanks to them reported a potential bug with the official v1.2.1/2 releases - lookup timeouts got increased a lot).

Yeah, no one thought that MM would deliver such a blow to the OG plotter. As you mentioned, it simplified a few things here and there, but exactly as you stated, people want a nice GUI with just a few clicks. So, maybe you should focus more on the MM side, and deemphasize OG plotter support (as that is where all those complications are). I would not mind seeing a startup message encouraging people to switch to the MM plotter to get better results; some people are new to chia, and the official software is all that they know.

Maybe a bit more activity on MM github space would be beneficial (e.g., just ask that question about test plots there), to let more people know about PSCP. Maybe you could ping Max, and work with him on the upcoming features. I think that he would appreciate a nice tool that makes his plotter shine.

Don't get discouraged with this project. This is a rather normal part of software development: the real core takes about 10-20% of the work, but the tedious part (dealing with all kinds of error conditions) takes the rest. Just stay focused on what can be really helpful for PSCP (have those people from the first group in mind), as otherwise it is just a mental diarrhea chasing all the requests people have.

MrPig91 commented 3 years ago

I really appreciate those kind words, and I am happy with how far PSCP has come along with the rest of the tools and extremely pleased with how many people use them despite how many plot managers and tools are out there.

MM is definitely the way forward, and I do agree that new updates to PSCP should focus on that plotter rather than the OG plotter.

The last I read from MM about different KSizes is that he will not implement them until the min KSize changes from k32, since it is apparently a lot of work to update.

Yeah, you are definitely right about the work flow of software development. The exciting part is the beginning when you have all the ideas still in your head and your head in the sky. Ironing everything out in the real world is the part that you just have to keep working on and make steady progress.

MrPig91 commented 3 years ago

I decided to release the newest version with the "Plot While Copy" feature added. I also found a bug when overwriting saved jobs: it looks like it overwrote the job regardless of whether you hit yes or no. I am doing another MM test since I had it backward before, but I think I fixed the code by reversing what I had.

Jacek-ghub commented 3 years ago

The last I read from MM about different KSizes is that he will not implement them until the min KSize changes from k32, since it is apparently a lot of work to update.

That is not what I meant to say ;) Yes, the question is trivial, and the answer is known. However, the main point of asking that question is to be there in his face, to have others check out your plot manager. His Issues page is quite active, so you never know what will happen.

pasztig commented 3 years ago

Morning, first of all thanks for taking the time to do this. I have just tried the new version. I have just run out of space, so I am doing replots. I have a plotter PC, and a farmer and harvester on another PC. I am using shared folders for destination drives now. I haven't had issues using a shared folder as the destination for normal plotting, but now that I am replotting I am receiving this error message: A parameter cannot be found that matches parameter name: 'LogType', and also: Failed to grab volume info. Is this a known issue? Can I use shared folders for replot?

MadCoderOne332 commented 3 years ago

XCH sent as thanks

pasztig commented 3 years ago

I have just tried again. I have mapped a full shared drive with the folders Plots and PoolPlots as W: and managed to select it as a drive with old (Plots) and new (PoolPlots) folders. It started without issues this time, so it seems to be working. We will see at the end; it takes 28 minutes to finish. I will report back later with the result.

MrPig91 commented 3 years ago

@MadCoderOne332 You are very kind thank you!

MrPig91 commented 3 years ago

@pasztig I have found the line that was causing the process to fail. I was calling the wrong function with the wrong parameters in the catch block, and because the failure happened inside a catch block, it caused the entire queue to fail. The fact that the catch block was running at all also means that an error was still happening when grabbing the volume information.
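The failure mode described here, where an error raised inside the error handler itself takes down the whole queue, can be illustrated with a Python stand-in for the PowerShell try/catch. All names (`run_queue`, `get_volume_info`, `log_error`) are hypothetical and not taken from PSChiaPlotter.

```python
def run_queue(jobs, get_volume_info, log_error):
    """Process jobs one by one without letting a broken logger stop the queue."""
    results = []
    for job in jobs:
        try:
            results.append(get_volume_info(job))
        except OSError as err:
            # If log_error itself raised here (for example because it was
            # called with a parameter it does not accept), that new exception
            # would escape the loop and stop every remaining job.
            try:
                log_error(job, err)
            except Exception:
                pass  # error reporting must never take the queue down
            results.append(None)
    return results
```

Guarding the handler keeps the queue alive, but as noted above it does not explain why the volume lookup failed in the first place.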