madMAx43v3r / chia-plotter

Apache License 2.0

Default final destination should also toggle when -G option is used #584

Closed. Qwinn1 closed this issue 3 years ago

Qwinn1 commented 3 years ago

First, thank you for this fantastic plotter! It improved my plot times by almost 50%, even over the optimized pechy version of chiapos!

Now, my issue/request: When you do chia-plotter --help, it says: "-d, --finaldir arg Final directory (default = tmpdir)", but if you don't use the -d option, leaving it to default to tmpdir, and use the -G toggle to alternate tmpdir and tmpdir2 with each run, the final destination does not stay consistent with the currently running tmpdir, instead it stays as whatever the original tmpdir was.

The reason this matters is that I'm running tmpdir on one NVMe drive and tmpdir2 on a separate NVMe drive for better throughput. I toggle the tmpdirs so that the wear on the NVMe drives is more even. But when the final copy takes place, I want the destination to be on the same device the plot was produced on (the current tmpdir), so that no actual copy needs to take place and it's just a rename. Instead, the first, third, and fifth runs stay on the same drive with an instant rename, while the second, fourth, and sixth runs are forced to copy from the device that was tmpdir2 on the odd-numbered runs back to the original tmpdir device, adding a couple of minutes and an unnecessary 100 GB write to every other plot. I have cron jobs set up to properly archive from both tmpdir and tmpdir2 when plots are found there, and everything would work beautifully if plots created while tmpdir2 was acting as the tmpdir stayed where they are for the final copy. As it is, the completed plots are always forced to wind up in the original tmpdir.
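Whether that final step is an instant rename or a full copy comes down to whether the source and destination directories sit on the same filesystem. A minimal sketch of that check follows. It assumes GNU coreutils `stat`; the function name `same_device` and the example paths are hypothetical and not part of chia-plotter.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of chia-plotter): succeed when two existing
# paths live on the same filesystem, in which case moving a file between
# them is a rename rather than a byte-for-byte copy.
# Assumes GNU coreutils stat (the -c option); BSD/macOS stat differs.
same_device() {
    local dev_a dev_b
    dev_a=$(stat -c %d "$1") || return 2   # device number of the first path
    dev_b=$(stat -c %d "$2") || return 2   # device number of the second path
    [ "$dev_a" = "$dev_b" ]
}

# Example with placeholder paths:
# same_device /mnt/nvme1 /mnt/nvme2 && echo "rename" || echo "full copy"
```

The function returns 0 for the same device, 1 for different devices, and 2 if either path cannot be inspected.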

Thanks for any help!

number435398 commented 3 years ago

I think you should merge this with my issue or at least support the issue I already raised at https://github.com/madMAx43v3r/chia-plotter/issues/547. Or at least this partially relates to my issue.

cyperbg commented 3 years ago

Hi, do you mind sharing your crontab script/settings for manually moving the plots from tmpdir to final disk?

Cheers

SebMoore commented 3 years ago

@number435398 Your issue is similar, but not the same: yours is applicable to anyone who uses any combination of temp disks and just wants multiple dest dirs. This issue specifically relates to people who use two SSDs as their temp drives. This issue requires modification of an existing option (the drive swap option). Your issue requires a totally new option. It's good that we've got these both as separate issues :)

number435398 commented 3 years ago

Having an option that would alternate between destinations (as I'm advocating) would produce the same result, since it would alternate between the destinations at the same time it was alternating between tempdirs.

SebMoore commented 3 years ago

Correct. However, this is only useful for people that have the ability to swap between tempdirs. Many people here are using an SSD for tmp1 and a ramdisk for tmp2. Therefore, they can't swap between tempdirs every plot because it would ruin their times. Therefore, what they need is an option independent of the tempdir swap option, that allows for multiple final directories. This issue relates specifically to tempdir swap functionality, while your issue is more general: yours covers the general idea of a "multiple final dir" option, independent of the tempdir swap option. That's why these issues are different, and why it's beneficial to keep them separate.

Qwinn1 commented 3 years ago

I agree with VertiHydro. In addition to his points, number435398, your request would require adding yet another parameter to the command line arguments, while mine would not. That could involve significantly greater effort for Max to implement and test. Mine could be implemented by adding a single line of code at the beginning of each iteration, right after the code that implements the -G toggle, that sets the destination dir to the current tmpdir. If -G wasn't set, it would have no effect, because the final dir would always be set to the original tmpdir, just like now.

That's one line of code that I argue logically should be there, at least to make the behavior conform with the current documentation of -d and -G. At minimum, the current documentation is ambiguous about the effect of using -G with the default -d, and I argue mine satisfies a widespread practical need and is more logically consistent anyway. (Upon reading the current documentation, I thought the final destination WOULD also toggle; it just seemed the more intuitive reading to me, and I was surprised when it didn't.)

Yours probably would require quite a few more lines of code, plus testing, and Max might very well decide he's willing to put in his valuable time for the quick and easy one and not the harder one. I hope you get your functionality too, sure, and it WOULD also solve my problem if he wants to go that extra mile, but I'm not ready to have my simpler request packaged with yours as both or neither.
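The requested behavior can be illustrated with a small simulation. This is only a sketch of the logic described in this thread, not chia-plotter's source code: the function `plan_runs`, its arguments, and the paths are hypothetical, and it assumes the swap happens before every run after the first and that the default destination applies only when no -d value is given.

```shell
#!/usr/bin/env bash
# Illustrative simulation only; this is not chia-plotter's source code.
# plan_runs COUNT TMPDIR TMPDIR2 [FINALDIR]
# Prints "run tmpdir finaldir" for each run.
plan_runs() {
    local count="$1" tmpdir="$2" tmpdir2="$3" finaldir="${4:-}"
    local run=1 swap final
    while [ "$run" -le "$count" ]; do
        if [ "$run" -gt 1 ]; then
            # The -G toggle as described above: exchange tmpdir and tmpdir2.
            swap="$tmpdir"; tmpdir="$tmpdir2"; tmpdir2="$swap"
        fi
        # The requested one-line change: when no final dir was given,
        # default to the tmpdir used by this run, not the first run's.
        final="${finaldir:-$tmpdir}"
        printf '%s %s %s\n' "$run" "$tmpdir" "$final"
        run=$((run + 1))
    done
}

plan_runs 3 /mnt/nvme1 /mnt/nvme2
```

With the placeholder paths above, runs 1 and 3 report /mnt/nvme1 for both columns and run 2 reports /mnt/nvme2 for both, so every final step stays on the drive that produced the plot. Passing a fourth argument pins the destination, which leaves an explicit -d unaffected.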

Qwinn1 commented 3 years ago

This issue just became massively more important to me. On the even-numbered runs, the plotter creates a copy of the same plot in the two plotter directories without renaming it to something other than .plot, and my cron jobs scan those two folders to archive, so I have been getting multiple copies of the same plots on my HDDs since yesterday. Implementing my one-line fix as requested would solve this problem. In the meantime, I'll have to come up with some new archival method that won't create duplicate plots, or turn off the -G toggle.

number435398 commented 3 years ago

Could you manually start two different terminal windows, scatter start them, and have them run "simultaneously"? But each one with a different set of folders? It seems like that might work for your particular issue.

Qwinn1 commented 3 years ago

Thanks for the suggestion, but not really interested in running plots in parallel in madmax - I'm getting consistent 30 minute completion times sequential and that's plenty good enough for me without introducing the complexities of parallel plots.

I just basically have to specify a -d parameter to get the plots in a staging directory for now when using the -G toggle so I don't get two copies of the same .plot existing in two directories at once, even if only during the length of the copy.

Max, I would also suggest (as I believe I saw someone else do in another thread) doing what the plotman plotter does: renaming a plot from .plot to some other extension during the copy period and renaming it back to .plot upon completion of the move, so plots being moved don't look like two distinct complete plots to the filesystem and, worse, to the farmer during that period. The farmer has a habit of finding staging directories on its own and adding them to its config.yaml, I've noticed.
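The copy-then-rename idea suggested here can be sketched as a small shell function. This is an assumption-laden illustration, not how chia-plotter or plotman actually implement it: the function name `move_plot` and the ".tmp" suffix are hypothetical choices.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move a finished plot to another directory without
# ever exposing a partial file under a name ending in ".plot".
# move_plot SRC_FILE DEST_DIR
move_plot() {
    local src="$1" dest_dir="$2" name partial
    name=$(basename "$src")
    partial="$dest_dir/$name.tmp"        # e.g. foo.plot.tmp while copying
    # Copy under the temporary name; clean up the partial file on failure.
    cp "$src" "$partial" || { rm -f "$partial"; return 1; }
    # Same directory, so this is a rename and the full file appears at once.
    mv "$partial" "$dest_dir/$name" || return 1
    # Only now remove the original.
    rm -f "$src"
}
```

Because the last step into the destination is a rename within one directory, anything scanning for names ending in ".plot" sees either no file or the complete one.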

Qwinn1 commented 3 years ago

Crap. Even that won't work, I don't think. As long as the file has a .plot extension and the filename is indistinguishable from a normal plot while it's being copied/moved from one device to another by the plotter, and thus exists in two places at once (and possibly gets renamed/moved by an archiving system during that copy/move), I don't think there's any way to create a proper archival procedure. A rename during the madmax plotter's copy procedure to something other than *.plot for the duration of the copy/move seems maximally critical to me. Someone else has to have reported this by now, so I'll try to find an existing issue for it and add my support there.

Hmmm.... the only kludgy workaround I can come up with is to puzzle out and add to all my cron jobs some way to check that the file size is greater than 108 GB before attempting to initiate an archival. But even that could possibly fail, in the unlikely event the copy/move was somewhere between 108 GB and the final file size when the archive job kicked off.
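A rough sketch of such a size check is below. It assumes GNU coreutils `stat` and takes 108 GB to mean 108 * 10^9 bytes, which is a guess at the intended unit; the names `MIN_PLOT_BYTES` and `is_large_enough` are hypothetical. As noted above, it can still pass while a copy is somewhere between the threshold and the final size.

```shell
#!/usr/bin/env bash
# Minimum size in bytes before a file is treated as a finished plot.
# 108 GB is read here as 108 * 10^9 bytes (an assumption).
MIN_PLOT_BYTES=108000000000

# is_large_enough FILE [MIN_BYTES]
# Returns 0 if FILE is at least MIN_BYTES long, 1 if it is smaller,
# and 2 if its size cannot be read.
# Assumes GNU coreutils stat (the -c option).
is_large_enough() {
    local file="$1" min="${2:-$MIN_PLOT_BYTES}" size
    size=$(stat -c %s "$file") || return 2
    [ "$size" -ge "$min" ]
}

# Example inside a cron script, with a placeholder path:
# is_large_enough /mnt/nvme1/example.plot && echo "ok to archive"
```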

My cron jobs also check to make sure that there isn't already a "mv" process running, so that my cron jobs don't interfere with each other, but apparently the plotter's method of copying a plot to its final destination doesn't generate a "mv" process, so that check doesn't work.

number435398 commented 3 years ago

I use "mv *.plot destination" (quotation marks added for the website) and it works fine. That's from the folder where the final plot is first called "plot.temp", though, so I don't know if that'd help. I do have mine re-execute "mv" after a 30-minute delay once the previous "mv" finishes, so by the time it gets to a plot file that was a "temp" file at the time of execution (after copying 4-5 other plots), that file is no longer a "temp" file. It works just fine for me for that purpose.

Qwinn1 commented 3 years ago

Perhaps my problems are mostly self-inflicted by trying to archive directly from tmpdir (by not specifying a -d). I'm working the problem.

number435398 commented 3 years ago

Yeah, I'd specify a -d. Only reason I want multiple -d options is because I'm creating plots faster than some of my destination folders can be written to without backing up the next plot creation/copying.

Qwinn1 commented 3 years ago

Okay, I think I got it working (meaning, archiving without duplicating plots) without having to specify a -d and archiving straight from my tmpdirs. I just had to add steps to my cron jobs specifically for not getting confused by the other files in those directories where part of the filename is ".plot" (for example, ".plot.tmp").
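One way such a filter might look is sketched below; it is not the actual cron script from this thread. The function name `list_finished_plots` is hypothetical. It matches on the end of the filename, so names that merely contain ".plot" are skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only regular files whose names END in ".plot",
# skipping in-progress names such as "something.plot.tmp".
# list_finished_plots DIR
list_finished_plots() {
    local dir="$1" f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip directories and empty globs
        case "$f" in
            *.plot) printf '%s\n' "$f" ;;   # suffix match: finished plot
            *)      ;;                       # anything else is ignored
        esac
    done
}
```

A substring match such as `grep .plot` would also catch ".plot.tmp" files, which is the kind of confusion described above; the suffix match avoids it.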

So, never mind my other complaints and panic - I was able to address those issues. I'm back to just hoping for my original one-line modification/fix request, of having the final destination also toggle to tmpdir2 when -G is set, at least when no -d option was specified and the default final destination should be tmpdir as specified in the documentation.

madMAx43v3r commented 3 years ago

#612 should fix it

if it works we'll merge it

Qwinn1 commented 3 years ago

Thank you Max! Hugely appreciated.

If I'm expected to test it (and I'm happy to), could someone clue me in on how to implement #612 on my system? I'm still a bit of an Ubuntu/GitHub noob, and my software installs are still pretty much cut and pastes from guides, so I would appreciate learning how to handle that particular aspect.

madMAx43v3r commented 3 years ago

To test the branch:

cd chia-plotter
# to get latest branches (if not already done so)
git fetch
# switch branch
git checkout finaldir-toggle-fix
# make sure you are on latest commit in branch (at first you will be)
git pull
# re-compile
./make_devel.sh

To switch back to master:

cd chia-plotter
git checkout master
./make_devel.sh

It gets more complicated when submodules are updated or added, but that's not the case here.

Qwinn1 commented 3 years ago

Tested for several hours, works absolutely perfectly sir. Thank you! Both for the quick fix and the git education. Much appreciated.

madMAx43v3r commented 3 years ago

Awesome