ericaltendorf / plotman

Chia plotting manager
Apache License 2.0

Increasing archive throughput #901

Open jayhohoho2019 opened 3 years ago

jayhohoho2019 commented 3 years ago

Discussed in https://github.com/ericaltendorf/plotman/discussions/900

Originally posted by **jayhohoho2019** August 8, 2021: Hello, which parameter, if any, sets the archive polling period? One minute, believe it or not, is getting too long for me now. Thanks.

In addition to making the archive mode polling period configurable, could we allow running multiple rsync processes in parallel? I'm referring only to the local_rsync mode at this point.

My issue is that, with a fast plotter in use, it now takes longer to rsync a plot from my dst drive (NVMe SSD) to the final archive HDD than to create a plot and save it to the dst drive, so the dst drive eventually fills up. Running two rsync processes to two archive HDDs would solve this.
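To make the bottleneck concrete, here is a rough back-of-the-envelope sketch. Every number in it (plot size, plotting time, transfer time, free space) is a hypothetical placeholder, not a measurement from this setup, and `hours_until_full` is an illustrative helper, not part of plotman.

```python
def hours_until_full(free_gib, plot_gib, plot_minutes, transfer_minutes, transfers=1):
    """Estimate how long until the dst drive fills, or None if it never does.

    All inputs are hypothetical; substitute measured values.
    """
    produced = plot_gib / plot_minutes                 # GiB/min arriving on dst
    drained = transfers * plot_gib / transfer_minutes  # GiB/min leaving dst
    net = produced - drained
    if net <= 0:
        return None  # archiving keeps up, so the drive does not fill
    return free_gib / net / 60


# One rsync: a plot every 30 min, 40 min to archive each, 1500 GiB free.
print(round(hours_until_full(1500, 100, 30, 40), 1))  # 30.0
# Two parallel rsyncs to two HDDs drain faster than plots arrive.
print(hours_until_full(1500, 100, 30, 40, transfers=2))  # None
```

With these placeholder figures a single transfer stream falls behind, while two streams keep up, which is the motivation for the request above.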

altendky commented 3 years ago

It is a thing that has been discussed but I can't say I've got any schedule for implementing it.

jayhohoho2019 commented 3 years ago

Would it be possible to run two instances of plotman archive each with its own config.yaml, and a slightly different target definition?

altendky commented 3 years ago

Yes, sorry, I didn't think to mention that. Detection is based on the site root, so mounting the drives under two different directories would do it. I don't think we actually have a configuration path override though... Definitely a missing feature. I should also have mentioned https://github.com/rjsears/chia_plot_manager. The author uses plotman to plot and their own tooling to do "higher end" archiving (plus whatever other features it has).

jayhohoho2019 commented 3 years ago

Thanks for the info. I just need to double the plotman archive throughput at this point, so I will take a look at the chia plot manager later. I am now able to run two instances of plotman archive, each working off a different tmp dir (dst drive) and archiving to a different site root.

jayhohoho2019 commented 3 years ago

Yes, it would be more convenient to allow a config file override.

jayhohoho2019 commented 3 years ago

Actually, I think there is an issue there. When one instance is running rsync, the other instance doesn't start its own. Perhaps because I changed the archive sleep time to 10s, the two instances occasionally start rsync within the same minute. I recall the archive.py code checks the transfer script name and argument list. Any suggestions?

altendky commented 3 years ago

plotman does check for the site_root or URL to be in the options of existing rsync processes. Part of the suggestion was to have the different drives under a different site_root so they wouldn't detect rsync processes from the other plotman archiving instance.
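As a rough illustration of that kind of check (not the actual archive.py code), the sketch below scans a list of processes for a matching command name and an argument that begins with the destination. The prefix comparison is an assumption based on the rest of this thread, and the function name, the `(pid, name, argv)` tuples, and all paths are hypothetical.

```python
def matching_transfers(processes, process_name, dest_prefix):
    """Return PIDs of processes that look like an archive transfer.

    processes: iterable of (pid, name, argv) tuples, a stand-in for what
    a process-listing library would report.
    """
    pids = []
    for pid, name, argv in processes:
        if name != process_name:
            continue
        # Assumed prefix comparison: an argument only has to START with
        # the destination, so "/mnt/farm" also matches "/mnt/farm1/...".
        if any(arg.startswith(dest_prefix) for arg in argv):
            pids.append(pid)
    return pids


running = [
    (101, "rsync", ["rsync", "--remove-source-files", "/buf/a.plot", "/mnt/farm1/disk0/"]),
    (102, "sshd", ["sshd"]),
]
print(matching_transfers(running, "rsync", "/mnt/farm"))   # [101]
print(matching_transfers(running, "rsync", "/mnt/store"))  # []
```

Under that assumption, an instance only ignores the other instance's rsync when the other destination does not begin with its own site_root.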

To be clear, yes, we are talking about an annoying hacky way to get to what you want (sort of). I'm not suggesting this is a good way for plotman to work.

jayhohoho2019 commented 3 years ago

That's exactly how I set this up: two config files with different site_roots (and different buffer drive paths and log directories). The inconvenience is that I had to copy the desired config file to the only location plotman looks in before starting each instance. The real problem, though, is that plotman still seems to detect the rsync process run by the other instance most of the time. The check seems to miss only when the other rsync starts within the same minute. As a result, I either see one rsync running (most of the time) or two rsync processes that started at the same hour and minute. Is the code checking for the command name (rsync in both instances) AND the site_root (which differs by one character between the instances)?

So with this setup and this problem, my archive throughput has increased, but it is nowhere close to doubled. The buffer drive Use% is still growing, although at a much slower rate than when only one instance was running.

altendky commented 3 years ago

https://github.com/ericaltendorf/plotman/blob/77f85e3b38abc45d8cb6af3bad650b458c3aef35/src/plotman/archive.py#L191-L195

What are the actual site roots? Is one just the other one plus a character? Perhaps just share both complete config files.

jayhohoho2019 commented 3 years ago

Yes. The second site_root is the first one with a 1 appended, i.e. ${site_root}1. So it's getting a partial match from startswith, I suppose.

jayhohoho2019 commented 3 years ago

Is there a function that does an exact match? Or I suppose I can change the first site_root to something else, such as appending a 2 to it.

altendky commented 3 years ago

Yeah, for now, making it so that neither starts with the other seems best. I'm sure the code could change as well.
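A quick way to sanity-check a pair of site roots before starting both instances might look like the sketch below. `roots_collide` is a hypothetical helper, not a plotman function, and the paths are made-up examples.

```python
def roots_collide(root_a, root_b):
    """True if either site root is a string prefix of the other."""
    return root_a.startswith(root_b) or root_b.startswith(root_a)


print(roots_collide("/mnt/farm", "/mnt/farm1"))   # True: "/mnt/farm1" starts with "/mnt/farm"
print(roots_collide("/mnt/farm1", "/mnt/farm2"))  # False: safe to run side by side
```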

jayhohoho2019 commented 3 years ago

Ok. After naming the two site_roots so that neither starts with the other, each plotman archive instance fires up its own rsync regardless of the other instance. To summarize, I guess a few things would be good to have:

  1. allow config file override
  2. make archive mode polling period configurable, and
  3. make string match exact in the test for transfer script dest
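For item 3, one possible approach (an assumption on my part, not how plotman currently works) is to compare whole path components instead of raw string prefixes. The sketch uses only the standard library; `is_under` and the paths are hypothetical, and remote rsync destinations or URLs would need separate handling.

```python
from pathlib import PurePosixPath


def is_under(path, root):
    """True if path equals root or lies inside it, comparing whole components."""
    path_parts = PurePosixPath(path).parts
    root_parts = PurePosixPath(root).parts
    return path_parts[:len(root_parts)] == root_parts


print(is_under("/mnt/farm/disk0/", "/mnt/farm"))   # True
print(is_under("/mnt/farm1/disk0/", "/mnt/farm"))  # False
```

Because "farm1" and "farm" are different components, a site root that merely extends another by a character no longer matches.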

FYI I'm using plotman archive only and it's been working well.