ericaltendorf / plotman

Chia plotting manager
Apache License 2.0

Allow Randomization of Archive Drive Destinations #856

Open MikeKroell opened 3 years ago

MikeKroell commented 3 years ago

When using multiple plotters, Plotman consistently picks the same archive drive on all of them. Transfers and writes slow down dramatically because the drive ends up handling interleaved, non-sequential writes instead of sequential ones. This happens even with just two plotters.

Now I am not a coder, but here is how I modified the code to make this work for me. Note that I removed some of the config checks to do this. I'm not sure exactly what other effects that has, but it is working for me:

archive.py

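    # Needs `import random` at the top of archive.py.
    if len(available) > 0:
        # randrange(n) picks from 0..n-1, so any available drive can be
        # selected, including when there is only one.
        index = random.randrange(len(available))
        # Original code:
        # index = min(arch_cfg.index, len(available) - 1)
        (archdir, freespace) = sorted(available)[index]
        print(f'Selected {archdir} for archive destination')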

This way, most of the time the plotters will be using separate drives. The more drives with available space, the more spread out the choices are. They will occasionally hit the same drive, but unless we write some kind of temp file for Plotman to read as a marker, I'm not sure how to make multiple plotters aware of each other's writes, or how to tell a write in progress from one that never completed.
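To put a rough number on "most of the time", here is a small sketch. It assumes each plotter picks a drive independently and uniformly at random, and that all their transfers overlap in time. Neither assumption comes from Plotman itself, and `prob_all_distinct` is a made-up helper name, not part of the project.

```python
from math import perm  # Python 3.8+


def prob_all_distinct(drives: int, plotters: int) -> float:
    """Chance that `plotters` independent, uniform random picks among
    `drives` all land on different drives (hypothetical helper)."""
    if plotters > drives:
        # More plotters than drives: at least two must share a drive.
        return 0.0
    # Ordered ways to pick distinct drives, divided by all possible picks.
    return perm(drives, plotters) / drives ** plotters


print(prob_all_distinct(4, 2))   # 0.75
print(prob_all_distinct(10, 2))  # 0.9
```

Under those assumptions, two plotters sharing four drives collide about a quarter of the time, and the odds improve as more drives have free space.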

altendky commented 3 years ago

Why not just use the index feature as is and specify a different number for each plotter? Though there is a change related to that in https://github.com/ericaltendorf/plotman/pull/855 based on discussion in https://github.com/ericaltendorf/plotman/issues/847. Also, in cases of "extreme" archiving needs you may be interested in https://github.com/rjsears/chia_plot_manager.

MikeKroell commented 3 years ago

I must have missed the index feature, where was that?

altendky commented 3 years ago

Hidden somewhere in the code probably... yeah, our documentation needs work, sorry.

archiving:
  index: 0

Just set it to a different number on each plotter, counting up from 0. This doesn't cover all cases and might become fancier, as you suggested, by checking for rsync tmp files or similar, but it does do a useful thing. When you get down to fewer drives than you have plotters then yeah, there's a hazard of collisions.
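As a rough illustration of how the index setting behaves, here is a standalone sketch modelled on the `min(arch_cfg.index, len(available) - 1)` line quoted earlier in this thread. The function name `pick_by_index`, the drive paths, and the free-space numbers are made up for the example. This is not the actual Plotman code path.

```python
def pick_by_index(available, index):
    """Pick an archive dir from (archdir, freespace) tuples by index.

    Hypothetical helper: the index is clamped to the last entry, so it
    never runs past the end of the list of available drives.
    """
    if not available:
        return None
    clamped = min(index, len(available) - 1)
    archdir, _freespace = sorted(available)[clamped]
    return archdir


# Three drives, three plotters with index 0, 1, 2: each gets its own drive.
three = [("/mnt/b", 500), ("/mnt/a", 800), ("/mnt/c", 300)]
print([pick_by_index(three, i) for i in range(3)])
# ['/mnt/a', '/mnt/b', '/mnt/c']

# Only two drives left: the plotters with index 1 and 2 now collide.
two = [("/mnt/a", 800), ("/mnt/b", 500)]
print([pick_by_index(two, i) for i in range(3)])
# ['/mnt/a', '/mnt/b', '/mnt/b']
```

The clamping is what produces the collision once there are fewer drives than plotters.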