rjsears / chia_plot_manager

Python scripts to manage Chia plots and drive space, providing full reports. Also monitors the number of chia coins you have. Auto Drive helps to automate the addition of new hard drives to your system and to the chia config.
MIT License

Question about speed that doesn't make sense #66

Closed thomascooper closed 1 year ago

thomascooper commented 3 years ago

This is more than likely a research assistance request. When I have run out of space on any plotting directory (not temp), or when a plot is transferring from temp to plotting, my ncat speed drops significantly, from 110M/S to sometimes < 10M/S. This seems like an issue that shouldn't exist given the hardware layout. Here is an example of my NAS and Plotter:

NAS: 24 CPU, 100 GB RAM, LSI 9205-8e connected to 3x NetApp DS4243, 72 drives

Plotter: 24 CPU, 100 GB RAM
Plot Dir (/mnt/hdd/hdd0): 2 TB WD Blue SSD
Temp Dir (/mnt/nvme_drive0): RAID 0, 2x 1 TB Samsung 970

I am using plotman with MadMax. Any time a plot is in 4:1 (transferring from temp to plot dir), I see the drop in speed. This is even worse if the temp dir fills up and 1 or 2 plots sit in 4:1 for an extended period of time.

I am trying to identify why this is happening. It didn't seem to be an issue before I moved from .92 to .95, but I can't be sure.

I am looking for some help figuring this out. I have a plotter with a smallish plot dir, and if it happens to fill up, transfers off the plotter start taking 10x as long. The entire group of plotters then ends up waiting in line and getting backed up.

rjsears commented 3 years ago

So the only time I have seen this is when I am attempting to copy (then delete) a plot from my final directory. I use madmax with 110G in memory for each of the 4 plots that I am creating on my plotter. I use two NVMe drives as my -t and -d drives under madmax. I then run plot_manager.py against my -d drive to move those plots off to my harvesters.

When madmax is NOT moving a plot from -t to -d, I can transfer a plot in about 4-5 minutes (assuming that I have all four CPUs (96 cores) maxed at 100%). If MadMax is moving a plot to -d while I am also attempting to move a plot off the plotter (happens a lot), then that number easily doubles to 10 minutes.
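One way to check whether the -d drive itself is the bottleneck is to time a plain sequential read from it, once while the plotter is idle and once while a plot is being moved onto it. The sketch below is only an illustration using the Python standard library; it is not part of plot_manager.py, and the function name `read_throughput` and the plot filename in the usage comment are made up for this example. Keep in mind that a recently written file may be served from the page cache, which would make the drive look faster than it is.

```python
import time


def read_throughput(path, chunk_size=4 * 1024 * 1024, max_bytes=None):
    """Read `path` sequentially and return (bytes_read, seconds, MB/s).

    Hypothetical helper for illustration only. 1 MB = 10**6 bytes.
    If `max_bytes` is given, reading stops after that many bytes.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 opens the file unbuffered at the Python level; the OS
    # page cache can still serve data that was read or written recently.
    with open(path, "rb", buffering=0) as f:
        while max_bytes is None or total < max_bytes:
            want = chunk_size
            if max_bytes is not None:
                want = min(chunk_size, max_bytes - total)
            chunk = f.read(want)
            if not chunk:  # end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / elapsed / 1e6 if elapsed > 0 else float("inf")
    return total, elapsed, rate


# Example usage (the filename is hypothetical; point it at a real plot):
# total, secs, rate = read_throughput("/mnt/hdd/hdd0/example.plot",
#                                     max_bytes=8 * 1024**3)
# print(f"{total} bytes in {secs:.1f}s = {rate:.0f} MB/s")
```

If the idle number is far above the network rate and the number taken during a move is close to the degraded ncat rate, the drive is the likely culprit rather than ncat.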

I have not clocked the actual transfer rate, but given the load on the overall system I am still quite happy with 10 minutes. With stripped-down rsync against rsyncd, this is around 20 to 25 minutes.
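For a rough comparison between these per-plot times and the rates quoted in the question, the arithmetic can be sketched in a few lines of Python. The plot size is an assumption (a k32 plot of roughly 101.4 GiB), the function names are made up for this example, and "M/S" in the question is read here as MB/s, so treat the results as ballpark figures only.

```python
# Assumption: a k32 plot is roughly 101.4 GiB. Adjust for your own plots.
K32_PLOT_BYTES = int(101.4 * 1024**3)


def transfer_rate_mb_s(size_bytes, minutes):
    """Average rate in MB/s (1 MB = 10**6 bytes) for a transfer of
    `size_bytes` that took `minutes`."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return size_bytes / (minutes * 60) / 1e6


def minutes_at_rate(size_bytes, rate_mb_s):
    """Minutes needed to move `size_bytes` at `rate_mb_s` MB/s."""
    if rate_mb_s <= 0:
        raise ValueError("rate_mb_s must be positive")
    return size_bytes / (rate_mb_s * 1e6) / 60


if __name__ == "__main__":
    # Per-plot times mentioned in this thread, converted to average rates.
    for minutes in (4, 5, 10, 20, 25):
        rate = transfer_rate_mb_s(K32_PLOT_BYTES, minutes)
        print(f"{minutes:>2} min per plot is about {rate:.0f} MB/s")
    # Rates mentioned in the question, converted to per-plot times.
    for rate in (110, 10):
        mins = minutes_at_rate(K32_PLOT_BYTES, rate)
        print(f"{rate:>3} MB/s is about {mins:.0f} min per plot")
```

Under that size assumption, 10 minutes per plot works out to roughly 180 MB/s, while a transfer that has dropped to 10 MB/s would need about three hours per plot.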

I did make a change from nc to ncat, but only after a lot of testing showed about a 10% increase in speed across all of my systems. ncat is simply the Nmap project's reimplementation of nc. If you want, you can roll back to nc and see if the problem resolves itself. I personally have four farms running ncat without seeing this particular issue, and I don't know of anything else I changed that might cause that kind of problem.