Chia-Network / chiapos

Chia Proof of Space library
Apache License 2.0

Copying tmp file to the final dir where a move/rename would suffice #221

Closed mjsr closed 3 years ago

mjsr commented 3 years ago

Description

There appears to be an issue with how the plot is moved to the final directory. In the logs below, the condition tmp_2_filename.parent_path() == final_filename.parent_path() evaluated to false, which led to almost an hour spent copying a file that could simply have been moved.

My understanding is that this works across operating systems (specifically interop between Windows and *nix based OSes).
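For illustration, the condition quoted above can be reproduced in isolation. This is a minimal sketch, not the chiapos code itself: same_parent_dir is a hypothetical helper, and the paths in the usage note are shortened stand-ins for the ones in the logs. fs::path equality is a lexical comparison that never touches the filesystem, so two sibling directories compare unequal even when they sit on the same device.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper mirroring the condition quoted in the description.
// The comparison is purely lexical: it does not check mount points or devices.
bool same_parent_dir(const fs::path& a, const fs::path& b)
{
    return a.parent_path() == b.parent_path();
}
```

With this helper, same_parent_dir("/home/user/tmp/plot.2.tmp", "/home/user/plots/plot.plot") is false, which matches the copy seen in the logs, while two files inside /home/user/plots compare as true.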

Current behavior

Final file copied from /home/[redacted]/tmp to /home/[redacted]/plots.

Expected behavior

It is unclear whether this needs per-OS code, but I suspect that on *nix-based OSes we could always move the file (maybe not, I'm unsure). The higher-effort alternative would be to determine whether the two paths belong to the same mount point/device. For Windows, my understanding is that we could compare the first token of fs::path (not certain) to see whether the drive letters match.
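One way to sidestep both the per-OS code and the mount-point comparison is to attempt the rename first and copy only when it fails. The following is a minimal sketch under that assumption, using C++17 std::filesystem; move_or_copy and MoveResult are hypothetical names, not part of chiapos. On POSIX systems a rename across filesystems fails with EXDEV rather than moving any data, so the failure itself answers the same-device question. Behaviour across Windows drive letters is expected to be similar but is not verified here.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical result type for this sketch.
enum class MoveResult { Renamed, Copied, Failed };

// Hypothetical helper: try a cheap rename, fall back to copy + remove.
MoveResult move_or_copy(const fs::path& src, const fs::path& dst)
{
    std::error_code ec;

    // Within one filesystem a rename only rewrites directory entries.
    fs::rename(src, dst, ec);
    if (!ec) {
        return MoveResult::Renamed;
    }

    // Any rename failure falls through to a copy. A stricter version could
    // fall back only when ec == std::errc::cross_device_link.
    fs::copy_file(src, dst, fs::copy_options::overwrite_existing, ec);
    if (ec) {
        return MoveResult::Failed;  // the source file is left in place
    }

    // The source is removed only after the copy has succeeded (best effort).
    fs::remove(src, ec);
    return MoveResult::Copied;
}
```

Because the source is deleted only after a successful copy, a failed transfer still leaves the original file where it was.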

Logs:

plot-hdd.log-Copied final file from "/home/[redacted]/tmp/plot-k32-2021-04-27-04-24-[redacted].plot.2.tmp" to "/home/[redacted]/plots/plot-k32-2021-04-27-04-24-[redacted].plot.2.tmp"
plot-hdd.log-Copy time = 3150.589 seconds. CPU (7.240%) Tue Apr 27 17:23:05 2021
plot-hdd.log-Removed temp2 file "/home/[redacted]/tmp/plot-k32-2021-04-27-04-24-[redacted].plot.2.tmp"? 1
plot-hdd.log:Renamed final file from "/home/[redacted]/plots/plot-k32-2021-04-27-04-24-[redacted].plot.2.tmp" to "/home/pedro/plots/plot-k32-2021-04-27-04-24-[redacted].plot"

meawoppl commented 3 years ago

I think a lot of people use SSDs for computation, then spinning disks for final storage. IMHO the choice above makes sense...

room101-dev commented 3 years ago

Yes, it's best to make a permanent copy of the new plot on HDD. Let's say you do the rename/move and it fails: now you've just lost your plot. On the other hand, if you copy and hit a problem like 'lack of target space', you still have your original. Given that it takes 6-8 hours to make a plot, it's not worth giving up that back-up insurance to save 10 minutes of copy time.

mjsr commented 3 years ago

The main point of my initial post is that a move would make more sense than a copy.

Personally, my destination directory is another SSD which cuts the transfer in phase 5 (?) to less than half. When you make more than 30 plots a day, that's significant time saved. Still, YMMV.

room101-dev commented 3 years ago

Here's how it's done:

-t is an NVMe drive; it should be 1TB so you can do 4 plots at a time.
-2 is a burned-out NVMe drive retired from the -t stage, or a normal SATA SSD; note that it should be > 512GB for the safety of your 4 contiguous plots.
-d is an HDD (magnetic drive). Note that SSDs are not safe or permanent; they are temp drives (-t and -2 above).

Why in the hell would you put a 'plot' on a SATA SSD if you wanted to keep it forever for mining? Why would you want to move 108GB around many times, which takes 10-20 min on most HDDs?

Permanent data should not be stored on these devices, as shown; they can and will die at any time. Lots of people here advocate buying used data-center-quality hardware, which by definition means it is already out of warranty and at the end of its useful life. New DC hardware is out of reach for most people as it's very expensive. I use Samsung EVO 980s, but I also have Corsairs that I plan on using once all my Samsung NVMe drives are dead. The problem is that now, everywhere in Asia, 1TB NVMe drives are almost out of stock; I had to order the Corsair NVMe drives from Amazon in the USA.

The purpose of -2 is to take load off the -t drive while plotting. The -d is really up to you. IMHO minimal copies are critical, as the slowest thing in the chain of events is that last copy, which should by definition be permanent, as in DONE.

On that subject, early on I tried using -d on network (LAN) Samba drives and it was a mess; chia-pos couldn't handle it, so now I keep everything on a local machine. As my 12TB, 14TB, 16TB and 18TB drives fill up, I pull them out and put them in i3 mining rigs with 4GB of memory, and they harvest just fine with HPOOL. I also harvest the plots with chia-net, but as you all know the ROI is zero; ever since the space went above 1EB (now it's 6.9EB) it has slowed down a lot. This happens when the price finally collapses, and the Chinese are beginning to sour on the Chia team in the USA. Remember, it was the Chinese IOUs that sent the price to $1600, and it was the Chinese who developed software harvesters and plotters that actually worked, and working pools. The Chia dev team has delivered nothing but buggy software. ...

The current system does well: -t handles the random r/w for creating the plot; -2 holds the intermediate final copy of the plot, so that -t doesn't lose storage/working space; and -d is the final resting place for the plot. Think of -d as a cemetery.

Now you can do as you wish, but you telling me the final resting place for your plot is an SSD tells me you must only have a few plots, as most in this game are using 16TB HDDs so they can hold 150 plots.

emlowe commented 3 years ago

Closing issue