martymac / fpart

Sort files and pack them into partitions
https://www.fpart.org/
BSD 2-Clause "Simplified" License

Set OPT_TOOL_PATH in fpsync #44

Closed biocyberman closed 1 year ago

biocyberman commented 1 year ago

The variable needs to be set so that the job_queue_info_dump and job_queue_info_load functions work correctly during a resume run.

martymac commented 1 year ago

Hello,

Thanks for your pull request. Could you give me an example where resuming fails? Do you have a reproducible use case?

biocyberman commented 1 year ago

Hi. Without the patch, this is how you can reproduce the issue:

# Generate a resumable run, without setting the -T argument. For example
fpsync -p <source_path> <target_path> 
# OPT_TOOL_PATH will be empty and saved into `info` file by `job_queue_info_dump`
# Resume the run
fpsync -r <run_id> 
# job_queue_info_load checks whether OPT_TOOL_PATH is an absolute path; the check fails and fpsync dies at
# this line: https://github.com/martymac/fpart/blob/master/tools/fpsync#L808
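
To illustrate the failure mode described in the steps above, here is a minimal sketch of an absolute-path check applied to an empty value. The helper name `is_abs_path` is hypothetical and the code is not copied from fpsync; it only assumes that the check amounts to "the value must start with `/`". The error text is the one fpsync prints when the resume is refused.

```shell
#!/bin/sh
# Hypothetical helper (NOT fpsync's actual code): succeeds only when the
# argument starts with "/", i.e. looks like an absolute path.
is_abs_path () {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Simulate the value saved by a run that was started without -T
OPT_TOOL_PATH=""

if is_abs_path "${OPT_TOOL_PATH}"; then
    echo "OPT_TOOL_PATH accepted: ${OPT_TOOL_PATH}"
else
    # An empty string cannot start with "/", so the check fails
    echo "Invalid option value loaded from resumed run: OPT_TOOL_PATH"
fi
```

Under that assumption, any run saved with an empty OPT_TOOL_PATH cannot pass the check at resume time, which is why the variable needs a value before it is dumped.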

Hope this helps; otherwise I can try to create a re-runnable script.

martymac commented 1 year ago

Thanks for additional details, I've merged your patch!

(this is something that must have been overlooked in commit d2aea009)

Best regards,

Ganael.

biocyberman commented 1 year ago

My pleasure, Ganael. I am using this tool to migrate hundreds of millions of small files. Great lessons learned during this process. I will try to write it up when I am done. Thanks for developing this tool. It saves us tons of headaches.

martymac commented 1 year ago

You're welcome! And thanks for those kind words, it's always a pleasure to read that :)

Cheers,

Ganael.

gstalnaker commented 10 months ago

Gentlemen - I'm hoping you can provide me a set of steps to get past this issue. I was in the midst of migrating 4.5Tb of data from a failing HDD to a new HDD. After 12+ hours, the file transfer rate seemed to drop by a substantial amount (from the 30Mb/sec range to 2-5Mb/sec). Knowing (!!) I could resume the job, I killed fpsync so I could increase the verbosity of the output. Now I'm getting this on the resume attempt:

[~]$ fpsync -v -n 10 -f 10000 /mnt/SeagateNTFS/ /mnt/expansionHDD/
1700947631 Info: Run ID: 1700947631-185847
1700947631 ===> Analyzing filesystem...
1700957202 <=== Fpart crawling finished
.1701027313 ===> Interrupted again, killing remaining jobs
1701027314 <=== Parts done: 629/764 (82%), remaining: 135
1701027314 <=== Time elapsed: 79683s, remaining: ~17102s (~126s/job)
1701027314 <=== Fpsync interrupted.
...
[~]$ ~/bin/fpsync.sh -r 1700947631-185847 
Invalid option value loaded from resumed run: OPT_TOOL_PATH

And I cannot resume the job.

This started with the repository-installed fpsync. The last run above uses the newest fpsync code, which I downloaded from https://github.com/martymac/fpart/blob/master/tools/fpsync and put in my own local $HOME/bin folder as fpsync.sh. I've tried to manually export an environment variable thus:

[~]$ export OPT_TOOL_PATH='/usr/bin/'
[~]$ ~/bin/fpsync.sh -r 1700947631-185847
Invalid option value loaded from resumed run: OPT_TOOL_PATH

Which, as you can see, did not work. I confirm that fpart, rsync, cpio, and tar are in /usr/bin/.

I'm 18% from finishing this 4.5Tb data transfer, so there is still a lot more to go and I certainly do not want to start over!

gstalnaker commented 10 months ago

Heh! Nevermind. With the comment from biocyberman on Jan 11 I was able to search the code and eventually discover the /tmp/ folder, the job folders, and the info file. I added the path there for the variable and I've now successfully restarted the file transfer.
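
For anyone hitting the same wall, here is a rough sketch of that kind of manual fix. Everything about the file is an assumption: the helper name `set_info_var`, the `KEY="value"` line layout, and the sample option `EXAMPLE_OTHER_OPTION` are hypothetical, and the demo runs on a throwaway file rather than a real run's info file. Check the actual file under your fpsync work directory before editing it. The exported variable presumably had no effect because the value is read back from that saved file rather than from the environment.

```shell
#!/bin/sh
# Hypothetical helper: set KEY="value" in a file made of KEY="value" lines.
# The file layout is an ASSUMPTION, not fpsync's documented format.
set_info_var () {
    _file="$1"; _key="$2"; _value="$3"
    cp "${_file}" "${_file}.bak" || return 1   # keep a backup first
    # Rewrite the whole line that starts with KEY=
    sed "s|^${_key}=.*|${_key}=\"${_value}\"|" "${_file}.bak" > "${_file}"
}

# Demo on a temporary file standing in for the run's info file
info_file="$(mktemp)"
printf '%s\n' 'EXAMPLE_OTHER_OPTION="10"' 'OPT_TOOL_PATH=""' > "${info_file}"

set_info_var "${info_file}" OPT_TOOL_PATH /usr/bin
cat "${info_file}"
```

The backup copy (`.bak`) is there so a botched edit does not cost the whole resumable run.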

gstalnaker commented 10 months ago

A final comment - the fpsync code downloaded (see link above) and put in the .sh script file did NOT work on resume. I got the following output:

[/mnt/SeagateNTFS]$ ~/bin/fpsync.sh -r 1700947631-185847
/home/guyst/bin/fpsync.sh: 952: arithmetic expression: expecting primary: "             0 +          "
/home/guyst/bin/fpsync.sh: 969: arithmetic expression: expecting primary: "             0 +          "
/home/guyst/bin/fpsync.sh: 799: arithmetic expression: expecting primary: "                  +              "

I know this code is not released, but figured you might want to know this in case it's not something you've seen before. All I did was use the copy option at the top right of that linked page to copy the code, then pasted it into a text editor (SublimeText), saved it, and chmodded it so I could run it.

The resume worked with the repository-installed fpsync and is running as I type, maxing out at about 83Mb/sec.

Since I've got your attention - thanks for an excellent tool! I was a bit anxious about this failing drive and how to get 4.5Tb of data off of it in a reasonable amount of time. Your tool is the answer (without using parallels and rsync).

martymac commented 10 months ago

Hello,

Unfortunately, I cannot reproduce the problem. Have you seen commit a74f1c9f7f56f8fbaa4d1019ab70571b81490e44, which should fix that kind of error when a .meta file is not present?
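
As a generic illustration of that class of error (a sketch, not the content of the commit): an "expecting primary" message is what a dash-style /bin/sh typically reports when an arithmetic expansion ends up with an empty operand, for example because a value could not be read from a missing file. The helper name `safe_add` is hypothetical; it simply defaults empty operands to 0.

```shell
#!/bin/sh
# Hypothetical helper: add two counters, either of which may be empty.
# ${N:-0} substitutes 0 when the argument is empty or unset, so the
# arithmetic expansion never sees a dangling "+".
safe_add () {
    echo $(( ${1:-0} + ${2:-0} ))
}

missing=""                  # e.g. a value that could not be read from a file
safe_add 0 "${missing}"     # prints 0 instead of raising an arithmetic error
safe_add 5 7                # prints 12
```

Note that this default only covers empty or unset values; a value made of whitespace alone would still need trimming first.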

Also, as fpsync evolves, it may not be able to resume a run generated with a previous version of itself. Could you try to generate and resume a run with the same (new) version of fpsync?

Finally, fpart and fpsync are developed together: new versions of fpsync need new options provided by fpart. So please do not use a new version of fpsync with an old version of fpart, as that may not work well. If you want to use the repo version of fpsync, you should recompile fpart from the same sources and use the two together.

Best regards,

Ganael.