Closed loh-tar closed 6 years ago
Traditionally, a daemon is mirrored by a "client", so cpj would then be cpc.
-R is useless with -f FILE-WITH-LIST.
My proposal in #2 was that an input file list allows de-coupling the scheduler (that is, the daemon) from the copy tool (here cp).
Basically, the user specifies the program (defaults to cp), its options (defaulting to whatever you think is sensible) and arbitrary positional arguments which are passed as-is to the copy program, so cpc does not need to know where the SOURCE and DEST are positioned.
The resulting call is
PROGRAM OPTIONS ARGS
Examples:
$ cpc -p2 SOURCE DEST
# Run `cp SOURCE DEST` with priority 2.
$ cpc -program mv SOURCE DEST
# Run `mv SOURCE DEST`
$ cpc -ot DEST SOURCES...
# Run `cp -t DEST SOURCES...`
$ cpc -obt DEST SOURCE
# Run `cp -b -t DEST SOURCE`
Notice that cpc passes the args to the underlying copy program in the order it received them.
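A minimal sketch of how the resulting `PROGRAM OPTIONS ARGS` call could be assembled from the examples above. The function name `cpc_command` is hypothetical, it only prints the command instead of enqueueing it, and it assumes that each letter after `-o` becomes one separate option (as `-obt` does above); only the attached `-oLETTERS` form is handled.

```sh
# Hypothetical helper: print the command that cpc would hand to the scheduler.
cpc_command() {
    program=cp          # default copy program
    options=
    while [ $# -gt 0 ]; do
        case $1 in
            -program) program=$2; shift 2 ;;
            -p)       shift 2 ;;   # "-p PRIO": only relevant for the scheduler
            -p*)      shift ;;     # "-pPRIO"
            -o*)
                # Assumption: "-obt" expands to "-b -t", one option per letter.
                letters=${1#-o}
                while [ -n "$letters" ]; do
                    rest=${letters#?}
                    options="$options -${letters%"$rest"}"
                    letters=$rest
                done
                shift ;;
            *) break ;;            # first positional argument: stop parsing
        esac
    done
    # Positional arguments are passed on in the order they were received.
    printf '%s\n' "$program$options $*"
}
```

For instance, `cpc_command -obt DEST SOURCE` prints `cp -b -t DEST SOURCE`.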
My synopsis proposal:
cpc [-p PRIO][-program PROGRAM][-o OPTIONS] ARGS...
cpc [-p PRIO][-program PROGRAM][-o OPTIONS][-f FILELIST] DEST
cpc [-p PRIO][-program PROGRAM][-o OPTIONS] DEST
cpd COMMAND [ARGS...]
In the third form of cpc, the file list is read from stdin.
As for file names with newlines in the file list: a structured file list format such as JSON would solve the issue. jshon is one such tool that enables shell programs to parse JSON.
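To illustrate why newlines are a problem for a line-based list, here is a small sketch. It does not use JSON; it swaps in a different technique, a NUL-delimited list as produced by `find -print0`, purely for comparison. Both function names are hypothetical.

```sh
# Number of entries a line-based reader would see in the list.
count_by_newline() {
    find "$1" -type f -print | wc -l | tr -d ' '
}

# Number of entries in a NUL-delimited list: keep only the NUL bytes, count them.
count_by_nul() {
    find "$1" -type f -print0 | tr -cd '\000' | wc -c | tr -d ' '
}
```

A single file whose name contains one newline is counted as two entries by the first function and as one entry by the second.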
Thinking about it, I don't think the -t option of cp can be used, since then there is no way to distinguish between SOURCE and DEST, which is important for tracking progress.
Fix for the above suggestion: the positional ARGS must be:
FILES... DEST
i.e. folders are not allowed. We cannot allow folders, otherwise there is no way to keep track of the files recursively except by re-implementing all the GNU cp logic.
We are better off leaving this job to tools like find and using the -f option.
I'll just pick one tiny part of a sentence where I think you missed something. For the rest I need some time to think.
..so cpc does not need to know where the SOURCE and DEST are positioned.
As long as we only want to start some job you are right, but we have to know at least the DEST. You remember? :-)
Edit: Ah, you noted this in the next post. I didn't see it because the -t confused me and I skipped over it.
Yes, this is what I meant to fix in the subsequent comment.
Any experiences/suggestions how to modify cpd to reflect the intended split into cpd/cpc (or cpj - standards are fine but somehow I dislike cpc)? There are at least two possibilities to do this:
As I write this, I think the latter is better.
I agree, I like the latter better:
By the way, using file lists and removing the -R/-m options will simplify the code significantly.
Here is what I'm working on. Please suggest how to change some naming or simplify the synopsis listing. It looks a little fat.
This is cpd - The copy daemon (v0.1pre6, Nov 2017)

Usage:
  cpd [<option>...] <command> [<argument>...]
  cpd [<option>...] newjob <source> <destination>
  cpd [<option>...] newjob <source>... <dest-dir>
  cpd [<option>...] newjob -t <dest-dir> <source>...
  cpd [<option>...] newjob <dest-dir>              # Read list of sources from stdin
  cpd [<option>...] newjob -f <file> <dest-dir>

Main Options:
  -s              Run only a simulation, copy nothing
  -v              Be verbose
  -q              Be almost quiet

New Job Options:
  -f <file>       Read list of source files from <file>
  -m              Merge all files in <dest-dir>
  -o <cp-opt>     Add <cp-opt> to the called cp command
  -p <user-prio>  Enqueue job with modified default priority. PRIO=6-<user-prio>
  -r              Copy recursively

Commands:
  c, cancel <job-id>       Cancel a pending or kill a running job
  status                   Print status of the daemon and job processing
  start                    Start the daemon and job processing
  pause                    Pause the daemon and job processing
  h, help [c]              Show less help or when c=l License, c=s Source of cpd
  H                        Show this help
  l, list [c]              List jobs, c=i by ID, or errors c=e, or the job log c=l
  run                      Process jobs or trigger daemon to continue
  p, prio <job-id> <prio>  Change job priority 3-7
  r, resume <job-id>       Resume a job
  s, stop <job-id>         Stop a job
  tidy                     Tidy up all job data
  newjob <arguments>       Enqueue a new job with arguments as shown above

Notes:
  • Calling cpd with any other name than cpd makes it run as if newjob was given
  • The <command> is lazily recognized by any part of its long name
  • The job priority is not static. Lower numbers have a higher priority
    Jobs are started in order of priority and enqueue time. 0-2 (active) and 8+9 (old) are used internally
    New jobs have prio 6 unless -p was given
  • pause stops job processing, if so, or stops the daemon at all, if not
    Adding a new job triggers the daemon to continue job processing, there is no blocking
  • tidy does nothing else than 'rm -rf /tmp/cpd/user-lot'
  • -R,-m are ignored when -o is present
  • Enclose <cp-opt> in quotes if you need more than one option

Examples:
  Enqueue new copy task, assuming you have a symlink cpc->cpd
    cpc -p1 /media/1a/foo *     # New job with higher PRIO=5
    cpc -p-1 /media/1b/foo *    # New job with lower PRIO=7
    cpc /media/1c/foo < /path/to/list-of-files
    find * -type f -print | cpc -p2 /media/1d/foo
  Take a look at how it is going
    watch -n1 cpd list
    watch -n1 cpd status
Edit: Fix double use of stop
Edit2: Simplify synopsis block, but still pudgy
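The priority rule from the help text (PRIO=6-<user-prio>, with 3-7 as the range open to the user) can be sketched as follows. The function name `job_prio` is hypothetical, and rejecting out-of-range values (rather than clamping them) is an assumption, not something the help text states.

```sh
# Hypothetical helper: map the value given to -p onto the internal job priority.
job_prio() {
    user_prio=${1:-0}               # no -p given: default priority 6
    prio=$((6 - $user_prio))
    # 0-2 and 8-9 are used internally, so only 3-7 is accepted here (assumption).
    if [ "$prio" -lt 3 ] || [ "$prio" -gt 7 ]; then
        echo "priority out of range: $user_prio" >&2
        return 1
    fi
    echo "$prio"
}
```

This matches the two examples in the help text: `-p1` yields PRIO=5 and `-p-1` yields PRIO=7.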
I like where this is going! Much more structured, consistent, simple... Good job!
First suggestions:
Replace H by h a (all).
But I'm not sure there should be such an option in the first place: why not printing it all directly, it's not so long anyway (and we can work on making it shorter).
Remove start/pause/run: instead, add the possibility to select multiple jobs, so that cancel, resume and stop can manipulate multiple jobs at once. Syntax could be
What's the exact purpose of 'tidy'? How is it different from cancel?
I thought there was no more "priority" on the user side? I liked your idea of choosing where to insert jobs in the job queue.
Do you really want to put a recursive -r job option in? This is an open door to tons of issues: symlinks, cross-device folders, special devices, access rights...
Random thought: what about having an option (e.g. as an environment variable) of a "finder" to run to find files? For instance, the default command could be find <root> -iname '*<pattern>*' and the user would be free to tweak options like symlinks, depth, etc.
Likewise, the -m option is a tough one. I don't have any suggestion for the moment.
If you print the status in the list command, then you can remove the status command.
Edit: Finished job selection syntax
Good job!
After so much grumbling I really appreciate this, thanks!
1) ..why not printing it all directly
I like it. I use this with my other projects too. This way the experienced user can quickly look something up and the new one gets more information.
2) Remove start/pause/run
What? run is old -P, start old -D+, pause old -D-
2) add the possibility to select multiple jobs
I guess like cpc stop 1 3 5 (all 3 jobs stopped). Yes, I mentioned this somewhere. No problem.
3) ...tidy
Eh? Did you overlook the note? cancel does nothing but set the status from pending to canceled, or kill a running job.
4) I thought there was no more "priority" on the user side?
Yes. This is just one of my suggestions from somewhere.
5) -r... is an open door to tons of issues...
Ehm, well, simple ignoring? :-) I think of it like a "quick and easy" mode.
5) ...environment variable of a "finder" ...
Puh(?) Perhaps. Perhaps not. I don't know
7) print the status in the list command
Yeah, I thought too in that direction for a while. It was initially taken from your Arch post
What? run is old -P, start old -D+, pause old -D-
Yes, but is this useful at all? If we have multiple selection, cpd resume - is the same as cpd run and cpd start, isn't it?
Same comment for tidy: isn't it the same as cpd cancel -?
Ehm, well, simple ignoring? :-) I think of it like a "quick and easy" mode.
I don't think that's the kind of issues you can ignore.
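The multi-job selection discussed above (`cpc stop 1 3 5`, with `-` standing for all jobs as in `cpd resume -`) could be sketched like this. The function name `expand_selection` is hypothetical, and passing the list of known job IDs as the first argument is an assumption made to keep the example self-contained.

```sh
# Hypothetical helper: expand_selection "ALL_IDS" SELECTION...
# Prints one selected job ID per line; "-" selects every known job.
expand_selection() {
    all_ids=$1
    shift
    for sel in "$@"; do
        case $sel in
            -)  printf '%s\n' $all_ids ;;   # unquoted on purpose: split the list
            ''|*[!0-9]*)
                echo "invalid job id: $sel" >&2
                return 1 ;;
            *)  printf '%s\n' "$sel" ;;
        esac
    done
}
```

A command such as stop could then loop over the printed IDs and handle each job in turn.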
is the same as cpd run and cpd start, isn't it?
No. run processes the jobs in the foreground; start starts the daemon, which does almost the same, sure.
resume only resumes a previously stopped job
tidy cleans up the tmpDir which is not cleared automatically
cancel ... I can't explain it any better. Please read the previous post again
I don't think that's the kind of issues you can ignore.
When the new job is built, some checks are done, like whether the arg is a directory. So, by ignoring I mean that when the arg is not a file, it is not added to the list of files to copy. How to handle symlinks I am not sure, right now they are followed (I think that is the term) and copied. Crossing file systems is the same. And I think that's how it should work.
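A minimal sketch of the check described here, assuming it amounts to a regular-file test on each argument. The function name `filter_sources` is hypothetical; `[ -f ... ]` follows symlinks, which fits the "they are followed" remark but is not confirmed to be what cpd actually does.

```sh
# Hypothetical helper: print only the arguments that are regular files.
# Directories, missing paths and special files are skipped with a warning.
filter_sources() {
    for src in "$@"; do
        if [ -f "$src" ]; then      # true for symlinks that point to a file
            printf '%s\n' "$src"
        else
            echo "skipping non-file: $src" >&2
        fi
    done
}
```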
My point is that all this "almost the same" can be simplified greatly, can't it?
"start", "run", "resume" can all be merged into one "start": if the daemon is not started, start it. If a job is provided, start it. If the job is paused, resume it. If several/all jobs are provided, start them. There is no ambiguity so you can effectively use only one command for all this.
Same for stop/pause/cancel.
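The merged "start" could be sketched as a small decision function. Everything here is hypothetical: the name `plan_start`, the `ID:STATE` argument format, and the state names (`pending` appears earlier in the thread, `stopped` is assumed). It only prints the actions it would take.

```sh
# Hypothetical helper: plan_start DAEMON_STATE [JOB_ID:JOB_STATE]...
plan_start() {
    # If the daemon is not started, start it.
    [ "$1" = running ] || echo "start-daemon"
    shift
    for job in "$@"; do
        id=${job%%:*}
        state=${job#*:}
        case $state in
            stopped) echo "resume $id" ;;   # paused job: resume it
            pending) echo "run $id" ;;      # queued job: start it
            *)       echo "skip $id ($state)" ;;
        esac
    done
}
```

Since each job state maps to exactly one action, a single command is enough; the same shape would work for a merged stop/pause/cancel.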
tidy cleans up the tmpDir which is not cleared automatically
Why isn't it automatic?
How to handle symlinks I am not sure, right now they are followed
Do you make any cycle detection? If not, there is your first (out of many) problems: the daemon will hang forever.
The need for all these commands is clear to me, but I am not happy with the new names.
Why isn't it automatic?
Then all logs are gone.
Do you make any cycle detection?
What? No. Don't think so. There is running a find
the daemon will hang forever.
No, is done by add new job :-) However, should to be avoided.
Regarding your merges: each is a completely different task. Sure, you can auto start/stop the daemon and so on, but the functionality is needed anyway.
So you suggest to hide it? What is the benefit? A shorter help text. But less user access.
What? No. Don't think so. There is running a find
Do you mean you run the find command to traverse directories? Then cycle detection is done for you and everything is fine.
No, is done by add new job :-) However, should to be avoided.
I did not understand this.
So you suggest to hide it? What is the benefit? A shorter help text. But less user access.
Ask yourself the opposite question: what's the opposite of starting the daemon without doing anything? It's a user daemon.
A command to terminate the daemon can be left to pkill cpd.
If resume can resume multiple jobs, possibly all jobs, then there is no need for run, right?
User control remains unchanged.
> No, is done by add new job I did not understand this.
The daemon reads the data written by add new job, so it is add new job that would hang.
A command to terminate the daemon can be left to pkill cpd.
With a similar argument you could request removing at least tidy, stop, resume and so on. I don't think that would be useful.
To resume a job, it must have been stopped before. run does a completely different task. stop sends a signal to a running process; resume does too, but a different signal. I'm sure you know what these signals are without searching; I need to look them up.
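For reference, the usual signal pair for suspending and continuing a process is SIGSTOP and SIGCONT. Whether cpd uses exactly these is an assumption; the thread does not say. A minimal sketch with hypothetical function names:

```sh
# Hypothetical helpers; the argument is the PID of the running copy process.
stop_job() {
    kill -s STOP "$1"    # suspend the process (SIGSTOP cannot be caught or ignored)
}

resume_job() {
    kill -s CONT "$1"    # let a previously stopped process continue
}
```

SIGTSTP would be the catchable alternative to SIGSTOP if the copy program should get a chance to react before being suspended.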
OK, I guess you only request to get rid of the need to start/stop the daemon. As said, that could easily be done by some setting. For my sake, as an opt-out.
How about a config file to get rid of options needed on every use? I'm talking about an option like -a to auto-start the daemon when adding a new job, or -q to not be annoyed by too much verbosity. But if you now say yes, I'm not sure I'll do it for a 1.0 release.
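A minimal sketch of what such a config file could look like on the reading side, assuming a plain `key=value` format. The function name `load_config` and the keys `auto_start` and `quiet` are hypothetical stand-ins for the -a and -q options mentioned above.

```sh
# Hypothetical helper: load_config FILE
# Sets the shell variables auto_start and quiet; unknown keys are ignored.
load_config() {
    auto_start=no       # defaults when the file or a key is missing
    quiet=no
    [ -r "$1" ] || return 0
    while IFS='=' read -r key value; do
        case $key in
            auto_start) auto_start=$value ;;
            quiet)      quiet=$value ;;
        esac
    done < "$1"
}
```

Options given on the command line would then simply override whatever the file set.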
Damn, wrong button. Where is the "re-open"? Ahh, there!
Because I like my own -R/-m options and -t can be simulated, I put this synopsis up for discussion. Note the two different main calls: cpd and the new cpj, j for job.