shenwei356 / rush

A cross-platform command-line tool for executing jobs in parallel
https://github.com/shenwei356/rush
MIT License

TODO #1

Closed shenwei356 closed 5 years ago

shenwei356 commented 7 years ago
mattn commented 7 years ago

Please add automatic detection of whether a shell is needed or not (like this: https://github.com/mmstick/parallel/blob/0dd48100e9a29d9a023826c778a5c7e70f9bf464/src/execute/exec_inputs.rs#L40-L45).

shenwei356 commented 7 years ago

please add automatic detection for using shell or not-use.

OK. I'll use mattn/go-shellwords

mattn commented 7 years ago

Sorry, go-shellwords doesn't detect multiple commands like foo; bar. BTW, my guess is that whether a shell is spawned explains why Go is faster than Rust in this result.

https://www.reddit.com/r/rust/comments/5penft/parallelizing_enjarify_in_go_and_rust/dcr4y7f/
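A minimal sketch of the kind of detection requested above, written in Go. The function name needsShell and the set of metacharacters are assumptions made for illustration; this is neither the rule used by mmstick/parallel nor anything implemented in rush. The check is deliberately conservative: a quoted `;` is also flagged, which only costs a shell spawn that was not strictly necessary.

```go
package main

import (
	"fmt"
	"strings"
)

// shellMeta lists characters that usually need a shell to interpret them
// (command separators, pipes, redirection, expansion, globbing).
// The exact set is an assumption for this sketch.
const shellMeta = ";&|<>()$`*?[]~#\n"

// needsShell reports whether the command line contains any shell
// metacharacter. It does not parse quoting, so it may return true for
// commands that could have been run without a shell.
func needsShell(command string) bool {
	return strings.ContainsAny(command, shellMeta)
}

func main() {
	for _, c := range []string{"echo hello", "foo; bar", "cat a.txt | wc -l"} {
		fmt.Printf("%q needs shell: %v\n", c, needsShell(c))
	}
}
```

With a check like this, foo; bar would be routed to the shell, while a plain command such as echo hello could be split into words and executed directly.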

shenwei356 commented 7 years ago

I think running all commands through a shell ($SHELL -c for *nix and %COMSPEC% /c for Windows), for both single commands and multiple commands like foo; bar, is fine.
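The approach described here could look roughly like the following Go sketch, which uses only the standard os/exec package. The helper names shellArgs and shellCommand, and the fallbacks to sh and cmd when the environment variables are unset, are assumptions for illustration and are not taken from rush's source.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"runtime"
)

// shellArgs returns the interpreter and the flag that passes it a command
// string. goos is a parameter so the choice can be checked on any platform.
func shellArgs(goos string) (shell, flag string) {
	if goos == "windows" {
		shell = os.Getenv("COMSPEC")
		if shell == "" {
			shell = "cmd" // assumed fallback
		}
		return shell, "/c"
	}
	shell = os.Getenv("SHELL")
	if shell == "" {
		shell = "sh" // assumed fallback
	}
	return shell, "-c"
}

// shellCommand wraps a command line so that the shell interprets it,
// which makes lists like "foo; bar" work without any parsing in Go.
func shellCommand(line string) *exec.Cmd {
	shell, flag := shellArgs(runtime.GOOS)
	return exec.Command(shell, flag, line)
}

func main() {
	cmd := shellCommand("echo foo; echo bar")
	fmt.Println(cmd.Args) // the argument vector that would be executed
}
```

Calling cmd.Run() or cmd.Output() on the result would then start one shell process per job, which is the overhead discussed in the following comments.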

mattn commented 7 years ago

What I meant is why Rust is always faster. :) If rush can avoid spawning a shell, it will be faster, I guess.
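For comparison, a sketch of running a command without spawning a shell in Go. The function name directCommand is hypothetical, and strings.Fields is a naive stand-in for a real word splitter such as go-shellwords: it does not understand quoting, so this only suits simple commands.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

// directCommand builds a command that is executed directly, with no shell
// in between. strings.Fields splits on whitespace only (no quoting
// support), which is an assumption of this sketch.
func directCommand(line string) (*exec.Cmd, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return nil, errors.New("empty command")
	}
	return exec.Command(fields[0], fields[1:]...), nil
}

func main() {
	cmd, err := directCommand("echo hello world")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cmd.Args) // program name followed by its arguments
}
```

Here each job costs one process instead of two, at the price of losing shell features such as pipes and command lists.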

shenwei356 commented 7 years ago

I get it. Thank you.

mmstick commented 7 years ago

@mattn Running commands within a shell has very little overhead for my Rust implementation when you follow the recommendation to install dash. Here's a comparison of times with and without the shell:

Without Shell

seq 1 10000 | time -v target/x86_64-unknown-linux-musl/release/parallel 'echo {}' > /dev/null

User time (seconds): 0.40
System time (seconds): 2.68
Percent of CPU this job got: 93%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.29

5489.640372 task-clock:u (msec)

With Shell

These are times when the shell is enabled (with dash-static-musl installed)

seq 1 10000 | time -v target/x86_64-unknown-linux-musl/release/parallel 'echo {}; echo {}' > /dev/null

User time (seconds): 0.35
System time (seconds): 2.56
Percent of CPU this job got: 128%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.27

4593.366103 task-clock:u (msec)

Believe it or not, the shell path with dash is actually much faster than the no-shell path. That is something I will be investigating, to see where the bottleneck is in the no-shell code path.

shenwei356 commented 7 years ago

@mmstick The Rust implementation is indeed faster for this test. rush runs every process via $SHELL -c, so I did not compare the case without a shell.

What confused me was why rush_linux_amd64 performed so badly on your two computers. On my laptop, for the test seq 1 10000 | time -v $CMD 'echo {}' > /dev/null, rust-parallel is ~4X faster than rush, but it was >100X faster on your computers.

Here's a fresh result:

$ for cmd in parallel rust-parallel rush; do echo $cmd; seq 1 10000 | time -v $cmd 'echo {}' > /dev/null; done
parallel
        Command being timed: "parallel echo {}"
        User time (seconds): 28.73
        System time (seconds): 30.66
        Percent of CPU this job got: 185%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:32.04

rust-parallel
        Command being timed: "rust-parallel echo {}"
        User time (seconds): 3.13
        System time (seconds): 4.82
        Percent of CPU this job got: 312%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.54

rush
        Command being timed: "rush echo {}"
        User time (seconds): 12.81
        System time (seconds): 24.45
        Percent of CPU this job got: 274%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.57

Besides, speed is not the No. 1 goal for rush right now, especially for long-running processes. I use it every day in my bioinformatics analyses and keep trying to improve its usability and stability.

mmstick commented 7 years ago

Do you have any AMD hardware? Both of my systems are powered by AMD, so that could be one reason. It could also be the Intel CPU governor not retaining its max frequency long enough.

Basically, before I perform my benchmarks, I ensure that all other software is closed, that the CPU governor is set to performance via sudo cpupower frequency-set -g performance, and that transparent_hugepages is set to madvise via sudo sh -c "echo madvise > /sys/kernel/mm/transparent_hugepage/enabled". The Linux distribution I am running is Arch Linux, and I have dash-static-musl installed because of its high performance.

mfasold commented 6 years ago

Would it be possible to process a set of commands that is specified in a file, for example like the "::::" argument in GNU parallel?

shenwei356 commented 6 years ago

@mfasold -i file.txt