nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0

Add a (default?) option to run-command to see the output #135

Open sylvinus opened 8 years ago

sylvinus commented 8 years ago

I think run-command could be even more useful if the user were able to see the output of the command on each instance.

This could be added as a flag to the existing command, and I think it could also be argued that it should be the default, but I'll obviously leave that to the maintainer ;)

nchammas commented 8 years ago

I thought about this when I first implemented the run-command feature, but decided against adding it because I didn't want the feature to grow into a half-baked pssh.

The intended use of run-command is to install stuff on the cluster, so generally you would only care to see the output if something went wrong (which you currently do).

Of course, my intended use case is not necessarily everybody else's. 😄 Perhaps if there is enough demand for more flexibility we can expand the functionality of run-command, but I'd like to avoid it if possible.

What's your use case for wanting to see output?

sylvinus commented 8 years ago

Sure!

I've seen that the output is indeed displayed in case of error, which is very handy.

However, not every case where something goes slightly wrong returns an error code. I just had a bug where some optional dependencies were not present at compile time, so my library was missing a feature. That would have been obvious from the make log, but it took me a while to debug because I couldn't see the output.

I think this feature is probably not that important when you have your whole workflow worked out, but when you're setting things up for the first time or adding a new dependency, it would be really useful. Usually, those cases are also when you just have 1 slave to test on, so the output wouldn't be that hard to read.

nchammas commented 8 years ago

> I think this feature is probably not that important when you have your whole workflow worked out

Yep, I think this -- for me at least -- makes the feature request a low priority, and probably not something I'd want on by default. But I see the value in having it available.

Some thought would have to go into how to display the output (we can follow the current pattern of displaying it from any one node), and I'm starting to think some of Flintrock's internals need to be refactored to use some library that can do more of this work for us out of the box, like parallel-ssh.
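To make the display question concrete, here is a minimal sketch of one option: run the command on every node concurrently and prefix each output line with the host it came from. This is not Flintrock's implementation and it does not use parallel-ssh. It assumes the system `ssh` client is on the PATH with key-based login already set up, and the names `ssh_runner`, `run_on_hosts`, and `format_output` are hypothetical, chosen only for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_runner(host, command):
    """Run `command` on `host` via the system ssh client (assumed to be installed).

    Returns (exit code, combined stdout and stderr).
    """
    proc = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout


def run_on_hosts(hosts, command, runner=ssh_runner):
    """Run `command` on every host concurrently; return {host: (exit code, output)}."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        # pool.map yields results in the same order as `hosts`.
        results = pool.map(lambda host: runner(host, command), hosts)
        return dict(zip(hosts, results))


def format_output(results):
    """Prefix every output line with its host so multi-node output stays readable."""
    lines = []
    for host, (code, output) in results.items():
        for line in output.splitlines():
            lines.append(f"[{host}] {line}")
        lines.append(f"[{host}] exit code: {code}")
    return "\n".join(lines)
```

With real hosts, something like `print(format_output(run_on_hosts(["user@node1", "user@node2"], "df -h")))` would print each node's output under its own prefix (the host names here are placeholders). The `runner` parameter is there so the SSH call can be swapped out, for example for a library-backed implementation or a fake in tests.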

concretevitamin commented 6 years ago

I'd say this is a must-have, not "not that important". I was just debugging an HDFS issue and wanted to see the free space on all of the nodes. To my surprise, I can't use run-command 'df -h' to see the output. For now I can resort to parallel-ssh.