sequenceiq / docker-spark

Apache License 2.0

Can we get the latest version of Spark (1.2.1) on Ubuntu? #14

Closed nealmcb closed 9 years ago

nealmcb commented 9 years ago

And maybe do it regularly (cf. #12)? Thanks again for the great images!

matyix commented 9 years ago

Sorry for being late with this - I did it for CentOS a few weeks ago. From now on I will update both at the same time when a new Spark release comes out.

nealmcb commented 9 years ago

Thanks for the quick action! But I can't get it to work. E.g. I get this:

docker run -i -t sequenceiq/spark:1.2.1-ubuntu bash
exec: "bash": executable file not found in $PATH

But this works fine:

docker run -i -t sequenceiq/spark:1.2.1 bash

matyix commented 9 years ago

It was working for me. I just dropped my image, pulled it again from Docker.io, and it does work. What is your environment? Did the 1.2.0 version work (and does it still work)? There are no changes other than the updated Spark version.

nealmcb commented 9 years ago

I'm very puzzled. 1.2.0-ubuntu works fine for me. If I do this to try to clean out the old image and pull a new one:

docker rmi -f sequenceiq/spark:1.2.1-ubuntu

it says it untagged the image, but subsequent pulls still say everything is up to date. I guess that means I have to clean out more images, but that seems excessive. Is there a way to find a minimal set of images to rmi so I can re-pull as little as possible?
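One possible way to see why rmi only untags is to check whether another local tag still resolves to the same image ID. The following is a minimal sketch, not a confirmed fix for this issue: the helper names image_id and same_image are hypothetical, and it assumes docker inspect on this Docker version exposes the ID as .Id.

```shell
# Hypothetical helpers (names are illustrative, not part of docker-spark).

# Print the full ID that a tag currently resolves to on this host.
# Assumes the inspect output exposes the ID as .Id.
image_id() {
    docker inspect --format '{{.Id}}' "$1"
}

# Succeed (exit 0) when two references resolve to the same local image.
# If they do, removing one tag leaves the shared layers in place.
same_image() {
    id_a=$(image_id "$1") || return 2
    id_b=$(image_id "$2") || return 2
    [ "$id_a" = "$id_b" ]
}

# Example use (requires a running Docker daemon):
#   same_image sequenceiq/spark:1.2.1-ubuntu sequenceiq/spark:1.2.1 \
#       && echo "both tags share one image"
```

Recording image_id before the rmi and again after the pull would show whether the re-pulled tag actually changed.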

I just discovered the extra branches of the github repository for each version and os.

When I build it myself it starts up fine. Though I note that the commit "ensure the bootstrap will be executed anyway" (63a67af in sequenceiq/docker-spark) doesn't seem to be in the ubuntu image in the v1.2.1onHadoop2.6.0-ubuntu branch.

Is there a way to get the exact Dockerfile or git commit used to build a given image?

nealmcb commented 9 years ago

Ahh - yes - my environment is Docker 1.4.1 on Ubuntu Trusty 14.04.2 64bit, on a btrfs file system.

And I just found the docker history command, which helps a bit in working out where an image came from.
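As a rough sketch of that approach, the wrapper below prints the untruncated build steps recorded for each layer of an image. The function name image_steps is hypothetical. Note that docker history shows the command that created each layer, not the git commit or the Dockerfile itself, so it only partly answers the question above.

```shell
# Hypothetical wrapper (the name is illustrative).
# Lists every layer of an image with the full, untruncated command
# that created it, newest layer first.
image_steps() {
    docker history --no-trunc "$1"
}

# Example use (requires a running Docker daemon):
#   image_steps sequenceiq/spark:1.2.1-ubuntu
```

Comparing this output for 1.2.0-ubuntu and 1.2.1-ubuntu may show where the two builds diverge.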

nealmcb commented 9 years ago

Part of the confusion has to do with changes in CMD and ENTRYPOINT and associated documentation.

I have found the practice of putting a dash in front of shell commands in much of the Hadoop ecosystem, e.g. hdfs dfs -cat, bizarre and confusing, compounded by the non-standard way that many Java commands use a single dash rather than the common getopt_long convention of using a double dash for long options. So I'm glad I don't need a dash in front of bash thanks to your January 14 commit "ensure the bootstrap will be executed anyway". But I didn't notice that at first, and still get confused....

The command I listed above just runs "bash", which did work in 1.2.0-ubuntu. But now that fails, as noted above, with 1.2.1-ubuntu. I'm guessing that you tried it with a different command line.
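Since the behaviour of a trailing "bash" depends on the image's ENTRYPOINT and CMD, one way to compare two tags is to print both settings. This is a minimal sketch under assumptions: the function names show_startup and run_shell are hypothetical, and run_shell assumes /bin/bash exists inside the image.

```shell
# Hypothetical helpers (names are illustrative).

# Print the ENTRYPOINT and CMD stored in an image's configuration.
show_startup() {
    docker inspect --format 'ENTRYPOINT={{.Config.Entrypoint}} CMD={{.Config.Cmd}}' "$1"
}

# Start an interactive shell while bypassing the image's ENTRYPOINT.
# Assumption: the image contains /bin/bash.
run_shell() {
    docker run -i -t --entrypoint /bin/bash "$1"
}

# Example use (requires a running Docker daemon):
#   show_startup sequenceiq/spark:1.2.0-ubuntu
#   show_startup sequenceiq/spark:1.2.1-ubuntu
```

If the two tags report different ENTRYPOINT values, that would be consistent with "bash" working as an argument in one image and failing in the other.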

This command, straight from the README.md, gives a different error here:

docker run -i -t -h sandbox sequenceiq/spark:1.2.1-ubuntu /etc/bootstrap.sh -bash
no such file or directoryFATA[0001] Error response from daemon: Cannot start container df4f00c8cfe02c11ba3617d97885f99b494c98266dfb3efc76bfc7b368426ea9: no such file or directory

And as noted above, when I rebuild the image from v1.2.1onHadoop2.6.0-ubuntu, that command line works fine.

Exactly what command line are you using to test it?