scalingexcellence / scrapybook

Scrapy Book Code
http://scrapybook.com/

vagrant up --no-parallel fails at "Bringing machine dev up" on Ubuntu 14.04.4 LTS #8

Closed by mrem01 8 years ago

mrem01 commented 8 years ago

Hi - I've followed Learning Scrapy's instructions in the appendix for Ubuntu 14.04.4 LTS, without success.

    Bringing machine 'web' up with 'virtualbox' provider...
    Bringing machine 'spark' up with 'virtualbox' provider...
    Bringing machine 'es' up with 'virtualbox' provider...
    Bringing machine 'redis' up with 'virtualbox' provider...
    Bringing machine 'mysql' up with 'virtualbox' provider...
    Bringing machine 'scrapyd1' up with 'virtualbox' provider...
    Bringing machine 'scrapyd2' up with 'virtualbox' provider...
    Bringing machine 'scrapyd3' up with 'virtualbox' provider...
    Bringing machine 'dev' up with 'virtualbox' provider...
    There are errors in the configuration of this machine. Please fix
    the following errors and try again:

    vm:
    * A box must be specified.

My question is: how can I specify the scrapybook box?

Thanks!

Paul.

lookfwd commented 8 years ago

Hello Paul,

Thanks a lot for the question. It gives me the opportunity to clarify a few things about the Ubuntu 14.04.4 LTS process. On Ubuntu 14.04.4 LTS, the most efficient way to set up the system is with docker, without the VirtualBox overhead. It is more efficient because all those "virtual machines" run as "native" processes on the host instead of incurring the larger virtualisation overhead.

Follow the Appendix A process to set those up for Ubuntu. Don't forget to log out and log back in at the end of this process. This is important because otherwise you won't be able to run docker without sudo. Once you log back in, you should be able to check that docker runs fine:

    docker run hello-world

One more time (because it got me twice): the above command must run cleanly without sudo. My book has installation instructions to get to this point which are valid right now, but they might become outdated at some point in the future. So always cross-check them against the latest docker installation instructions.
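The usual reason the re-login matters is group membership: a running session keeps the group list it started with. Here is a minimal sketch for telling the cases apart, assuming the install steps add your user to a group named docker (as Docker's usual post-install steps do). The function names are made up for illustration.

```shell
# has_group "<space-separated group list>" <name>: succeeds if <name> is in the list.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# docker_group_status <session groups> <account groups>
# Prints one of:
#   ok       the current session already has the docker group
#   relogin  the account has the group, but this session predates the change
#   missing  the account is not in the docker group at all
docker_group_status() {
    if has_group "$1" docker; then
        echo "ok"
    elif has_group "$2" docker; then
        echo "relogin"
    else
        echo "missing"
    fi
}
```

To run it against the current user: docker_group_status "$(id -nG)" "$(id -nG "$(id -un)")". With GNU id, the first call reports the groups of the running session and the second reads the account's groups from the system database, so "relogin" means logging out and back in should be enough.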

Now, one thing that I must admit is not very explicit in the book: if you install Vagrant with sudo apt-get install vagrant, you get a rather old version (1.4.3 at the moment). Vagrant moves fast, and significant bug fixes have been added in between. So please install or upgrade Vagrant to the latest available version (1.8.1 at the moment) with something along the lines of:

    wget https://releases.hashicorp.com/vagrant/1.8.1/vagrant_1.8.1_x86_64.deb
    sudo dpkg -i vagrant_1.8.1_x86_64.deb

You can find the latest process and URLs here. If you don't have docker installed or if Vagrant is old, you might get errors like this:

    The executable 'docker' Vagrant is trying to run was not
    found in the PATH variable. This is an error. Please verify
    this software is installed and on the path.
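Both conditions can be caught before running vagrant up. The following is only a sketch: the function names and the MIN_VAGRANT variable are made up for illustration, 1.8.1 is simply the version mentioned above rather than a documented minimum, and the comparison relies on GNU sort -V, which Ubuntu ships.

```shell
# Version to compare against; 1.8.1 is the release named in this thread (an assumption,
# not an official minimum).
MIN_VAGRANT="1.8.1"

# version_ge A B: succeeds when version A >= version B (needs GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# parse_vagrant_version "Vagrant 1.8.1": prints the second word, i.e. the version.
parse_vagrant_version() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# preflight: prints any problem to stderr and returns non-zero if one was found.
preflight() {
    status=0
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found in PATH" >&2
        status=1
    fi
    if ! command -v vagrant >/dev/null 2>&1; then
        echo "vagrant not found in PATH" >&2
        status=1
    else
        installed=$(parse_vagrant_version "$(vagrant --version)")
        if ! version_ge "$installed" "$MIN_VAGRANT"; then
            echo "vagrant $installed is older than $MIN_VAGRANT" >&2
            status=1
        fi
    fi
    return "$status"
}
```

Calling preflight && vagrant up --no-parallel would then only start the machines when both tools look usable.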

If you have docker and Vagrant installed, at this point, you should be able to do the usual process:

    git clone https://github.com/scalingexcellence/scrapybook.git
    cd scrapybook
    vagrant up --no-parallel

The system should be up and running after some time (a bit longer the first time, because it downloads docker images). The process above is 100% Vagrant and docker based; it works nicely, is very efficient, and is highly recommended for Ubuntu 14.04.4 LTS.

This should be enough to run the book's system, and this is where the answer really ends. The rest of the material is just for reference.


Some Reference Material

Of course, one can use VirtualBox under Ubuntu. One thing to be aware of is that a Linux system that already runs inside a virtual machine (e.g. the ones you get from Amazon AWS EC2) might not have virtualization extensions enabled. As per #5, I'm not willing to support such systems to any great extent, but I provide some pointers and workarounds there. So if you are on AWS/EC2, prefer docker.

I will assume from now on that you run on a machine that has virtualization extensions enabled and that you want to run the usual VirtualBox flow. First of all, you will have to install VirtualBox as described here and then Vagrant as described previously. As mentioned in issue 5, there's no longer any need to explicitly download and install scrapybook.box. This is great and simplifies the process. (If you did so anyway, it wouldn't really hurt, but keep in mind that you would have to change config.vm.box = "lookfwd/scrapybook" to config.vm.box = "scrapybook" in Vagrantfile.dockerhost.)

So let's assume that you take the easy path and you've just downloaded and installed VirtualBox and Vagrant on Ubuntu 14.04.4 LTS. All you have to do then is set an environment variable:

    export SCRAPYBOOK_FORCE_HOST_VM=TRUE

and then the typical:

    git clone https://github.com/scalingexcellence/scrapybook.git
    cd scrapybook
    vagrant up --no-parallel
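If you'd rather not leave the variable exported in your shell session, one option is to set it for a single command only. A minimal sketch follows; with_host_vm is a made-up helper name, not something shipped with the repository.

```shell
# with_host_vm CMD [ARGS...]: runs CMD with SCRAPYBOOK_FORCE_HOST_VM=TRUE set
# only in that command's environment (hypothetical helper, not part of scrapybook).
with_host_vm() {
    env SCRAPYBOOK_FORCE_HOST_VM=TRUE "$@"
}
```

It would be used as with_host_vm vagrant up --no-parallel. Later vagrant commands in the same directory presumably need the variable as well, so the plain export is probably simpler if you keep working with the machines for a while.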

Unfortunately, --provider=virtualbox won't work, because it tries to treat the servers' definitions as VirtualBox images instead of docker images. config.vm.box is required for VirtualBox images but optional and meaningless for docker images, hence the very confusing error message:

    There are errors in the configuration of this machine. Please fix
    the following errors and try again:

    vm:
    * A box must be specified.

All those extra comments and processes are just for reference. The only thing that you really need for Ubuntu 14.04.4 LTS is the latest docker + Vagrant, as described at the beginning of this answer. This is the most efficient and easiest way on Ubuntu.

yssoe commented 8 years ago

Hello, maybe something stupid but I think you forgot to cd into the scrapybook directory before running the vagrant up command.

Cheers

mrem01 commented 8 years ago

Thank you for the thorough answer.

Getting the latest vagrant did the trick!

    wget https://releases.hashicorp.com/vagrant/1.8.1/vagrant_1.8.1_x86_64.deb
    sudo dpkg -i vagrant_1.8.1_x86_64.deb

lookfwd commented 8 years ago

Awesome!