Closed mgedmin closed 5 years ago
Seems reasonable. I think the relevant line is:
d-i debian-installer/locale string en_US
Which should change to what?
d-i debian-installer/locale string en_US.UTF-8
Which is what I see on 18.04. Then the question becomes which Debian/Ubuntu versions should this change be made to? I'd hate to apply the update to every Ubuntu/Debian config, only to have a bunch of the boxes fail during the next run.
Ubuntu has used UTF-8 by default nearly since the very beginning: https://wiki.ubuntu.com/UTFEightByDefault mentions this being a release goal for Hoary Hedgehog aka Ubuntu 5.04, released in 2005, the second ever Ubuntu release.
I know UTF-8 support has been around for a while. What I'm asking is whether the change above results in setting the collation to en_US, but the character set to UTF-8, and thus fixes your issue.
And if this is the fix, will all of the different installers accept it, given that it appears on the surface to use a different format? Specifically, it goes from COLLATION to COLLATION.CHARSET with a period between them.
I can't make the change without testing first. Any chance you can run a test?
The workaround I applied in the meantime was rewriting /etc/default/locale in my Vagrant provisioning scripts to contain LANG="en_US.UTF-8". For Ubuntu's PostgreSQL specifically this needs to be done before apt installing the postgresql-server package (because that package's install script creates a cluster configuration that remembers the system locale used during creation, for I don't know what reason -- I suppose the on-disk data structures assume a particular collation order or something).
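A minimal sketch of that workaround, assuming a bash provisioning script that runs as root. The helper name write_locale_file is made up for illustration, and the target path is a parameter so the function can be tried on a scratch file first:

```shell
#!/usr/bin/env bash
# write_locale_file is a hypothetical helper, not part of Vagrant or robox.
# It overwrites the given file with a single LANG= line.
write_locale_file() {
    local target=$1
    local locale=${2:-en_US.UTF-8}
    printf 'LANG="%s"\n' "$locale" > "$target"
}

# Dry run against a scratch file so nothing under /etc is touched.
scratch=$(mktemp)
write_locale_file "$scratch"
cat "$scratch"
rm -f "$scratch"

# In a real provisioning script the call would be
#   write_locale_file /etc/default/locale
# and it would have to come before the PostgreSQL package is installed.
```

The ordering is the important part: the locale file has to be in place before any package whose install script records the system locale.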
I'm not entirely sure I understand what you mean by COLLATION -- the strings passed to debian-installer/locale are glibc locale names. The format for them (documented in the setlocale manual page) is language[_TERRITORY][.charset][@modifier] with the territory, charset, and modifier parts being optional. You can run locale -a to see a list of locale names supported by the system. (The charset part is passed through a normalization step, so that both UTF-8 and utf8 mean the same thing. Modifiers were used by things like adding the Euro symbol into an otherwise Latin-1 locale in the bad old days when 8-bit charsets reigned supreme.)
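To make the language[_TERRITORY][.charset][@modifier] format concrete, here is a rough sketch that splits a locale name into its parts with plain shell parameter expansion. parse_locale is a hypothetical helper written for this illustration; it does not call glibc and does not perform the charset normalization mentioned above.

```shell
#!/usr/bin/env bash
# parse_locale is a hypothetical helper, not part of glibc or robox.
# It splits language[_TERRITORY][.charset][@modifier] into its parts.
parse_locale() {
    local name=$1
    local language="" territory="" charset="" modifier=""

    # The modifier, if present, follows the '@'.
    case $name in
        *@*) modifier=${name##*@}; name=${name%@*} ;;
    esac
    # The charset, if present, follows the '.'.
    case $name in
        *.*) charset=${name#*.}; name=${name%%.*} ;;
    esac
    # The territory, if present, follows the '_'.
    case $name in
        *_*) territory=${name#*_}; name=${name%%_*} ;;
    esac
    language=$name

    printf 'language=%s territory=%s charset=%s modifier=%s\n' \
        "$language" "$territory" "$charset" "$modifier"
}

parse_locale "en_US"
parse_locale "en_US.UTF-8"
parse_locale "de_DE.ISO-8859-15@euro"
```

Seen this way, en_US and en_US.UTF-8 are the same kind of string; the second one just fills in the optional charset part.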
So, I'm reasonably certain that changing en_US to en_US.UTF-8 would work for all Ubuntu and Debian boxes, but I agree that it would be irresponsible to push new images without testing. I'd appreciate some help with that: I read the README, ran res/providers/packer.sh, and it failed for me:
# github.com/hashicorp/packer/common/net
../../hashicorp/packer/common/net/configure_port.go:61:2: undefined: net.ListenConfig
error: pathspec 'except_post_processor_tests' did not match any file(s) known to git
res/providers/packer.sh: line 50: scripts/build.sh: No such file or directory
The failure confuses me because ~/go/src/github.com/hashicorp/packer/scripts/build.sh most definitely exists. It fails, though:
$ cd ~/go/src/github.com/hashicorp/packer
$ export GOPATH=$HOME/go/
$ PATH=$GOPATH/bin:$PATH
$ XC_ARCH=amd64 XC_OS="windows darwin linux" scripts/build.sh
3 errors occurred:
--> linux/amd64 error: exit status 2
Stderr: # github.com/hashicorp/packer/common/net
common/net/configure_port.go:61:2: undefined: net.ListenConfig
--> darwin/amd64 error: exit status 2
Stderr: # github.com/hashicorp/packer/common/net
common/net/configure_port.go:61:2: undefined: net.ListenConfig
--> windows/amd64 error: exit status 2
Stderr: # github.com/hashicorp/packer/common/net
common/net/configure_port.go:61:2: undefined: net.ListenConfig
==> Copying binaries for this platform...
find: ‘./pkg/linux_amd64’: No such file or directory
==> Results:
total 0
Looks like an issue with the packer build script. If I were to guess, it's because you're missing a required golang dependency or have an incompatible version of go. You could open an issue with packer if you're interested.
That said, at the moment, you shouldn't need any patches to build the Robox config files, so the current released version 1.4.2 should work just fine. Just grab it here.
If you want to verify things before kicking it off, you can run ./robox.sh validate which will validate that the packer version accepts the current JSON. You can also run ./robox.sh links to make sure all the ISO URLs are still good. In this case you can ignore any non-Debian/non-Ubuntu problems.
Once you have packer set up, you can edit the auto install files in the http directory, and run ./robox.sh box generic-BOX-PROVIDER ... for a specific config, or to build all the boxes for the Debian and Ubuntu configs, you can run the command:
./robox.sh box generic-debian8-libvirt,generic-debian9-libvirt,generic-debian10-libvirt,generic-ubuntu1604-libvirt,generic-ubuntu1610-libvirt,generic-ubuntu1704-libvirt,generic-ubuntu1710-libvirt,generic-ubuntu1804-libvirt,generic-ubuntu1810-libvirt,generic-ubuntu1904-libvirt
Which will build the libvirt version of all the relevant boxes. For this type of change, one provider should be sufficient, so feel free to swap libvirt for virtualbox or vmware as appropriate. Note you cannot run libvirt and virtualbox at the same time.
The command above will build the boxes one at a time to avoid having any fail due to network, CPU, or disk load. Depending on your system (notebook vs. desktop vs. server, SSDs, RAM, etc.) you can probably break up the list of boxes and run jobs in parallel. My 4 year old workstation class Thinkpad (W540) can handle 2-4 builds at the same time. A desktop/server with SSDs and gigabit could probably handle significantly more.
P.S. I don't use Postgres much, but for my MariaDB/MySQL config scripts, I usually generate/install my own /etc/my.cnf file (path changes as appropriate), and you can set the default collation/character set via that file.
As for collation, it refers to how the system does character comparisons, i.e. upper case vs. lower case vs. straight binary comparisons. It becomes important when you switch to UTF-8. I believe the locale dictates how the OS handles this, but it's far more important to get right in your database.
@mgedmin I should also mention that with MariaDB/MySQL you can also set a default collation/charset for a database schema and/or table, which is what I do for magma ... like so.
FWIW
$ ./robox.sh links
...
Link Failure: https://mirror.leaseweb.net/ubuntu-cdimage/releases/18.10/release/ubuntu-18.10-server-amd64.iso
because Ubuntu 18.10 is End of Life, I suppose. (There are also alpine and a couple of FreeBSD failures.)
./robox.sh box fails for me because
Build 'generic-ubuntu1904-libvirt' errored: Failed creating Qemu driver: exec: "/usr/libexec/qemu-kvm": stat /usr/libexec/qemu-kvm: no such file or directory
My laptop has Ubuntu 19.04, which does not have /usr/libexec/qemu-kvm. There is a /usr/bin/kvm. libvirt itself works fine -- I've switched to generic/ubuntuNNNN boxes just so that I could use vagrant-libvirt instead of virtualbox.
Should I edit generic-libvirt.json and try again? (Trick question: I tried it already, and now I'm looking at the script downloading debian ISO files.)
Ha ha I forgot to actually change the locale strings in http/generic*.cfg before kicking off the box builds. Restarting.
What sort of tests would you like me to perform on the built .box files?
Ouch, the box names must be separated with commas, not spaces, so I cannot do things like
./robox.sh box generic-debian{8,9,10}-libvirt
:(
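One possible way around this without touching robox.sh is to let bash do the brace expansion first and then join the resulting words with commas. A small sketch, where join_by_comma is a hypothetical helper and not something robox.sh provides:

```shell
#!/usr/bin/env bash
# join_by_comma is a hypothetical helper: it joins its arguments with commas.
join_by_comma() {
    local IFS=,
    # "$*" joins the positional parameters using the first character of IFS.
    printf '%s\n' "$*"
}

# Brace expansion yields three separate words; the helper glues them together.
boxes=$(join_by_comma generic-debian{8,9,10}-libvirt)
echo "$boxes"

# The joined list could then be handed over as a single argument:
#   ./robox.sh box "$boxes"
```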
Why does the ./robox.sh script exit with a non-zero status code after successfully building a couple of boxes?
What are the differences between output/generic-*.box and the corresponding output/roboxes-*.box?
Lots of questions. Lots of answers, mostly.
I updated the 18.10 URL yesterday. Or at least I thought I did. I just realized my find/replace missed the URL in the generic-virtualbox.json file.
I'll need to keep an eye out, as the cosmic packages will disappear from the mirrors soon, and that will require a similar tweak to the installer file config. I'm actually thinking about adding a check for that to the ./robox.sh links function. It will be a pain though, because I won't be able to generate the list of URLs to check dynamically.
As of this second, all of the URLs are working. But if you keep helping out, know that the Gentoo URL breaks often (sometimes daily, as it gets rebuilt automatically), and the Arch URL changes monthly. If you need to find/update those URLs, ./robox.sh isos will supply the correct URL/SHA values. At some point I might figure out how to use jq to update the JSON, but for now it's a manual find/replace.
As for the qemu-kvm question, that is the path to qemu-kvm on CentOS. Naturally it's different on Ubuntu. Just update the path in the JSON file to match the location on your system.
As for tests, I don't have a good answer. The biggest test is whether or not it goes through the install and all the config modules without throwing any (unexpected) errors. Having an auto-install config hang is a pretty big deal, as packer will wait 1 to 4 hours (depending on platform/box) before it times out and moves on to the next box. So if the installer doesn't like the value for all of the Ubuntu boxes, that adds (at a minimum) 86 hours to the build. Having a config script fail later on is also bad, because it means I'll need to retry that box config, and if it fails a second time, build it locally and troubleshoot. A painful process.
Naturally, making sure the resulting box file will also work with vagrant up && vagrant ssh is a critical test. That's at a minimum. Beyond those things, I don't have any good answers, as I mostly use latin1, so I can't suggest any additional test cases you can do with latin1 vs utf8. Naturally, seeing if it fixes your Postgres issue is worth checking, but that is an isolated issue. I actually have a set of test scripts that I use to pull down the boxes and do the above, plus run a few commands, but the process is primitive, and I haven't been able to work on it since I started my trip in May. The internet is too slow to pull down all the box files and test. My Jenkins server is virtualized, which makes testing the images difficult. I'm hoping to have access to a Jenkins server with physical nodes soon, so I can test more regularly. My test cases right now are:
# (testcase vagrant upload Vagrantfile) &>> $1-$2-$3.txt; error $1 $2 $3
# # (testcase vagrant upload .vagrant) &>> $1-$2-$3.txt; error $1 $2 $3
# (testcase vagrant ssh -- exit 0) &>> $1-$2-$3.txt; error $1 $2 $3
# (testcase vagrant ssh -- echo "\$SHELL" | grep -q bash) &>> $1-$2-$3.txt; error $1 $2 $3
# # (testcase vagrant ssh --command "if [ ! -f Vagrantfile ] || [ ! -d .vagrant ]; then exit 1; fi") &>> $1-$2-$3.txt; error $1 $2 $3
# (testcase vagrant ssh -- "if [ ! -f Vagrantfile ]; then exit 1; fi") &>> $1-$2-$3.txt; error $1 $2 $3
# (testcase vagrant ssh -- "ping -c 4 lavabit.com") &>> $1-$2-$3.txt; error $1 $2 $3
# (testcase vagrant ssh -- "curl --silent --user-agent \"Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0\" --output /dev/null --url https://lavabit.com") &>> $1-$2-$3.txt; error $1 $2 $3
# (testcase vagrant ssh -- "sudo -- touch /test.dat && sudo touch /etc/test.dat && sudo -- bash -c 'echo TestOption no >> /etc/ssh/sshd_config'") &>> $1-$2-$3.txt; error $1 $2 $3
# (testcase vagrant ssh -- "(which grep && which curl && which cat && which date && which ping && which awk && which sed && which ssh && which man && which ps && which vim) > /dev/null; exit \$?") &>> $1-$2-$3.txt; error $1 $2 $3
The arguments are ORG BOX PROVIDER.
The robox.sh script started as a simple tool to configure the environment variables and run packer against the growing number of JSON files. It's grown into a massive bash script since then, and I have several things I'd like to add. Namely, integrating my standalone upload/release/verify/add scripts as functions in robox.sh, along with the box testing logic I currently have in a separate project. I'll also add logic that will auto-retry failed builds when running one of the meta functions (generic/magma/lineage or vmware/libvirt/parallels/hyperv/docker/virtualbox or some combination of the two) at some point. But I tend to only add features when doing things manually gets painful enough, and I manage to find the time.
Globbing support right now feels like more effort than it's worth. But with that said, the commas are only needed to easily parse the list into a bash array. If you're so inclined you could update the box() function to support both methods and submit a pull request.
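As a rough idea of what supporting both methods might look like, here is a sketch that accepts commas, spaces, or separate arguments and collects the names into one bash array. parse_box_list and the BOXES array are hypothetical names chosen for this example; this is not how box() in robox.sh is actually written.

```shell
#!/usr/bin/env bash
# parse_box_list is a hypothetical helper, not the real box() from robox.sh.
# It fills the global array BOXES from arguments that may be separated by
# commas, by spaces, or passed as separate words.
parse_box_list() {
    local arg
    local -a parts
    BOXES=()
    for arg in "$@"; do
        # Split each argument on commas and spaces into the parts array.
        IFS=', ' read -r -a parts <<< "$arg"
        BOXES+=("${parts[@]}")
    done
}

# A comma-separated list and brace-expanded words can be mixed freely.
parse_box_list generic-debian8-libvirt,generic-debian9-libvirt generic-ubuntu{1604,1804}-libvirt
printf '%s\n' "${BOXES[@]}"
```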
Personally, I like to run ./robox.sh generic-virtualbox (or ./robox.sh generic or ./robox.sh all) to build machines, so I only need the box target to rebuild failures. What I do is run ./robox.sh missing | grep "Box - " which gives me a list I can massage via a text editor.
I'm not sure why robox.sh isn't returning a proper status code. I think I fixed this issue once upon a time, but ended up switching it back because I think it broke something else. I don't recall precisely what, but I believe the issue is with how bash forks or doesn't fork. But you're right, it should return the right status code.
If you look at the box() function you'll see that it basically runs the box list against all the different JSON files. The issue could be that it's running the config against a JSON file with no matching box names, and that attempt happens last, so its error code is what bubbles up. Just a guess though.
As for generic vs robox, they are virtually identical. When I started building these images ~3 years ago, they were all called "generic" ... which is why the generic images have more downloads. When I finally put everything on GitHub (I had to sanitize my initially internal repo before I could make it public), I needed a unique moniker, because I couldn't call the repo packer anymore. I came up with Robot Boxes, or roboxes for short. Eventually I started releasing the images under both names. On other platforms, namely Docker Hub, the images are only in the roboxes namespace (I couldn't register generic).
Initially packer built each config twice, which got very painful. I finally spent a few hours and figured out how I could write out the same artifact under two different names, and that is what we have today.
Thank you for all the answers!
Here are the changes I've tested: https://github.com/mgedmin/robox/commits/mg. Three commits:
one to make kvm work on Ubuntu
one to make Ubuntu 18.10 not fail image downloads
one to change the locale to a UTF-8 one for all Debian and Ubuntu boxes
I tested box builds for all the boxes I touched, using the libvirt backend. All builds completed successfully. (Except the one where I connected a vnc viewer to the printed vnc:// URL out of curiosity and discovered that the build scripts also want a vnc connection to navigate the boot menus, and, well, when you connect a new vnc client, the old one gets kicked out. The build succeeded when I retried it.)
I haven't tested whether vagrant up works yet because I don't remember how to do that when you have a box file on disk rather than a name to fetch from Vagrant Cloud. (I will look it up.)
I think I see where the exit code comes from -- the last command in the box() function is
[[ condition ]] && do something
and when condition is false, well, that leaves a stale exit code. I would suggest adding a return 0 at the very end of the box() function but yeah, if you want the exit status to indicate whether there were any errors, that'd be a more involved process.
Can you submit a PR with just the locale change? The KVM change doesn't apply to CentOS, and the URL update is already done.
I haven't tested whether vagrant up works yet because I don't remember how to do that when you have a box file on disk rather than a name to fetch from Vagrant Cloud. (I will look it up.)
vagrant box add PATH
Then vagrant init NAME && vagrant up && vagrant ssh and it should use the local file and not the cloud version. You might need to import it with a higher version number. If it's older than the cloud it might ask whether you want to auto update.
and when condition is false, well, that leaves a stale exit code. I would suggest adding a |return 0| at the very end of the box() function but yeah, if you want the exit status to indicate whether there were any errors, that'd be a more involved process.
Done. I suppose always getting 0 is better than a useless error code.
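The stale exit status discussed here is easy to reproduce in isolation. In the sketch below both functions are hypothetical stand-ins for box(); the only thing they share with the real function is that the last command is a [[ condition ]] && command list.

```shell
#!/usr/bin/env bash
# box_stale and box_fixed are hypothetical stand-ins for box() in robox.sh.
box_stale() {
    local verbose=no
    echo "build finished"
    # When the test is false, the status of this list (1) becomes the
    # function's return status, even though nothing went wrong.
    [[ $verbose == yes ]] && echo "extra details"
}

box_fixed() {
    local verbose=no
    echo "build finished"
    [[ $verbose == yes ]] && echo "extra details"
    # An explicit return hides the status of the skipped command.
    return 0
}

status=0
box_stale || status=$?
echo "box_stale returned $status"

status=0
box_fixed || status=$?
echo "box_fixed returned $status"
```

As noted above, return 0 only hides the stale status. Reporting real build failures would mean recording an error flag inside the function and returning that instead.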
I've discovered that /etc/default/locale on generic/ubuntu1804 contains a non-UTF-8 locale setting. This causes problems, e.g. I cannot create PostgreSQL databases using UTF-8 because the system locale uses Latin-1.
(Some of the problems are masked when you use vagrant ssh because SSH copies LANG and LC_* from your host system.)
I believe stock Ubuntu defaults to UTF-8 locales and I think the generic boxes should too.