rexray / rexray

REX-Ray is a container storage orchestration engine enabling persistence for cloud native workloads
http://rexray.io
Apache License 2.0

Alpine Linux support #719

Closed lmakarov closed 7 years ago

lmakarov commented 7 years ago

Summary

Add support for Alpine Linux

New Feature

Docker for AWS (Docker for Mac and Windows) provisions nodes using Docker's Moby Linux (Alpine Linux). Adding support for Alpine will make it possible to use REX-Ray with the stock Docker VM images.

akutz commented 7 years ago

Hi @lmakarov,

Could you please provide additional information, such as:

Thank you.

kacole2 commented 7 years ago

@lmakarov check out this blog post from yesterday: Persistent storage in Docker for AWS is not a mystery

In addition, here is everything you need for Docker for AWS in the {code} Labs.

I have a very hacky way of building Alpine images today, but this is NOT what should be used going forward. Again, this was just a hack to prove it can be done: EBS and EFS Volumes with Docker For AWS using REX-Ray

clintkitson commented 7 years ago

It would be great to get some more details as @akutz described.

Another thing is that there is an active conversation in the #project-rexray channel in the community, where @lax77 is working through the creation of an Alpine-based container image for REX.

So there are really two things here.

1) Containerizing REX with Alpine as the base custom image
2) Ensuring REX itself is compatible with Alpine as a base OS when not running REX in the container.

I believe the containerization (#1) provides the insight as to what packages need to be installed in this case.

lmakarov commented 7 years ago

@kacole2

checkout this blog from yesterday

Wow, what great timing, thank you! I wish the blog post and the video had shown up in Google search results earlier, as I spent quite some time over the last 2 days trying to figure out how to use REX on Alpine directly or dockerize it (which I thought had been abandoned as an idea after this PR was closed: https://github.com/codedellemc/rexray/pull/347).

@akutz

How does REX-Ray behave today with Alpine Linux?

There are a number of issues. I figured out some of them, but not all. The gist (EBS and EFS Volumes with Docker For AWS using REX-Ray) from @kacole2 provides some hints.


UPDATE

See this comment from @akutz on how to install REX-Ray on Alpine without issues https://github.com/codedellemc/rexray/issues/719#issuecomment-279201077


Issue 1: curl complains about SSL certs

~ $ curl -sSL https://dl.bintray.com/emccode/rexray/install | sh -
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

Solution - run install script with INSECURE=1 to suppress curl cert verification

curl -sSL https://dl.bintray.com/emccode/rexray/install | INSECURE=1 sh

Apparently, this is a known issue as the workaround is baked into the install script itself.

Issue 2: rexray installed, but won't start

~ $ sudo rexray version
sudo: unable to execute /usr/bin/rexray: No such file or directory
~ $ ls -la /usr/bin/rexray 
-rwxr-xr-x    1 root     root      42459317 Jan 23 15:14 /usr/bin/rexray

Solution - install the necessary glibc libraries

curl -LO https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.23-r3/glibc-2.23-r3.apk && sudo apk add --allow-untrusted glibc-2.23-r3.apk

Now the binary is working

~ $ sudo rexray version
REX-Ray
-------
Binary: /usr/bin/rexray
Flavor: client+agent+controller
SemVer: 0.7.0
OsArch: Linux-x86_64
Branch: v0.7.0
Commit: a20a838ca70838a914b632637398824fcb10d0db
Formed: Mon, 23 Jan 2017 15:14:32 UTC

libStorage
----------
SemVer: 0.4.0
OsArch: Linux-x86_64
Branch: v0.7.0
Commit: a1103d3f215117f7b9f51dae2b24f852c9c54995
Formed: Mon, 23 Jan 2017 15:14:12 UTC
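
A side note on the "No such file or directory" error for a file that clearly exists: this typically means the kernel could not find the ELF interpreter (dynamic loader) that the binary requests, rather than the binary itself. Below is a minimal sketch for checking this, assuming `readelf` from binutils is available (it is not part of a minimal Alpine install). The helper names `parse_interp` and `report_interp` are made up for illustration and are not part of REX-Ray.

```shell
#!/bin/sh
# parse_interp: read `readelf -l` output on stdin and print the path of the
# requested program interpreter, if any. Hypothetical helper for illustration.
parse_interp() {
    sed -n 's/.*Requesting program interpreter: \([^]]*\)].*/\1/p'
}

# report_interp: given an interpreter path, report whether it exists.
report_interp() {
    if [ -e "$1" ]; then
        echo "interpreter present: $1"
    else
        echo "interpreter missing: $1"
    fi
}

# Usage (assumes binutils is installed, e.g. via `apk add binutils`):
#   interp=$(readelf -l /usr/bin/rexray | parse_interp)
#   report_interp "$interp"
```

If the reported interpreter is missing, the binary was built for a different C library than the one the system provides.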

Next step - adding a config file

~ $ cat /etc/rexray/config.json 
libstorage:
  service: ebs
ebs:
  accessKey: <aws-key>
  secretKey: <aws-secret>
  region: us-east-1

Issue 3: rexray service won't start

This may not be related to Alpine at all, but rather be a config or AWS permissions issue. In either case I'm currently stuck at this step.

~ $ sudo rexray start -f -l debug
...
ERRO[0000] error starting libStorage server              error.configKey=libstorage.server.services error.obj=<nil> time=1486766826704
ERRO[0000] default module(s) failed to initialize        error.configKey=libstorage.server.services error.obj=<nil> time=1486766826704
ERRO[0000] daemon failed to initialize                   error.configKey=libstorage.server.services error.obj=<nil> time=1486766826705
ERRO[0000] error starting rex-ray                        error.configKey=libstorage.server.services error.obj=<nil> time=1486766826705
...

akutz commented 7 years ago

Referencing what @clintonskitson said earlier, it's not so much multiple issues as it is a cumulative approach to the same problem. For example, whether REX-Ray is running in an Alpine-based container or inside Alpine Linux as a VM, the question is does the container use the minimal rootfs version of Alpine versus standard, vanilla, extended? Perhaps the VM uses the standard VM version which is similar to the Alpine standard version for containers. Even then, as a VM, in what mode is the OS running/installed? Diskless? Data? Sys?

It boils down to the following:

  1. Identifying the required dependencies to support the REX-Ray binary and its transitive dependencies on a given version of Alpine. The required dependency list will be greatest when the OS is Alpine Mini Root Filesystem, with the list reduced as the Alpine version edges closer to a normal OS in terms of baked-in packages.
  2. Ensuring that the libStorage and REX-Ray data directories all exist on writable media. This is quite easily achieved by setting the LIBSTORAGE_HOME and REXRAY_HOME environment variables to point to paths on a filesystem mounted as rw.
    • The REX-Ray binary does not have to exist on a filesystem mounted as rw if, for example, the binary is packaged as part of some container image where the container's binaries are located on a read-only filesystem for purposes of security or stability.

And that's pretty much it. There may be other niggling details such as external dependencies of storage drivers. For example, the S3FS storage driver requires the s3fs binary, and the ScaleIO storage driver requires the SDC toolkit. However, these issues are outside the domain of the core piece of the question. While they are issues that must be solved in order for the associated storage drivers to function, the scope of this issue should be restricted specifically to 1) identifying the required dependency list for supported Alpine distributions and 2) ensuring that REX-Ray and libStorage data directories are configured to exist on a filesystem mounted as rw.
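
The second point can be sketched roughly as follows: verify that a candidate data directory really accepts writes before pointing REXRAY_HOME and LIBSTORAGE_HOME at it. The helper name `ensure_writable_dir` and the example path are assumptions for illustration only.

```shell
#!/bin/sh
# ensure_writable_dir: create the directory if needed and verify that a file
# can actually be created in it. Hypothetical helper, not part of REX-Ray.
ensure_writable_dir() {
    dir=$1
    mkdir -p "$dir" 2>/dev/null || return 1
    probe="$dir/.write-probe.$$"
    # The subshell keeps a failed redirection from aborting the calling shell.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Usage: /var/opt/rexray is only an example path; pick any rw filesystem.
#   if ensure_writable_dir /var/opt/rexray; then
#       export REXRAY_HOME=/var/opt/rexray LIBSTORAGE_HOME=/var/opt/rexray
#   fi
```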

akutz commented 7 years ago

For example, I got REX-Ray running on Alpine Standard with nearly zero hassle. I used the following steps:

# install curl
$ apk add curl

# create the REX-Ray directory on a persistent filesystem
$ mkdir -p /var/opt/rexray/etc/rexray && cd /var/opt/rexray

# download REX-Ray
$ curl -sSLOk https://dl.bintray.com/emccode/rexray/stable/0.7.0/rexray-Linux-x86_64-0.7.0.tar.gz

# decompress REX-Ray and remove the tarball
$ tar xzf rexray-Linux-x86_64-0.7.0.tar.gz && rm rexray-*.tar.gz

# create the lib64 symlink needed by Go binaries
$ mkdir /lib64 && ln -s /lib/libc.musl-x86_64.so.1 /lib64/ld-linux-x86-64.so.2

# create a basic REX-Ray config file
$ cat << EOF > /var/opt/rexray/etc/rexray/rexray.yml
libstorage:
  logging:
    level: debug
  service: vfs
EOF

# start REX-Ray, with a custom data root, as a foreground service
$ REXRAY_HOME=/var/opt/rexray /var/opt/rexray/bin/rexray start -f

akutz commented 7 years ago

Hi @lmakarov,

FWIW, the issue you're facing, I believe, is that you named your config file config.json when it should be config.yml or rexray.yml :)
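
A small pre-flight check along these lines could catch that kind of naming mistake early. This is only a sketch: the helper name `check_config_name` is hypothetical, and the two accepted file names are simply the ones mentioned in this thread.

```shell
#!/bin/sh
# check_config_name: succeed only if the file name is one of the names
# mentioned in this thread (config.yml or rexray.yml). Hypothetical helper.
check_config_name() {
    case "$(basename "$1")" in
        config.yml|rexray.yml) return 0 ;;
        *) echo "unexpected config file name: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   check_config_name /etc/rexray/config.json   # fails and prints a warning
#   check_config_name /etc/rexray/config.yml    # succeeds
```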

akutz commented 7 years ago

Hi All,

FYI, there is an error running REX-Ray as a plug-in with Alpine as a base, caused by TLS certificate verification failing because no system root certificates are found:

ERRO[0050] s3 connection failed                          error=RequestError: send request failed
caused by: Get https://s3.amazonaws.com/: x509: failed to load system roots and no roots provided region=us-east-1 server=flax-dog-mh service=s3fs storageDriver=s3fs time=1486886076731
ERRO[0050] error starting libStorage server              error=RequestError: send request failed
caused by: Get https://s3.amazonaws.com/: x509: failed to load system roots and no roots provided time=1486886076731
ERRO[0050] default module(s) failed to initialize        error=RequestError: send request failed
caused by: Get https://s3.amazonaws.com/: x509: failed to load system roots and no roots provided time=1486886076731
ERRO[0050] daemon failed to initialize                   error=RequestError: send request failed
caused by: Get https://s3.amazonaws.com/: x509: failed to load system roots and no roots provided time=1486886076731
ERRO[0050] error starting rex-ray                        error=RequestError: send request failed
caused by: Get https://s3.amazonaws.com/: x509: failed to load system roots and no roots provided time=1486886076731
DEBU[0050] completed cli execution                       time=1486886076731
INFO[0050] exiting process                               time=1486886076731
DEBU[0050] completed onExit at end of program            time=1486886076731

I will fix this in an upcoming PR when I switch to Alpine as the basis for REX-Ray Docker plug-in images.
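
In the meantime, the "failed to load system roots" message suggests that no CA bundle is present in the image. The following is a minimal sketch for checking that, assuming the candidate paths below (common distribution layouts, not something REX-Ray defines). The helper name `find_ca_bundle` is hypothetical.

```shell
#!/bin/sh
# find_ca_bundle: print the first non-empty CA bundle found under ROOT
# (default: the real filesystem root). The candidate list is an assumption
# based on common distribution layouts.
find_ca_bundle() {
    root=${1:-}
    for f in \
        /etc/ssl/certs/ca-certificates.crt \
        /etc/ssl/cert.pem \
        /etc/pki/tls/certs/ca-bundle.crt
    do
        if [ -s "$root$f" ]; then
            echo "$root$f"
            return 0
        fi
    done
    return 1
}

# Usage inside the image or VM:
#   find_ca_bundle || echo "no CA bundle found; try: apk add ca-certificates"
```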

lmakarov commented 7 years ago

@akutz

the issue you're facing, I believe, is that you named your config file config.json when it should be config.yml or rexray.yml

yep, the problem was in the file extension typo. Thanks!

Identifying the required dependencies to support the REX-Ray binary and its transitive dependencies on a given version of Alpine

For example, I got REX-Ray running on Alpine Standard with nearly zero hassle

I'm sure this was not a big deal for you as the maintainer of the project ;) However, sometimes walls are not that transparent for the rest of us (e.g. knowing which dependencies are required and are missing on a particular Alpine flavor). Looks like official Alpine support might be a matter of setup instructions and documentation.

And on the other side - running REX-Ray as a docker plugin in a container would eliminate all the hassle (when using it with Docker).

akutz commented 7 years ago

Hi @lmakarov,

I'm sure this was not a big deal for you as the maintainer of the project ;)

Actually I'd never heard of or used Alpine prior to yesterday. I simply adopted an organized approach, starting with Googling "go not found alpine". The first result illuminated the problem and illustrated the solution.

FWIW, as I showed @cduchesne yesterday, it's also worthwhile being aware of the following two utilities:

ldd

The ldd command shows the shared libraries on which a Linux binary or library depends, as well as what is missing:

~/scaleio-rhel7-sdc/bin/emc/scaleio # ldd drv_cfg 
    /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libuuid.so.1 => /lib/libuuid.so.1 (0x7ff9f655c000)
    libpthread.so.0 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    librt.so.1 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libm.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libdl.so.2 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libaio.so.1 => /usr/lib/libaio.so.1 (0x7ff9f635a000)
    libnuma.so.1 => /usr/lib/libnuma.so.1 (0x7ff9f6150000)
    libc.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
Error relocating drv_cfg: backtrace_symbols: symbol not found
Error relocating drv_cfg: pthread_attr_setaffinity_np: symbol not found
Error relocating drv_cfg: __strtok_r: symbol not found
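
When the output is long, the unresolved symbols can be pulled out with a small filter. This is a sketch that assumes the musl message format shown above. The helper name `missing_symbols` is made up for illustration.

```shell
#!/bin/sh
# missing_symbols: read ldd (or loader) output on stdin and print each
# unresolved symbol once. Matches lines of the form:
#   Error relocating <file>: <symbol>: symbol not found
missing_symbols() {
    sed -n 's/^Error relocating [^:]*: \([^:]*\): symbol not found$/\1/p' |
        LC_ALL=C sort -u
}

# Usage (the messages may go to stderr, hence 2>&1):
#   ldd drv_cfg 2>&1 | missing_symbols
```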

strace

The strace command will provide a real-time report of the syscalls attempted by a binary as it is being executed:

~/scaleio-rhel7-sdc/bin/emc/scaleio # strace ./drv_cfg 
execve("./drv_cfg", ["./drv_cfg"], [/* 7 vars */]) = 0
arch_prctl(ARCH_SET_FS, 0x7fb32cc24b48) = 0
set_tid_address(0x7fb32cc24b80)         = 99
open("/etc/ld-musl-x86_64.path", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib/libuuid.so.1", O_RDONLY|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
fstat(3, {st_mode=S_IFREG|0755, st_size=14136, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\21\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 2113536, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x7fb32c795000
mmap(0x7fb32c997000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x2000) = 0x7fb32c997000
close(3)                                = 0
open("/lib/libaio.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/local/lib/libaio.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib/libaio.so.1", O_RDONLY|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
fstat(3, {st_mode=S_IFREG|0755, st_size=5368, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\5\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 2105344, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x7fb32c593000
mmap(0x7fb32c793000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x7fb32c793000
mmap(0x7fb32c794000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fb32c794000
close(3)                                = 0
open("/lib/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/local/lib/libnuma.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib/libnuma.so.1", O_RDONLY|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
fstat(3, {st_mode=S_IFREG|0755, st_size=39144, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300.\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 2138112, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x7fb32c389000
mmap(0x7fb32c591000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x8000) = 0x7fb32c591000
close(3)                                = 0
mprotect(0x7fb32c997000, 4096, PROT_READ) = 0
mprotect(0x7fb32cc21000, 4096, PROT_READ) = 0
mprotect(0x7fb32c793000, 4096, PROT_READ) = 0
mprotect(0x7fb32c591000, 4096, PROT_READ) = 0
writev(2, [{iov_base="Error relocating ./drv_cfg: back"..., iov_len=63}, {iov_base=NULL, iov_len=0}], 2Error relocating ./drv_cfg: backtrace_symbols: symbol not found) = 63
writev(2, [{iov_base="\n", iov_len=1}, {iov_base=NULL, iov_len=0}], 2
) = 1
writev(2, [{iov_base="Error relocating ./drv_cfg: pthr"..., iov_len=73}, {iov_base=NULL, iov_len=0}], 2Error relocating ./drv_cfg: pthread_attr_setaffinity_np: symbol not found) = 73
writev(2, [{iov_base="\n", iov_len=1}, {iov_base=NULL, iov_len=0}], 2
) = 1
writev(2, [{iov_base="Error relocating ./drv_cfg: __st"..., iov_len=56}, {iov_base=NULL, iov_len=0}], 2Error relocating ./drv_cfg: __strtok_r: symbol not found) = 56
writev(2, [{iov_base="\n", iov_len=1}, {iov_base=NULL, iov_len=0}], 2
) = 1
mprotect(0x706000, 4096, PROT_READ)     = 0
arch_prctl(ARCH_SET_FS, 0x7fb32c592820) = 0
set_tid_address(0x7fb32c592858)         = 99
exit_group(127)                         = ?
+++ exited with 127 +++

My ability to figure out the issue had zero to do with me being the project lead. It was taking a step back, Googling, and availing myself of tools commonly used when debugging these types of issues.

akutz commented 7 years ago

Hi @lmakarov,

At the same time, I really appreciate you calling attention to this issue and making it a priority for the project. My goal above was to show that everyone has access to the same tools I used to solve this problem. Even if not this problem, hopefully you and others are able to use the aforementioned tools to solve similar issues in the future :)

lmakarov commented 7 years ago

@akutz thanks for pointing to ldd and strace. I'm sure I'll find a use for them soon.

I also run an open source project and stopped wondering why wouldn't everyone "just do some googling" and be able to solve their own issues. Everyone's problem solving and googling powers are still a function of their experience and time spent in a given field.

To sum it up - REX-Ray does work on Alpine (following your instructions) and it also does work as a Docker plugin with Docker for AWS (using the template and instructions from @kacole2's blog post)

Thanks for the assistance! Feel free to close this issue unless you want to use it for further tracking.

akutz commented 7 years ago

Hi @lmakarov,

stopped wondering why wouldn't everyone "just do some googling" and be able to solve their own issues.

You make a very valid and fair point here. I've been working professionally in this field for over 15 years, and I wasn't judging you or anyone else. I'm very aware that a person's own experience helps them solve issues.

I'm sure this was not a big deal for you as the maintainer of the project ;)

That was your response to my remark about getting things running with nearly zero hassle. I do apologize, as I believe my remark was taken out of context. I spent the time between posting my synopsis of the problem and posting how I solved it doing the things we've discussed: googling, using the aforementioned tools, etc. The phrase "with nearly zero hassle" was more a way of expressing joy or enthusiasm that things worked.

My intent was not to indicate that this was easy or that anyone was less than capable for not solving it themselves. I merely meant to illustrate that when I followed my own, suggested approach to solving the problem I was able to do so fairly painlessly. If anything I was trying to highlight that the approach I suggested did in fact work.

Anyway, I really wasn't trying to impugn you or anyone else. Thank you again.

akutz commented 7 years ago

Hi @lmakarov,

FYI, I recommend against installing the glibc libraries in Alpine per the above example:

Solution - install the necessary glibc libraries

curl -LO https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.23-r3/glibc-2.23-r3.apk && sudo apk add --allow-untrusted glibc-2.23-r3.apk

The reason is that the glibc libraries referenced, while they do work, are just core glibc. While this will solve the inability to launch a Golang binary, it will also present new issues.

Here is what happens when drv_cfg is executed using the musl symlinked as glibc approach:

# ./drv_cfg 
Error relocating ./drv_cfg: backtrace_symbols: symbol not found
Error relocating ./drv_cfg: pthread_attr_setaffinity_np: symbol not found
Error relocating ./drv_cfg: __strtok_r: symbol not found

The ldd output for drv_cfg verifies that the following symbols are in fact missing:

~/scaleio-rhel7-sdc/bin/emc/scaleio # ldd drv_cfg 
    /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libuuid.so.1 => /lib/libuuid.so.1 (0x7ff9f655c000)
    libpthread.so.0 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    librt.so.1 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libm.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libdl.so.2 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libaio.so.1 => /usr/lib/libaio.so.1 (0x7ff9f635a000)
    libnuma.so.1 => /usr/lib/libnuma.so.1 (0x7ff9f6150000)
    libc.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
Error relocating drv_cfg: backtrace_symbols: symbol not found
Error relocating drv_cfg: pthread_attr_setaffinity_np: symbol not found
Error relocating drv_cfg: __strtok_r: symbol not found

Now, instead of using the musl for glibc approach, let's install the glibc libraries and use ldd:

~/scaleio-rhel7-sdc/bin/emc/scaleio # ldd drv_cfg 
    /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libuuid.so.1 => /lib/libuuid.so.1 (0x7ff9f655c000)
    libpthread.so.0 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    librt.so.1 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libm.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libdl.so.2 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
    libaio.so.1 => /usr/lib/libaio.so.1 (0x7ff9f635a000)
    libnuma.so.1 => /usr/lib/libnuma.so.1 (0x7ff9f6150000)
    libc.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7ff9f6760000)
Error relocating drv_cfg: backtrace_symbols: symbol not found
Error relocating drv_cfg: pthread_attr_setaffinity_np: symbol not found
Error relocating drv_cfg: __strtok_r: symbol not found

The above output seems to be the same, right? But the behavior is not. Watch what happens when drv_cfg is executed using glibc:

# ./drv_cfg 
./drv_cfg: error while loading shared libraries: libuuid.so.1: cannot open shared object file: No such file or directory

That's a different error! It's complaining about the inability to locate libuuid.so.1, right? Except the above ldd output shows that libuuid.so.1 is not missing. What gives? This is where strace is helpful:

# strace ./drv_cfg 
execve("./drv_cfg", ["./drv_cfg"], [/* 7 vars */]) = 0
brk(NULL)                               = 0x257f000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f09ad8cc000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/usr/glibc-compat/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=3808, ...}) = 0
mmap(NULL, 3808, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f09ad8cb000
close(3)                                = 0
open("/usr/glibc-compat/lib/tls/x86_64/libuuid.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/glibc-compat/lib/tls/x86_64", 0x7ffc85701d90) = -1 ENOENT (No such file or directory)
open("/usr/glibc-compat/lib/tls/libuuid.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/glibc-compat/lib/tls", 0x7ffc85701d90) = -1 ENOENT (No such file or directory)
open("/usr/glibc-compat/lib/x86_64/libuuid.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/glibc-compat/lib/x86_64", 0x7ffc85701d90) = -1 ENOENT (No such file or directory)
open("/usr/glibc-compat/lib/libuuid.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/glibc-compat/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
writev(2, [{iov_base="./drv_cfg", iov_len=9}, {iov_base=": ", iov_len=2}, {iov_base="error while loading shared libra"..., iov_len=36}, {iov_base=": ", iov_len=2}, {iov_base="libuuid.so.1", iov_len=12}, {iov_base=": ", iov_len=2}, {iov_base="cannot open shared object file", iov_len=30}, {iov_base=": ", iov_len=2}, {iov_base="No such file or directory", iov_len=25}, {iov_base="\n", iov_len=1}], 10./drv_cfg: error while loading shared libraries: libuuid.so.1: cannot open shared object file: No such file or directory
) = 121
exit_group(127)                         = ?
+++ exited with 127 +++

The following lines from the above output illustrate the problem:

open("/usr/glibc-compat/lib/tls/x86_64/libuuid.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/glibc-compat/lib/tls/x86_64", 0x7ffc85701d90) = -1 ENOENT (No such file or directory)
open("/usr/glibc-compat/lib/tls/libuuid.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/glibc-compat/lib/tls", 0x7ffc85701d90) = -1 ENOENT (No such file or directory)
open("/usr/glibc-compat/lib/x86_64/libuuid.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/glibc-compat/lib/x86_64", 0x7ffc85701d90) = -1 ENOENT (No such file or directory)
open("/usr/glibc-compat/lib/libuuid.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

Because the program is using the glibc shared library ld-linux-x86-64.so.2, the program is also attempting to use the dependencies the shared library expects. That means while the ldd command may show that libuuid.so.1 is present at /lib/libuuid.so.1, that's a version of libuuid linked against musl, not glibc. When running the program with glibc, the strace output shows that the program is attempting to use a version of libuuid linked against glibc, and there isn't one.
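
To spot this kind of failed library lookup quickly in a long trace, the strace output can be filtered. This is a rough sketch that assumes the open()/openat() line format shown above. The helper name `failed_lib_opens` is hypothetical.

```shell
#!/bin/sh
# failed_lib_opens: read strace output on stdin and print the paths of shared
# objects whose open()/openat() call failed with ENOENT.
failed_lib_opens() {
    grep -E '^open(at)?\(' | grep 'ENOENT' |
        sed -n 's/^[^"]*"\([^"]*\.so[^"]*\)".*/\1/p'
}

# Usage (strace writes its trace to stderr, hence 2>&1):
#   strace ./drv_cfg 2>&1 | failed_lib_opens
```

Every directory printed is one the loader searched without finding the library, which shows where a glibc-linked copy would have to live.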

I just wanted to point this out to show that it's generally preferable to patch musl or build the required dependencies against musl instead of using glibc, since the latter also means that a program's other direct and transitive dependencies have to be linked to glibc as well.

FWIW, I already patched musl to support the missing symbols pthread_attr_setaffinity_np and __strtok_r:

diff --git a/include/pthread.h b/include/pthread.h
index bba9587..6903327 100644
--- a/include/pthread.h
+++ b/include/pthread.h
@@ -143,6 +143,7 @@ int pthread_setspecific(pthread_key_t, const void *);
 int pthread_attr_init(pthread_attr_t *);
 int pthread_attr_destroy(pthread_attr_t *);

+int pthread_attr_setaffinity_np(pthread_attr_t *attr, size_t cpusetsize, const cpu_set_t *cpuset);
 int pthread_attr_getguardsize(const pthread_attr_t *__restrict, size_t *__restrict);
 int pthread_attr_setguardsize(pthread_attr_t *, size_t);
 int pthread_attr_getstacksize(const pthread_attr_t *__restrict, size_t *__restrict);
diff --git a/include/string.h b/include/string.h
index ff9badb..c17e134 100644
--- a/include/string.h
+++ b/include/string.h
@@ -61,6 +61,7 @@ char *strerror (int);
  || defined(_XOPEN_SOURCE) || defined(_GNU_SOURCE) \
  || defined(_BSD_SOURCE)
 char *strtok_r (char *__restrict, const char *__restrict, char **__restrict);
+#define __strtok_r(s,sep,p) strtok_r(s,sep,p)
 int strerror_r (int, char *, size_t);
 char *stpcpy(char *__restrict, const char *__restrict);
 char *stpncpy(char *__restrict, const char *__restrict, size_t);
diff --git a/src/thread/pthread_attr_set.c b/src/thread/pthread_attr_set.c
new file mode 100644
index 0000000..d236d12
--- /dev/null
+++ b/src/thread/pthread_attr_set.c
@@ -0,0 +1,6 @@
+#include "pthread_impl.h"
+
+int pthread_attr_setaffinity_np(pthread_attr_t *attr, size_t cpusetsize, const cpu_set_t *cpuset)
+{
+       return 0;
+}

I am working on libbacktrace, but that library is not small, and I'm still having some difficulty porting it to link to musl.

At the same time, I'm also building glibc from source using a fork of the Docker image with which the Alpine glibc install is generated, https://github.com/akutz/docker-glibc-builder. I'm using this to patch in support for building libuuid, libaio, and libnuma. It's taking a while due to their transitive dependencies. Ultimately I may just tell @cduchesne to use CentOS-minimal for ScaleIO's base image, as without the SIO sources to recompile, it's a real PITA to work around the dynamic linking.

I hope this helps to show why glibc may not be the best approach on Alpine.

akutz commented 7 years ago

And at the same time, I just managed to get ScaleIO's drv_cfg working on Alpine using a combination of glibc and musl, so don't listen to me, I'm full of crap apparently :)

akutz commented 7 years ago

Hi @lmakarov,

FWIW, you, @cduchesne, @codenrhoden, and others should be aware that a Docker container must be launched with --privileged in order to use strace. Even SYS_PTRACE isn't enough of an entitlement.

pascalandy commented 6 years ago

Hi @akutz,

I ran your instructions above on play-with-docker and I can't start REX-Ray.

start REX-Ray, with a custom data root, as a foreground service

REXRAY_HOME=/var/opt/rexray /var/opt/rexray/bin/rexray start -f;

See my gist here: https://gist.github.com/pascalandy/4d1b726762d15603eec05d9ee733d57c