TritonDataCenter / illumos-joyent

Community developed and maintained version of the OS/Net consolidation
http://www.illumos.org/projects/illumos-gate

Network Utilities not behaving properly in LX #167

Open Smithx10 opened 6 years ago

Smithx10 commented 6 years ago

The following commands seem to behave strangely on LX: nmap, nslookup, ping, and iperf3.

I decided to test them all on the following lx branded zone images.

Platform:

[root@smartos01 /zones/smith]# uname -a
SunOS smartos01 5.11 joyent_20180329T002644Z i86pc i386 i86pc

Images:

[root@smartos01 /zones/smith]# imgadm list
UUID                                  NAME          VERSION   OS     TYPE        PUB
19aa3328-0025-11e7-a19a-c39077bfd4cf  alpine-3      20170303  linux  lx-dataset  2017-03-03
7b5981c4-1889-11e7-b4c5-3f3bdfc9b88b  ubuntu-16.04  20170403  linux  lx-dataset  2017-04-03
3dbbdcca-2eab-11e8-b925-23bf77789921  centos-7      20180323  linux  lx-dataset  2018-03-23
d3a765d6-381c-11e8-a2f0-8b4f8f1f2cca  debian-9      20180404  linux  lx-dataset  2018-04-04

VMs running those images:

[root@smartos01 /zones/smith]# vmadm list
UUID                                  TYPE  RAM      STATE             ALIAS
83cdfbec-deb1-c2a9-b717-88d09f053328  LX    2048     running           alpine
b38a0cc7-83ba-636e-9dc2-c526ba6569f6  LX    2048     running           centos
c149701e-2df6-699c-ee55-bc690b53211f  LX    2048     running           debian
e655f498-8aa1-6fad-e5b0-9bd3b54fcdaa  LX    2048     running           ubuntu

The Debian Behavior:

Debian image : d3a765d6-381c-11e8-a2f0-8b4f8f1f2cca
root@c149701e-2df6-699c-ee55-bc690b53211f:~# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

root@c149701e-2df6-699c-ee55-bc690b53211f:~# nslookup google.com
../../../../lib/isc/unix/socket.c:2881: setsockopt(20, IP_RECVTOS) failed: Protocol not available
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   google.com
Address: 172.217.13.238

root@c149701e-2df6-699c-ee55-bc690b53211f:~# nmap -p 3000 google.com -Pn

Starting Nmap 7.40 ( https://nmap.org ) at 2018-04-05 19:03 UTC
dnet: Failed to open device eth0
QUITTING!
root@c149701e-2df6-699c-ee55-bc690b53211f:~# ping google.com
connect: Network is unreachable
root@c149701e-2df6-699c-ee55-bc690b53211f:~# ping -4 google.com
ping: socket: Protocol not supported

root@c149701e-2df6-699c-ee55-bc690b53211f:~# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.227.202, port 65351
[  5] local 192.168.227.200 port 5201 connected to 192.168.227.202 port 53935
iperf3: getsockopt - Protocol not available
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   581 MBytes  4.87 Gbits/sec
iperf3: getsockopt - Protocol not available
[  5]   1.00-2.00   sec   508 MBytes  4.26 Gbits/sec

root@c149701e-2df6-699c-ee55-bc690b53211f:~# iperf3 -c 192.168.227.202
Connecting to host 192.168.227.202, port 5201
[  4] local 192.168.227.200 port 53745 connected to 192.168.227.202 port 5201
iperf3: getsockopt - Protocol not available

The Ubuntu Behavior:

root@e655f498-8aa1-6fad-e5b0-9bd3b54fcdaa:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.2 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.2 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

root@e655f498-8aa1-6fad-e5b0-9bd3b54fcdaa:~# nslookup google.com
../../../../lib/isc/unix/socket.c:2881: setsockopt(20, IP_RECVTOS) failed: Protocol not available
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   google.com
Address: 172.217.15.78

root@e655f498-8aa1-6fad-e5b0-9bd3b54fcdaa:~# nmap -p 3000 google.com -Pn

Starting Nmap 7.01 ( https://nmap.org ) at 2018-04-05 19:32 UTC
dnet: Failed to open device eth0
QUITTING!

root@e655f498-8aa1-6fad-e5b0-9bd3b54fcdaa:~# ping google.com
PING google.com (172.217.7.174) 56(84) bytes of data.
64 bytes from iad30s09-in-f174.1e100.net (172.217.7.174): icmp_seq=1 ttl=128 time=9.75 ms
64 bytes from iad30s09-in-f174.1e100.net (172.217.7.174): icmp_seq=2 ttl=128 time=8.83 ms

root@e655f498-8aa1-6fad-e5b0-9bd3b54fcdaa:~# iperf3 -c 192.168.227.200
Connecting to host 192.168.227.200, port 5201
[  4] local 192.168.227.202 port 64358 connected to 192.168.227.200 port 5201
^Ciperf3: getsockopt - Protocol not available
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-0.56   sec   348 MBytes  5.24 Gbits/sec    0   0.00 Bytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-0.56   sec   348 MBytes  5.24 Gbits/sec    0             sender
[  4]   0.00-0.56   sec  0.00 Bytes  0.00 bits/sec                  receiver
iperf3: interrupt - the client has terminated
root@e655f498-8aa1-6fad-e5b0-9bd3b54fcdaa:~# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated
root@e655f498-8aa1-6fad-e5b0-9bd3b54fcdaa:~# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.227.200, port 34870
[  5] local 192.168.227.202 port 5201 connected to 192.168.227.200 port 53745
iperf3: getsockopt - Protocol not available
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   644 MBytes  5.41 Gbits/sec

The CentOS Behavior:

[root@b38a0cc7-83ba-636e-9dc2-c526ba6569f6 ~]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

[root@b38a0cc7-83ba-636e-9dc2-c526ba6569f6 ~]# nslookup google.com
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   google.com
Address: 172.217.13.238

[root@b38a0cc7-83ba-636e-9dc2-c526ba6569f6 ~]# ping google.com
PING google.com (172.217.7.174) 56(84) bytes of data.
64 bytes from iad30s09-in-f14.1e100.net (172.217.7.174): icmp_seq=1 ttl=128 time=10.2 ms
^C
--- google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 10.217/10.217/10.217/0.000 ms
[root@b38a0cc7-83ba-636e-9dc2-c526ba6569f6 ~]# nmap -p 3000 google.com -Pn

Starting Nmap 6.40 ( http://nmap.org ) at 2018-04-05 19:39 UTC
dnet: Failed to open device eth0
QUITTING!

[root@b38a0cc7-83ba-636e-9dc2-c526ba6569f6 ~]# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.227.202, port 42752
iperf3: error - unable to set TCP_CONGESTION: Supplied congestion control algorithm not supported on this host
iperf3: error - unable to start listener for connections: Address already in use
iperf3: exiting
[root@b38a0cc7-83ba-636e-9dc2-c526ba6569f6 ~]# iperf3 -c 192.168.227.202
Connecting to host 192.168.227.202, port 5201
iperf3: error - unable to set TCP_CONGESTION: Supplied congestion control algorithm not supported on this host

The Alpine Behavior:

83cdfbec-deb1-c2a9-b717-88d09f053328:~# cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.5.2
PRETTY_NAME="Alpine Linux v3.5"
HOME_URL="http://alpinelinux.org"
BUG_REPORT_URL="http://bugs.alpinelinux.org"
83cdfbec-deb1-c2a9-b717-88d09f053328:~# nslookup google.com
nslookup: can't resolve '(null)': Name does not resolve

Name:      google.com
Address 1: 172.217.13.238 iad23s61-in-f14.1e100.net
Address 2: 2607:f8b0:4004:810::200e iad23s63-in-x0e.1e100.net
83cdfbec-deb1-c2a9-b717-88d09f053328:~# ping google.com
PING google.com (172.217.7.174): 56 data bytes
64 bytes from 172.217.7.174: seq=0 ttl=128 time=8.527 ms
64 bytes from 172.217.7.174: seq=1 ttl=128 time=9.087 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 8.527/8.807/9.087 ms
83cdfbec-deb1-c2a9-b717-88d09f053328:~# nmap -p 3000 google.com -Pn

Starting Nmap 7.40 ( https://nmap.org ) at 2018-04-05 19:42 UTC
dnet: Failed to open device eth0
QUITTING!

83cdfbec-deb1-c2a9-b717-88d09f053328:~# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.227.202, port 44933
[  5] local 192.168.227.205 port 5201 connected to 192.168.227.202 port 34527
iperf3: getsockopt - Protocol not available
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   275 MBytes  2.31 Gbits/sec
iperf3: getsockopt - Protocol not available
[  5]   1.00-2.00   sec   248 MBytes  2.08 Gbits/sec
[  5]   1.00-2.00   sec   248 MBytes  2.08 Gbits/sec

83cdfbec-deb1-c2a9-b717-88d09f053328:~# iperf3 -c 192.168.227.202
Connecting to host 192.168.227.202, port 5201
[  4] local 192.168.227.205 port 58287 connected to 192.168.227.202 port 5201
iperf3: getsockopt - Protocol not available
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   289 MBytes  2.42 Gbits/sec  32767   -65011712.00 Bytes
iperf3: getsockopt - Protocol not available

chrisjmccrum commented 6 years ago

Verified that I'm getting the same behavior on Ubuntu LX.

ovirtual commented 6 years ago

Same behavior on my Ubuntu and Debian LX

teutat3s commented 6 years ago

Partly related to https://github.com/iputils/iputils/issues/129

Smithx10 commented 6 years ago

@Teutone I saw there is a PR that was opened on August 16. Is there anything stopping it from getting merged?

teutat3s commented 6 years ago

I don't think so. It's just the normal merge process, and I don't know how long it takes them. I tried the changes in lx-brand instances (a self-built ping from iputils with the mentioned fix) and there were no issues. It works as expected.

axisofentropy commented 5 years ago

I'm also seeing this kind of error. socket.c setsockopt(20, IP_RECVTOS) failed: Protocol not available

rzezeski commented 5 years ago

There are various issues here. E.g., we have no support for the IP_RECVTOS option in the native kernel, but adding support for it probably wouldn't be too bad. As for the other issues: I'll have to fire up some lx zones and look more closely. In any event I'll assign this to myself and see if I can't at least file proper issues away and maybe even fix some of them.
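
For anyone who wants to check whether a given platform image has picked up IP_RECVTOS support, here is a minimal Python sketch of the same call that fails in the nslookup output above. The helper name `try_enable_recvtos` is made up for illustration, and the fallback value 13 is the Linux definition of the option, assumed only for interpreters that do not export the constant.

```python
import errno
import socket

# Python 3.10+ exports IP_RECVTOS where the platform defines it. The fallback
# value 13 is the Linux definition and is an assumption valid for Linux/LX only.
IP_RECVTOS = getattr(socket, "IP_RECVTOS", 13)


def try_enable_recvtos(sock):
    """Try to enable IP_RECVTOS; return True on success, False if unsupported."""
    try:
        sock.setsockopt(socket.IPPROTO_IP, IP_RECVTOS, 1)
        return True
    except OSError as exc:
        # ENOPROTOOPT is the errno that Linux renders as "Protocol not available".
        if exc.errno == errno.ENOPROTOOPT:
            return False
        raise


if __name__ == "__main__":
    # A UDP socket is enough; nothing is sent on the network.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        print("IP_RECVTOS supported:", try_enable_recvtos(s))
```

Run inside the zone, this reports whether the option is accepted without needing BIND or any DNS traffic.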

Smithx10 commented 5 years ago

Quick update:

iperf3 has stopped working for me in CentOS LX.

[root@7abb8f3f-697d-4846-f348-c88cf605257f ~]# iperf3 -c xx.xx.xx.xx
Connecting to host xx.xx.xx.xx, port 5201
iperf3: error - unable to set TCP_CONGESTION: Supplied congestion control algorithm not supported on this host

On platform SunOS ac-1f-6b-a5-af-5a 5.11 joyent_20190718T005708Z i86pc i386 i86pc

Using this image: 3dbbdcca-2eab-11e8-b925-23bf77789921

danmcd commented 5 years ago

@Smithx10 - did you recompile iperf3 or update it from pkgsrc/other-upstream? I've seen (mis)behavior like this when a program assumes existence of X implies existence of Y. In your case Y == TCP_CONGESTION, but it's not at all clear what X is.

Smithx10 commented 5 years ago

@danmcd

I used

yum install -y iperf3

in CentOS LX, image 3dbbdcca-2eab-11e8-b925-23bf77789921.

danmcd commented 5 years ago

Okay, so TCP_CONGESTION did show up very recently. I'm curious about what parameter was passed for the LX socket option? Maybe Linux does that differently than we do (we use a "struct cc_algo") and we have to account for that?

melloc commented 5 years ago

I just integrated OS-7427 which adds support for TCP_CONGESTION and several other congestion control interfaces, so iperf should be able to run on the release later this week. (There are still some issues with getsockopt(TCP_INFO); OS-4525 covers that.)
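
To confirm the TCP_CONGESTION behavior on a given platform image without installing iperf3, a small Python sketch along these lines may help. On Linux the option value is the algorithm name as a short string, which is what the code assumes; the helper names are hypothetical, and the 16-byte buffer matches the Linux TCP_CA_NAME_MAX limit.

```python
import socket

# socket.TCP_CONGESTION is exported by Python on Linux. If the interpreter
# does not provide it, the helpers below simply report "unavailable".
TCP_CONGESTION = getattr(socket, "TCP_CONGESTION", None)


def get_congestion_algorithm(sock):
    """Return the congestion control algorithm name, or None if unavailable."""
    if TCP_CONGESTION is None:
        return None
    try:
        # Linux limits algorithm names to 16 bytes (TCP_CA_NAME_MAX).
        raw = sock.getsockopt(socket.IPPROTO_TCP, TCP_CONGESTION, 16)
    except OSError:
        return None
    # The kernel returns a NUL-padded string; keep the part before the first NUL.
    return raw.split(b"\x00", 1)[0].decode("ascii", "replace")


def set_congestion_algorithm(sock, name):
    """Try to select an algorithm by name; return True on success."""
    if TCP_CONGESTION is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_CONGESTION, name.encode("ascii"))
        return True
    except OSError:
        return False


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("current algorithm:", get_congestion_algorithm(s))
```

A result of None from `get_congestion_algorithm` on an LX zone would suggest the platform predates the congestion control support mentioned above.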

blackwood821 commented 4 years ago

I receive the following error as well when attempting to start the bind service on LX CentOS 7 (3dbbdcca-2eab-11e8-b925-23bf77789921):

setsockopt(519, IP_RECVTOS) failed: Protocol not available

danmcd commented 4 years ago

> I receive the following error as well when attempting to start the bind service on LX CentOS 7 (3dbbdcca-2eab-11e8-b925-23bf77789921):
>
> setsockopt(519, IP_RECVTOS) failed: Protocol not available

Sounds like a new issue. Please file and link here?

blackwood821 commented 4 years ago

> Sounds like a new issue. Please file and link here?

Sure. https://github.com/joyent/illumos-joyent/issues/330

Adel-Magebinary commented 2 years ago

This is also happening to php-fpm at line https://github.com/php/php-src/blob/df4c27642efb2ee986ad7fb744f76ab380cf5a4c/sapi/fpm/fpm/fpm_sockets.c#L501

danmcd commented 2 years ago

We need TCP_INFO. This is an illumos issue too...
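
As a quick way to see whether getsockopt(TCP_INFO) works in a given zone, here is a minimal Python sketch. The helper name `read_tcp_info` is hypothetical, and the 104-byte buffer is just an assumed size large enough for the leading fields. The code relies only on the first byte of the returned structure being the connection state, which is how the Linux struct tcp_info starts.

```python
import socket

# socket.TCP_INFO is exported by Python on Linux; treat a missing constant
# as "not available" rather than guessing a value.
TCP_INFO = getattr(socket, "TCP_INFO", None)


def read_tcp_info(sock):
    """Return a small dict describing TCP_INFO, or None if the option fails."""
    if TCP_INFO is None:
        return None
    try:
        # Ask for up to 104 bytes; the kernel may return fewer.
        raw = sock.getsockopt(socket.IPPROTO_TCP, TCP_INFO, 104)
    except OSError:
        return None
    if not raw:
        return None
    # The first byte of the Linux struct tcp_info is tcpi_state.
    return {"state": raw[0], "length": len(raw)}


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("TCP_INFO:", read_tcp_info(s))
```

A None result corresponds to the "getsockopt - Protocol not available" style of failure reported earlier in this thread; the socket does not need to be connected for the probe.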

danmcd commented 2 years ago

https://www.illumos.org/issues/14744 tracks this. No guarantees on whether or not this makes it upstream. Once it does, we'll need to plumb it straight into LX. Anyone who's interested in tackling this should try and get it ready for -gate first, but if you wish to put it straight into illumos-joyent, I expect LX plumbing to arrive with it.

smokris commented 2 years ago

@Adel-Magebinary FWIW, php-fpm works well under SmartOS LX if you configure it to communicate with the web server via a Unix domain socket (i.e. a filesystem socket, rather than an inet socket).

In the php-fpm config:

[somePoolName]
listen = /var/run/php-fpm/somePoolName
listen.backlog = 256
…

In the httpd config:

<VirtualHost …>
  <Directory …>
    SetHandler proxy:unix:/var/run/php-fpm/somePoolName|fcgi://localhost/
    …

Adel-Magebinary commented 2 years ago

@smokris, thanks for that. For single-use or locally used fpm services, we use the Unix socket instead of TCP. However, in our case, we need fpm to be available to other nodes.

eg

upstream m2_fastcgi_backend {
    ip_hash;
    server unix:/var/run/php-fpm.sock;
    server fpm-01.cns backup resolve;
    server fpm-02.cns backup resolve;
    server fpm-03.cns backup resolve;
}

where fpm-01.cns needs to listen on TCP.

Adel-Magebinary commented 2 years ago

This also happens in MySQL clusters.

2022-06-17T04:08:08.424303Z 0 [Note] [MY-000000] [WSREP-SST] Cannot open netlink socket: Protocol not supported
2022-06-17T04:08:08.424355Z 0 [Note] [MY-000000] [WSREP-SST] Cannot open netlink socket: Protocol not supported
2022-06-17T04:08:08.424388Z 0 [Note] [MY-000000] [WSREP-SST] Cannot open netlink socket: Protocol not supported
2022-06-17T04:08:08.657658Z 0 [Note] [MY-000000] [WSREP-SST] Cannot open netlink socket: Protocol not supported
2022-06-17T04:08:08.659807Z 0 [Note] [MY-000000] [WSREP-SST] Cannot open netlink socket: Protocol not supported
2022-06-17T04:08:08.659869Z 0 [Note] [MY-000000] [WSREP-SST] Cannot open netlink socket: Protocol not supported
2022-06-17T04:08:08.660141Z 0 [Note] [MY-000000] [WSREP-SST] Cannot open netlink socket: Protocol not supported

danmcd commented 2 years ago

@Adel-Magebinary netlink sockets are a whole other world, and LX isn't necessarily designed to keep up there.
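
For reference, a quick way to check whether a zone can open a netlink socket at all is a probe like the following Python sketch. The function name `netlink_available` is made up for illustration, and NETLINK_ROUTE is only an assumed choice of protocol; the thread does not say which netlink protocol the failing tool requested.

```python
import socket


def netlink_available():
    """Return True if an AF_NETLINK socket can be opened, False otherwise."""
    family = getattr(socket, "AF_NETLINK", None)
    if family is None:
        # The interpreter was built on a platform without netlink support.
        return False
    # NETLINK_ROUTE is protocol 0 on Linux; it is used here only as a probe.
    proto = getattr(socket, "NETLINK_ROUTE", 0)
    try:
        sock = socket.socket(family, socket.SOCK_RAW, proto)
    except OSError:
        # An EPROTONOSUPPORT failure here matches "Protocol not supported".
        return False
    sock.close()
    return True


if __name__ == "__main__":
    print("netlink available:", netlink_available())
```

Even when the socket opens, individual netlink message types may still be unimplemented, so a True result is only a first check.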

I wonder if MySQL (or mariadb) via pkgsrc on a native zone might suit you better? It is available via pkgsrc...

mysql-client-5.6.51nb1  MySQL 5, a free SQL database (client)
mysql-connector-c++-8.0.25  Standardized MySQL database driver for C++ development
mysql-server-5.6.51nb1  MySQL 5, a free SQL database (server)

jperkin commented 2 years ago

I'd recommend percona-cluster on native, it's kept reasonably up-to-date and supports the wsrep/sst stuff.