paulscherrerinstitute / pcaspy

Portable Channel Access Server in Python
BSD 3-Clause "New" or "Revised" License

pcaspy in docker, "server isnt attached to a network - unable to continue" #38

Closed jsparger closed 7 years ago

jsparger commented 7 years ago

This is probably more of an issue with cas than pcaspy, so feel free to close if this is not an appropriate place to ask.

If I try to run pcaspy in docker, I get the following error:

filename="../../../../src/cas/io/bsdSocket/casDGIntfIO.cc" line number=121
server isnt attached to a network - unable to continue
filename="../../../../src/cas/io/bsdSocket/caServerIO.cc" line number=116
Attempt to set server's IP address/port failed unable to attach explicit interface
filename="../../../../src/cas/generic/caServerI.cc" line number=59
server isnt attached to a network - CA server internals init unable to continue
terminate called after throwing an instance of 'int'

I understand that this has something to do with cas's interface detection. However, if I run softIoc with the following db

echo 'record(ai, "test_pv") {
    field(VAL, "3")
}' > test.db

softIoc -d test.db

it works fine and I can caget test_pv from another container without issue. This leads me to believe that the plumbing is there, but that some assumption in how cas detects broadcast interfaces does not hold true in docker.

Have you all been able to successfully run pcaspy in docker? Are there some environment variables I can set to help cas along?

I should note that setting EPICS_CAS_INTF_ADDR_LIST="localhost" WILL allow pcaspy to run in docker, but then the PVs are not visible outside that docker container, so it's not that useful.

Also for reference, here is what ifconfig returns from inside the container:

ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:ac:11:00:02  
          inet addr:172.17.0.2  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:acff:fe11:2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:237 errors:0 dropped:0 overruns:0 frame:0
          TX packets:186 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:240235 (234.6 KiB)  TX bytes:11013 (10.7 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:3 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:208 (208.0 B)  TX bytes:208 (208.0 B)

Thanks for the help. I will also ask on the mailing list, so please close this if it is not relevant.

xiaoqiangwang commented 7 years ago

I see the same problem as you. It binds to localhost but not to eth0. I suspect this needs some configuration on the docker side.

jsparger commented 7 years ago

I should also mention I am using EPICS base v3.14.12.6

xiaoqiangwang commented 7 years ago

The closest problem I found was in this post http://www.aps.anl.gov/epics/tech-talk/2013/msg01377.php, but it suggested exactly that: using localhost.

Since this is relevant to all PCAS servers (e.g. excas from EPICS base shows the same problem), I hope it can be answered on the tech-talk mailing list.

jsparger commented 7 years ago

Finally!

I found a fix which involves changing the broadcast address of the interface in the container from 0.0.0.0 to the correct broadcast address of 224.0.55.55 using ifconfig. I am still not sure why the broadcast address is initially set to 0.0.0.0. I'm also not sure why the broadcast address that works is 224.0.55.55. Maybe that is apparent to someone with more knowledge. I found that address in this docker issue https://github.com/docker/docker/issues/3043 in a comment by rhasselbaum where he shows how to use iperf to do multicast between containers.

Anyway, for a debian container you can run pcaspy like so:

On host:

# start the container you want to run the server in
# NET_ADMIN capabilities are needed to set the broadcast address
docker run --name=my_server -it --cap-add=NET_ADMIN Your/Epics-Image /bin/bash

# start a container to be the client
docker run --link=my_server -it Your/Epics-Image /bin/bash

In server container:

# get ifconfig
apt install net-tools 
# set the broadcast address for the network
ifconfig eth0 broadcast 224.0.55.55
# start caRepeater (to avoid warning messages)
caRepeater &
# run your pcaspy server. Let's pretend it serves the PV Docker:Test
python your_pcaspy_server.py

In client container:

export EPICS_CA_ADDR_LIST="my_server"
caget Docker:Test

But the reasons it wasn't working are:

1) 0.0.0.0 is not the real broadcast address for the network.
2) cas explicitly rejects the 0.0.0.0 address when it is vetting the interfaces, so it wouldn't make it through even if it was right.

Anyway, hopefully this will help someone trying to do this in the future.
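
For anyone who wants something concrete in place of your_pcaspy_server.py, here is a minimal sketch of a pcaspy server that serves Docker:Test. The initial value, the precision, and the names PREFIX, PVDB, DockerDriver and run_server are illustrative assumptions, not anything taken from a real server in this thread.

```python
PREFIX = "Docker:"
# One floating-point PV named "Test"; value and precision are arbitrary choices.
PVDB = {"Test": {"prec": 3, "value": 3.0}}


def run_server():
    """Serve the PVs in PVDB until the process is killed."""
    # Imported here so the PV table above can be inspected without pcaspy installed.
    from pcaspy import Driver, SimpleServer

    class DockerDriver(Driver):
        """Driver that keeps pcaspy's default read/write behaviour."""

    server = SimpleServer()
    server.createPV(PREFIX, PVDB)
    driver = DockerDriver()  # must be created after the PVs exist
    while True:
        server.process(0.1)  # handle CA requests for up to 0.1 s per pass
```

Calling run_server() at the end of the file starts the server. Inside the container it should still hit the "server isnt attached to a network" error unless the broadcast address has been changed as described above.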

xiaoqiangwang commented 7 years ago

I see the key is passing --cap-add=NET_ADMIN to docker. I had tried changing the broadcast address before but got permission denied errors. The exact broadcast address itself does not seem to play a role though. I have set it to 172.17.255.255, which seems correct based on the netmask 255.255.0.0. Then I followed your procedure to launch another docker container as the client and ran caget on that PV. It all works.
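
As a quick check of the 172.17.255.255 figure, the directed broadcast address can be derived from the interface address and netmask shown in the ifconfig output. This is a small sketch using Python's standard ipaddress module. The helper name expected_broadcast is made up for illustration.

```python
import ipaddress


def expected_broadcast(ip: str, netmask: str) -> str:
    """Return the directed broadcast address for an IPv4 interface."""
    iface = ipaddress.IPv4Interface(f"{ip}/{netmask}")
    return str(iface.network.broadcast_address)


# Address and netmask of eth0 as reported by ifconfig in the container.
print(expected_broadcast("172.17.0.2", "255.255.0.0"))   # 172.17.255.255

# 224.0.55.55 lies in 224.0.0.0/4, so it is a multicast address,
# not a broadcast address of the 172.17.0.0/16 network.
print(ipaddress.ip_address("224.0.55.55").is_multicast)  # True
```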

But from the host, the PV is not visible. Maybe some other settings are necessary?

xiaoqiangwang commented 7 years ago

To access the docker IOC from the host, passing -p 5064:5064 -p 5064:5064/udp to docker run appears to work. This binds the docker IOC's port 5064 (TCP and UDP) on the host. Other IOCs on the host should then use a different port, e.g. export EPICS_CA_SERVER_PORT=50001.