Hello @safchain,
I would like to know if the project intends to add support for the cri-o engine and not only docker.
@SchSeba I'm currently investigating the effort needed to support the cri-o engine. I'll make an update soon.
I submitted a WIP patch that adds support for runc. Tested on a minikube cri-o environment.
Thanks for the update. If there is anything I can help you with, just tell me.
@SchSeba we just released a new version of Skydive, 0.21, with support for podman/runc, which should work with cri-o too:
https://hub.docker.com/r/skydive/skydive/tags/
Could you have a look?
@safchain Sure I will, thanks!
Hi @safchain, I want to let you know that this is not working for me.
I had to update the OpenShift deployment to change the config on the agents:
agent:
  topology:
    probes:
      - ovsdb
      - docker
      - runc
analyzer:
  listen: 0.0.0.0:8082
but I still don't see the pods or containers in the UI.
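(For reference: a quick way to confirm which probes the agent actually loaded is to grep an agent pod's log for the startup line, as in the agent log excerpt further below; the pod name here is a placeholder.)

oc logs <skydive-agent-pod> | grep "Topology probes"
# expected output along the lines of: Topology probes: [ovsdb docker runc]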
Deployment file:
apiVersion: v1
kind: Template
metadata:
  name: skydive
objects:
- apiVersion: v1
  kind: ConfigMap
  metadata:
    labels:
      app: skydive-analyzer
    name: skydive-analyzer-config
  data:
    SKYDIVE_ANALYZER_FLOW_BACKEND: elasticsearch
    SKYDIVE_ANALYZER_TOPOLOGY_BACKEND: elasticsearch
    SKYDIVE_ANALYZER_TOPOLOGY_PROBES: ""
    SKYDIVE_ETCD_LISTEN: 0.0.0.0:12379
- apiVersion: v1
  data:
    skydive.yml: |
      agent:
        topology:
          probes:
            - ovsdb
            - docker
            - runc
      analyzer:
        listen: 0.0.0.0:8082
  kind: ConfigMap
  metadata:
    labels:
      app: skydive-agent
    name: skydive-agent-config
- apiVersion: v1
  kind: Service
  metadata:
    labels:
      app: skydive-analyzer
    name: skydive-analyzer
  spec:
    ports:
    - name: api
      port: 8082
      protocol: TCP
      targetPort: 8082
    - name: protobuf
      port: 8082
      protocol: UDP
      targetPort: 8082
    - name: etcd
      port: 12379
      protocol: TCP
      targetPort: 12379
    - name: etcd-cluster
      port: 12380
      protocol: TCP
      targetPort: 12380
    - name: es
      port: 9200
      protocol: TCP
      targetPort: 9200
    selector:
      app: skydive
      tier: analyzer
    sessionAffinity: None
    type: NodePort
- apiVersion: v1
  kind: DeploymentConfig
  metadata:
    name: skydive-analyzer
  spec:
    replicas: 1
    selector:
      app: skydive
      tier: analyzer
    strategy:
      rollingParams:
        intervalSeconds: 1
        maxSurge: 25%
        maxUnavailable: 25%
        timeoutSeconds: 600
        updatePeriodSeconds: 1
      type: Rolling
    template:
      metadata:
        labels:
          app: skydive
          tier: analyzer
      spec:
        containers:
        - args:
          - analyzer
          - --listen=0.0.0.0:8082
          envFrom:
          - configMapRef:
              name: skydive-analyzer-config
          image: skydive/skydive
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 3
            tcpSocket:
              port: 8082
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          name: skydive-analyzer
          ports:
          - containerPort: 8082
            protocol: TCP
          - containerPort: 8082
            protocol: UDP
          - containerPort: 12379
            protocol: TCP
          - containerPort: 12380
            protocol: TCP
          readinessProbe:
            failureThreshold: 1
            tcpSocket:
              port: 8082
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
        - image: elasticsearch:5
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            tcpSocket:
              port: 9200
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          name: skydive-elasticsearch
          ports:
          - containerPort: 9200
            protocol: TCP
          readinessProbe:
            failureThreshold: 1
            tcpSocket:
              port: 9200
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          securityContext:
            privileged: true
        dnsPolicy: ClusterFirst
        restartPolicy: Always
        terminationGracePeriodSeconds: 30
    test: false
    triggers:
    - type: ConfigChange
- apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    labels:
      app: skydive
      tier: agent
    name: skydive-agent
  spec:
    selector:
      matchLabels:
        app: skydive
        tier: agent
    template:
      metadata:
        labels:
          app: skydive
          tier: agent
      spec:
        containers:
        - args:
          - agent
          env:
          - name: SKYDIVE_ANALYZERS
            value: $(SKYDIVE_ANALYZER_SERVICE_HOST):$(SKYDIVE_ANALYZER_SERVICE_PORT_API)
          envFrom:
          - configMapRef:
              name: skydive-agent-config
          image: skydive/skydive
          imagePullPolicy: Always
          name: skydive-agent
          ports:
          - containerPort: 8081
            hostPort: 8081
            protocol: TCP
          securityContext:
            privileged: true
          volumeMounts:
          - mountPath: /var/run/docker.sock
            name: docker
          - mountPath: /host/run
            name: run
          - mountPath: /var/run/openvswitch/db.sock
            name: ovsdb
          - mountPath: /var/run/runc
            name: crio
          - name: agent-config
            mountPath: /etc/skydive.yml
            subPath: skydive.yml
        dnsPolicy: ClusterFirst
        hostNetwork: true
        hostPID: true
        restartPolicy: Always
        terminationGracePeriodSeconds: 30
        volumes:
        - name: agent-config
          configMap:
            name: skydive-agent-config
        - hostPath:
            path: /var/run/docker.sock
          name: docker
        - hostPath:
            path: /var/run/netns
          name: run
        - hostPath:
            path: /var/run/openvswitch/db.sock
          name: ovsdb
        - hostPath:
            path: /var/run/crio/
          name: crio
- apiVersion: v1
  kind: Route
  metadata:
    labels:
      app: skydive-analyzer
    name: skydive-analyzer
  spec:
    port:
      targetPort: api
    to:
      kind: Service
      name: skydive-analyzer
      weight: 100
    wildcardPolicy: None
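(For reference: an OpenShift Template like the one above is typically instantiated with oc process; the file name below is illustrative.)

oc process -f skydive-template.yaml | oc apply -f -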
Agent log:
2018-12-10T12:35:19.269Z INFO agent/agent.go:46 glob..func1 cnv-executor-myakove-node1.example.com: Skydive Agent 0.21.0-b107adc75b5c starting...
2018-12-10T12:35:19.269Z INFO http/server.go:109 (*Server).Listen cnv-executor-myakove-node1.example.com: Listening on socket 127.0.0.1:8081
2018-12-10T12:35:19.280Z INFO agent/probes.go:49 NewTopologyProbeBundleFromConfig cnv-executor-myakove-node1.example.com: Topology probes: [ovsdb docker runc]
2018-12-10T12:35:19.281Z INFO probes/probes.go:67 NewFlowProbeBundle cnv-executor-myakove-node1.example.com: Flow probes: [pcapsocket ovssflow sflow gopacket dpdk ebpf ovsmirror]
2018-12-10T12:35:19.281Z INFO probes/probes.go:117 NewFlowProbeBundle cnv-executor-myakove-node1.example.com: Not compiled with dpdk support, skipping it
2018-12-10T12:35:19.413Z ERROR probes/probes.go:115 NewFlowProbeBundle cnv-executor-myakove-node1.example.com: Failed to create ebpf probe: Unable to load eBPF elf binary (host amd64) from bindata: error while loading "socket_flow_table" (invalid argument)
2018-12-10T12:35:19.420Z INFO probes/ovsmirror.go:427 (*OvsMirrorProbesHandler).cleanupOvsMirrors cnv-executor-myakove-node1.example.com: OvsMirror cleanup previous mirrors
2018-12-10T12:35:19.429Z INFO ovs/ovsdb.go:311 (*OvsMonitor).portAdded cnv-executor-myakove-node1.example.com: New port "veth9a6b1551(1532eb5d-d3bb-4614-8ff5-34757322fa04)" added
2018-12-10T12:35:19.429Z INFO ovs/ovsdb.go:311 (*OvsMonitor).portAdded cnv-executor-myakove-node1.example.com: New port "veth0909f89e(300b5284-12f2-4f68-9f6a-3993e047d009)" added
...........
Please tell me if you need anything else.
Could you please point me to documentation for setting up a similar environment?
Hi @safchain,
This is what I did to deploy the environment to check the PR:
git clone https://github.com/openshift/openshift-ansible.git -b v3.11.0 --depth 1
Then create this inventory, replacing <master_ip> with your master's IP:
all:
  vars:
    openshift_use_crio: 'true'
    olm_operator_image: quay.io/coreos/olm:master-08ea39b7
    olm_catalog_operator_image: quay.io/coreos/catalog:master-57dd618d
  children:
    OSEv3:
      hosts:
        node01:
          openshift_ip: <master_ip>
          openshift_node_group_name: node-config-master-infra-kubevirt
          openshift_schedulable: true
      children:
        masters:
          hosts:
            <master_ip>:
        nodes:
          hosts:
            <master_ip>:
        nfs:
          hosts:
            <master_ip>:
        etcd:
          hosts:
            <master_ip>:
      vars:
        ansible_service_broker_registry_whitelist:
        - .*-apb$
        ansible_service_broker_image: docker.io/ansibleplaybookbundle/origin-ansible-service-broker:ansible-service-broker-1.2.17-1
        ansible_ssh_pass: vagrant
        ansible_ssh_user: root
        deployment_type: origin
        openshift_clock_enabled: true
        openshift_deployment_type: origin
        openshift_disable_check: memory_availability,disk_availability,docker_storage,package_availability,docker_image_availability
        openshift_hosted_etcd_storage_access_modes:
        - ReadWriteOnce
        openshift_hosted_etcd_storage_kind: nfs
        openshift_hosted_etcd_storage_labels:
          storage: etcd
        openshift_hosted_etcd_storage_nfs_directory: /opt/etcd-vol
        openshift_hosted_etcd_storage_nfs_options: '*(rw,root_squash,sync,no_wdelay)'
        openshift_hosted_etcd_storage_volume_name: etcd-vol
        openshift_hosted_etcd_storage_volume_size: 1G
        openshift_image_tag: v3.11.0
        openshift_master_admission_plugin_config:
          MutatingAdmissionWebhook:
            configuration:
              apiVersion: v1
              disable: false
              kind: DefaultAdmissionConfig
          ValidatingAdmissionWebhook:
            configuration:
              apiVersion: v1
              disable: false
              kind: DefaultAdmissionConfig
        openshift_master_identity_providers:
        - challenge: 'true'
          kind: AllowAllPasswordIdentityProvider
          login: 'true'
          name: allow_all_auth
        osm_api_server_args:
          feature-gates:
          - BlockVolume=true
        osm_controller_args:
          feature-gates:
          - BlockVolume=true
        openshift_node_groups:
        - name: node-config-master-infra-kubevirt
          labels:
          - node-role.kubernetes.io/master=true
          - node-role.kubernetes.io/infra=true
          - node-role.kubernetes.io/compute=true
          edits:
          - key: kubeletArguments.feature-gates
            value:
            - RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true,BlockVolume=true
          - key: kubeletArguments.max-pods
            value:
            - '40'
          - key: kubeletArguments.pods-per-core
            value:
            - '40'
        - name: node-config-compute-kubevirt
          labels:
          - node-role.kubernetes.io/compute=true
          edits:
          - key: kubeletArguments.feature-gates
            value:
            - RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true,BlockVolume=true,CPUManager=true
          - key: kubeletArguments.cpu-manager-policy
            value:
            - static
          - key: kubeletArguments.system-reserved
            value:
            - cpu=500m
          - key: kubeletArguments.kube-reserved
            value:
            - cpu=500m
          - key: kubeletArguments.max-pods
            value:
            - '40'
          - key: kubeletArguments.pods-per-core
            value:
            - '40'
Then run:
ansible-playbook -e "ansible_user=root ansible_ssh_pass=vagrant" -i inventory playbooks/prerequisites.yml
ansible-playbook -i inventory playbooks/deploy_cluster.yml
I tried and I had it working (maybe not as expected :)). Can you check that you don't have any node with Manager: runc in the metadata?
Hi @safchain
oc get node -o yaml | grep Manager
I don't get anything.
What do you mean by "maybe not as expected"? I think it will be the same as with docker from the UI perspective, because right now I just get all the veths in the host namespace connected to the OVS bridge, but no veths in the container namespaces.
Hi @SchSeba, could you share a screenshot? I mean checking that there is no Skydive node (in the WebUI) with Manager: runc in its metadata.
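(For reference: this check can also be done from the command line with a Gremlin query against the analyzer; the exact client invocation below is an assumption and may differ between Skydive versions.)

# list topology nodes whose Manager metadata is "runc" (analyzer address is a placeholder)
skydive client query "G.V().Has('Manager', 'runc')" --analyzer <analyzer_host>:8082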
@safchain sorry, maybe I don't understand you.
Here is the metadata from the WebUI:
CPU :
Hostname : node1.example.com
KernelVersion : 3.10.0-957.el7.x86_64
Name : node1.example.com
OS : linux
Platform : ubuntu
PlatformFamily : debian
PlatformVersion : 18.04
TID : 420f69f0-573b-5bc1-7435-6d180be6a48b
Type : host
VirtualizationRole : host
VirtualizationSystem : kvm
One thing: this server is not Ubuntu:
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.6"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.6 (Maipo)"
Please tell me if you need anything else.
@SchSeba Here is what I got with the inventory you gave me. I'm able to see containers/namespaces + veths. Can you check the version of Skydive (header bar in the WebUI)?
@SchSeba With the ovsdb probe, that is better :)
@safchain Amazing!
But this is what I see.
Can you maybe share your yamls and deployment instructions for Skydive? I also saw that the version is not exactly the same; could that be the problem?
@SchSeba I didn't use a yaml file, I just deployed Skydive by hand. I tried with the yaml file you provided but got permission issues (privileged container). I think the runc probe is not properly set in the config file/yaml. I'll try to fix my permission issues. There is no version issue; you just need a version >= 0.21.
@safchain can you try to deploy it with https://github.com/skydive-project/skydive/tree/master/contrib/openshift ? This is what I used.
@SchSeba Yes, I tried, but I'm facing the "Privileged containers are not allowed" issue.
@SchSeba I managed to get it working. I used the following template:
apiVersion: v1
kind: Template
metadata:
  name: skydive
objects:
- apiVersion: v1
  kind: ConfigMap
  metadata:
    labels:
      app: skydive-analyzer
    name: skydive-analyzer-config
  data:
    SKYDIVE_ANALYZER_FLOW_BACKEND: elasticsearch
    SKYDIVE_ANALYZER_TOPOLOGY_BACKEND: elasticsearch
    SKYDIVE_ANALYZER_TOPOLOGY_PROBES: ""
    SKYDIVE_ETCD_LISTEN: 0.0.0.0:12379
- apiVersion: v1
  kind: ConfigMap
  metadata:
    labels:
      app: skydive-agent
    name: skydive-agent-config
- apiVersion: v1
  kind: Service
  metadata:
    labels:
      app: skydive-analyzer
    name: skydive-analyzer
  spec:
    ports:
    - name: api
      port: 8082
      protocol: TCP
      targetPort: 8082
    - name: protobuf
      port: 8082
      protocol: UDP
      targetPort: 8082
    - name: etcd
      port: 12379
      protocol: TCP
      targetPort: 12379
    - name: etcd-cluster
      port: 12380
      protocol: TCP
      targetPort: 12380
    - name: es
      port: 9200
      protocol: TCP
      targetPort: 9200
    selector:
      app: skydive
      tier: analyzer
    sessionAffinity: None
    type: NodePort
- apiVersion: v1
  kind: DeploymentConfig
  metadata:
    name: skydive-analyzer
  spec:
    replicas: 1
    selector:
      app: skydive
      tier: analyzer
    strategy:
      rollingParams:
        intervalSeconds: 1
        maxSurge: 25%
        maxUnavailable: 25%
        timeoutSeconds: 600
        updatePeriodSeconds: 1
      type: Rolling
    template:
      metadata:
        labels:
          app: skydive
          tier: analyzer
      spec:
        containers:
        - args:
          - analyzer
          - --listen=0.0.0.0:8082
          envFrom:
          - configMapRef:
              name: skydive-analyzer-config
          image: skydive/skydive
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 3
            tcpSocket:
              port: 8082
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          name: skydive-analyzer
          ports:
          - containerPort: 8082
            protocol: TCP
          - containerPort: 8082
            protocol: UDP
          - containerPort: 12379
            protocol: TCP
          - containerPort: 12380
            protocol: TCP
          readinessProbe:
            failureThreshold: 1
            tcpSocket:
              port: 8082
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
        - image: elasticsearch:5
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            tcpSocket:
              port: 9200
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          name: skydive-elasticsearch
          ports:
          - containerPort: 9200
            protocol: TCP
          readinessProbe:
            failureThreshold: 1
            tcpSocket:
              port: 9200
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          securityContext:
            privileged: true
        dnsPolicy: ClusterFirst
        restartPolicy: Always
        terminationGracePeriodSeconds: 30
    test: false
    triggers:
    - type: ConfigChange
- apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    labels:
      app: skydive
      tier: agent
    name: skydive-agent
  spec:
    selector:
      matchLabels:
        app: skydive
        tier: agent
    template:
      metadata:
        labels:
          app: skydive
          tier: agent
      spec:
        containers:
        - args:
          - agent
          env:
          - name: SKYDIVE_ANALYZERS
            value: $(SKYDIVE_ANALYZER_SERVICE_HOST):$(SKYDIVE_ANALYZER_SERVICE_PORT_API)
          - name: SKYDIVE_AGENT_TOPOLOGY_PROBES
            value: "ovsdb runc"
          envFrom:
          - configMapRef:
              name: skydive-agent-config
          image: skydive/skydive
          imagePullPolicy: Always
          name: skydive-agent
          ports:
          - containerPort: 8081
            hostPort: 8081
            protocol: TCP
          securityContext:
            privileged: true
          volumeMounts:
          - mountPath: /var/run/docker.sock
            name: docker
          - mountPath: /host/run
            name: run
          - mountPath: /run/runc
            name: runc
          - mountPath: /var/run/openvswitch/db.sock
            name: ovsdb
        dnsPolicy: ClusterFirst
        hostNetwork: true
        hostPID: true
        restartPolicy: Always
        terminationGracePeriodSeconds: 30
        volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: docker
        - hostPath:
            path: /var/run/netns
          name: run
        - hostPath:
            path: /run/runc
          name: runc
        - hostPath:
            path: /var/run/openvswitch/db.sock
          name: ovsdb
- apiVersion: v1
  kind: Route
  metadata:
    labels:
      app: skydive-analyzer
    name: skydive-analyzer
  spec:
    port:
      targetPort: api
    to:
      kind: Service
      name: skydive-analyzer
      weight: 100
    wildcardPolicy: None
Basically: adding runc to the probes and adding the runc folder mount.
I got the following topology:
Hi @safchain, thanks for the yaml. I will try to deploy it in my environment and update the issue.
If this works, can I create a PR adding both runc and docker to the k8s and OpenShift deployments? Or do you prefer to leave them with docker as the default? Could it be a problem if we enable both the docker and runc probes and only one of them exists?
@SchSeba yes, you can create a PR for the k8s and OpenShift deployments. I don't think there is a problem with activating both probes.
Hi @safchain, it still didn't work for me.
I dug into the code a bit and saw the folder you try to read from, /run/runc; this folder is empty in my deployment.
This is the version of cri-o I used:
cri-o.x86_64 1.11.10-1.rhaos3.11.git42c86f0.el7
cri-tools.x86_64 1.11.1-1.rhaos3.11.gitedabfb5.el7
criu.x86_64 3.9-5.el7 @rhel-7.6-base
I can see data in this folder:
ll /run/runc-ctrs/
total 0
drwx--x--x. 2 root root 60 Dec 23 22:49 011410a535564591290dc09900f112627fcdbca32b6255a9c198cd1aea29e197
drwx--x--x. 2 root root 60 Dec 23 22:37 02fc7b80b6f699820c8aa35cbae6a604e42a66ed4be6c4c7af7b5245512cfe31
and inside each of the folders there:
ll /run/runc-ctrs/011410a535564591290dc09900f112627fcdbca32b6255a9c198cd1aea29e197
total 24
-rw-r--r--. 1 root root 22108 Dec 23 22:49 state.json
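(For reference: the container's init PID can be read out of state.json by hand; this sketch assumes libcontainer's state layout, with the field name init_process_pid, and that jq and nsenter are available on the host.)

CID=011410a535564591290dc09900f112627fcdbca32b6255a9c198cd1aea29e197
PID=$(jq '.init_process_pid' /run/runc-ctrs/$CID/state.json)
# list the interfaces inside that container's network namespace
nsenter -t "$PID" -n ip link show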
If I change the volume to /run/runc-ctrs and restart the agents, I get an error:
2018-12-24T11:12:10.833Z ERROR runc/runc.go:171 getMetadata cnv-executor-yadu-node2.example.com: Unable to read create config /run/containers/storage/overlay-containers/f1ce8d5743d2fec95166eaae75b8deac0409d37fe58562b29fecd4e1e882fdb5/userdata/artifacts/create-config: open /run/containers/storage/overlay-containers/f1ce8d5743d2fec95166eaae75b8deac0409d37fe58562b29fecd4e1e882fdb5/userdata/artifacts/create-config: no such file or directory
but I am able to see the runc pods now! Is this error fine?
Please tell me if you need any other information.
@SchSeba great that you got it working. You can ignore this error; there is already a patch under review that ignores it: https://github.com/skydive-project/skydive/pull/1526/files#diff-e3e94608aea11e3eb72d370bfdc97bd1R131
Thanks for the comment @safchain!
I just have a question: which engine do you use? How can it be that you mount /run/runc while I needed to mount /run/runc-ctrs?
Now I am not sure which volume to configure in the OpenShift deployment PR I am going to open for this repo.
We don't use any engine; we use the code that was already there in Skydive, which just parses the runc state files according to the specs. I think the difference is due to the openshift/cri-o distribution. I used what you suggested to deploy it, but it was on Fedora 29.
I would suggest mounting both folders and specifying them in the config file here: https://github.com/skydive-project/skydive/blob/master/etc/skydive.yml.default#L229
@safchain fine, and what about the OpenShift and k8s deployments? Is there a way to configure multiple folders for one probe? Maybe add a configmap to the agent pod?
The section of the config file I pointed to is a list, so you can specify multiple folders for the runc probe. If a folder doesn't exist or isn't used, that's not a problem:
run_path:
  - /run/runc
  - /run/runc-crts
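(For the agent DaemonSet this means mounting both host paths into the pod so the probe can reach whichever one exists; a minimal sketch follows, with illustrative volume names. Note that the directory actually observed on the host earlier in this thread was /run/runc-ctrs.)

# in the agent container:
volumeMounts:
- mountPath: /run/runc
  name: runc
- mountPath: /run/runc-ctrs
  name: runc-ctrs
# in the pod spec:
volumes:
- hostPath:
    path: /run/runc
  name: runc
- hostPath:
    path: /run/runc-ctrs
  name: runc-ctrs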
Hi @safchain, just a quick update: this configuration doesn't work for me.
run_path:
  - /var/run/runc
  - /var/run/runc-crts
The agent still looks into the /run/runc/ folder only.
I just tested it; here is the config I used as an example:
agent:
  topology:
    probes:
      - ovsdb
      - runc
    runc:
      run_path:
        - /tmp/toto
        - /tmp/titi
and I got the following lines in the logs (DEBUG level)
2018-12-27T17:06:19.601+0100 DEBUG runc/runc.go:296 (*Probe).initialize.func1 pc12.home: Probe initialized for /tmp/titi
2018-12-27T17:06:19.601+0100 DEBUG runc/runc.go:296 (*Probe).initialize.func1 pc12.home: Probe initialized for /tmp/toto
Can you check in the logs that the modification you made to the config file is correctly read?
Thanks for the answer, @safchain. How can I enable the debug level in the logs?
In the config file:
https://github.com/skydive-project/skydive/blob/master/etc/skydive.yml.default#L347
logging:
  level: DEBUG
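(Since the agent is also configured through SKYDIVE_* environment variables elsewhere in this thread, the same setting should be expressible in the agent ConfigMap data; the variable name below follows the usual key-to-variable mapping and is an assumption.)

SKYDIVE_LOGGING_LEVEL: DEBUG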
Hi, I have the same problem with my Skydive installation on OpenShift: missing CRI-O information.
Versions: Skydive Agent 0.21.0-a772a0989b39, openshift v3.11.51, kubernetes v1.11.0+d4cacc0
Logs from the agent:
[root@master0 ~]# oc logs skydive-agent-5gr5h
2019-01-03T11:47:09.671Z INFO agent/agent.go:46 glob..func1 master0: Skydive Agent 0.21.0-a772a0989b39 starting...
2019-01-03T11:47:09.672Z INFO http/server.go:109 (*Server).Listen master0: Listening on socket 127.0.0.1:8081
2019-01-03T11:47:09.674Z DEBUG websocket/pool.go:101 (*Pool).AddClient master0: AddClient for pool AnalyzerClientPool type : [*websocket.Pool]
2019-01-03T11:47:09.674Z INFO agent/probes.go:49 NewTopologyProbeBundleFromConfig master0: Topology probes: [ovsdb runc]
2019-01-03T11:47:09.675Z INFO probes/probes.go:67 NewFlowProbeBundle master0: Flow probes: [pcapsocket ovssflow sflow gopacket dpdk ebpf ovsmirror]
2019-01-03T11:47:09.675Z INFO probes/probes.go:117 NewFlowProbeBundle master0: Not compiled with dpdk support, skipping it
2019-01-03T11:47:09.723Z DEBUG probes/ebpf.go:444 loadModule master0: eBPF kernel stacktrace:
2019-01-03T11:47:09.726Z ERROR probes/probes.go:115 NewFlowProbeBundle master0: Failed to create ebpf probe: Unable to load eBPF elf binary (host amd64) from bindata: error while loading "socket_flow_table" (invalid argument)
2019-01-03T11:47:09.726Z DEBUG netns/netns.go:298 (*Probe).start master0: Probe initialized
2019-01-03T11:47:09.729Z INFO probes/ovsmirror.go:427 (*OvsMirrorProbesHandler).cleanupOvsMirrors master0: OvsMirror cleanup previous mirrors
2019-01-03T11:47:09.732Z INFO ovs/ovsdb.go:311 (*OvsMonitor).portAdded master0: New port "tun0(d907162f-1ca3-4def-8a45-79aa51bd2498)" added
2019-01-03T11:47:09.732Z INFO ovs/ovsdb.go:311 (*OvsMonitor).portAdded master0: New port "vxlan0(1883603a-3386-4617-bd6e-2dc58f1cbefc)" added
2019-01-03T11:47:09.732Z INFO ovs/ovsdb.go:311 (*OvsMonitor).portAdded master0: New port "veth91980545(19207d1b-1b86-4cf6-bfc4-44c21416c24d)" added
2019-01-03T11:47:09.732Z INFO ovs/ovsdb.go:311 (*OvsMonitor).portAdded master0: New port "veth3b5136b6(0a73a714-f807-4f6d-9950-2b5d22947078)" added
2019-01-03T11:47:09.732Z INFO ovs/ovsdb.go:311 (*OvsMonitor).portAdded master0: New port "vethf7099743(d021a224-1fe4-4c49-bea1-1b22296f45ed)" added
2019-01-03T11:47:09.732Z INFO ovs/ovsdb.go:311 (*OvsMonitor).portAdded master0: New port "br0(863e324e-e14d-49c3-868d-39f0b1568276)" added
...
2019-01-03T11:47:09.753Z DEBUG runc/runc.go:318 (*Probe).initialize.func1 master0: Probe initialized for /var/run/runc
...
No more information about runc in the logs :-(
Agent config:
apiVersion: v1
data:
  skydive.yml: |-
    agent:
      topology:
        probes:
          - ovsdb
          - runc
        runc:
          run_path:
            - /var/run/runc
    analyzer:
      listen: 0.0.0.0:8082
    logging:
      level: DEBUG
kind: ConfigMap
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftNewApp
  creationTimestamp: 2019-01-03T11:01:57Z
  labels:
    app: skydive-agent
  name: skydive-agent-config
  namespace: skydive
  resourceVersion: "22679"
  selfLink: /api/v1/namespaces/skydive/configmaps/skydive-agent-config
  uid: fd112cad-0f46-11e9-aa36-fa163e0559f6
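(A quick way to see which directories the runc probe actually watches is to grep the DEBUG log for the initialization line; the pod name is taken from the log above.)

oc logs skydive-agent-5gr5h | grep "Probe initialized for"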
Directories inside the agent pod:
# ls -al /var/run/runc/
total 0
drwxr-xr-x. 3 root root 700 Jan 3 11:47 .
drwxr-xr-x. 1 root root 89 Jan 3 11:47 ..
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 0a1684351eaf86417b128f18c586151db29363d19bb5b3ffa066be937ed81a0e -> /var/run/containers/storage/overlay-containers/0a1684351eaf86417b128f18c586151db29363d19bb5b3ffa066be937ed81a0e/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 2c64933f5f7732803edd132e52b0b8a5af96430f1cbea2809b48b614110e5fd0 -> /var/run/containers/storage/overlay-containers/2c64933f5f7732803edd132e52b0b8a5af96430f1cbea2809b48b614110e5fd0/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:43 30edca31054ee15474ba2369f8aa991453eb3c3a7bc938d9c97f6fa0886f4bda -> /var/run/containers/storage/overlay-containers/30edca31054ee15474ba2369f8aa991453eb3c3a7bc938d9c97f6fa0886f4bda/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 11:47 3cadaf294601a044283938b26fcef5daf9e616025fbcd13ad6f8e9b21de5502b -> /var/run/containers/storage/overlay-containers/3cadaf294601a044283938b26fcef5daf9e616025fbcd13ad6f8e9b21de5502b/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 4aad111c54f4d6c4691b49220e5ba548f9dbea4298f2a9fd5f16af6577a92cdb -> /var/run/containers/storage/overlay-containers/4aad111c54f4d6c4691b49220e5ba548f9dbea4298f2a9fd5f16af6577a92cdb/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 4ea5a7187ed325a88e7cccb604f5fc0296041bb96a5968415985b7e51e7d89ca -> /var/run/containers/storage/overlay-containers/4ea5a7187ed325a88e7cccb604f5fc0296041bb96a5968415985b7e51e7d89ca/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 5c8a1abde4859801b3b5d758316a9438f8af94579ce0b756f1d4631ffc5d3bde -> /var/run/containers/storage/overlay-containers/5c8a1abde4859801b3b5d758316a9438f8af94579ce0b756f1d4631ffc5d3bde/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 5e17190cc24b8762314faaa05a3132b08d7b31833bedd354698a1fb54b7615d8 -> /var/run/containers/storage/overlay-containers/5e17190cc24b8762314faaa05a3132b08d7b31833bedd354698a1fb54b7615d8/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 5f725d453fa5db705bf634bac4c582e05e8357b84b407bc6f48d8bfdb77969c5 -> /var/run/containers/storage/overlay-containers/5f725d453fa5db705bf634bac4c582e05e8357b84b407bc6f48d8bfdb77969c5/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 60965cab194451bab5b30a725455e818906e619c5ae8a83de9b53aacfb1b10e2 -> /var/run/containers/storage/overlay-containers/60965cab194451bab5b30a725455e818906e619c5ae8a83de9b53aacfb1b10e2/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 756f68c52733150f37edaa292a2e828a9ca97da563118ada1a35f7f37a631f67 -> /var/run/containers/storage/overlay-containers/756f68c52733150f37edaa292a2e828a9ca97da563118ada1a35f7f37a631f67/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 7a2d96c32159267740598fbedbd68585fee948deb8440e093598ae0f35cb2752 -> /var/run/containers/storage/overlay-containers/7a2d96c32159267740598fbedbd68585fee948deb8440e093598ae0f35cb2752/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 8248250bcaa3cc02851aa5d0c9b755c059413833d6de58eb7b0f2d8608ec8a8f -> /var/run/containers/storage/overlay-containers/8248250bcaa3cc02851aa5d0c9b755c059413833d6de58eb7b0f2d8608ec8a8f/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 86657c5898fceaf8081f0ae2b379fcb34bfd2caf200d6744f856973bbed931bd -> /var/run/containers/storage/overlay-containers/86657c5898fceaf8081f0ae2b379fcb34bfd2caf200d6744f856973bbed931bd/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:43 887ae60fafbdd23746bb554048c92c5f33fc81b5156e647d1d797656c8062a75 -> /var/run/containers/storage/overlay-containers/887ae60fafbdd23746bb554048c92c5f33fc81b5156e647d1d797656c8062a75/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 8f3c62fd97c8a1e654083ecac3f6eaa633d2baa81ff2aa1f3c9fa86a2eb6a908 -> /var/run/containers/storage/overlay-containers/8f3c62fd97c8a1e654083ecac3f6eaa633d2baa81ff2aa1f3c9fa86a2eb6a908/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:45 9a2ed3586cdbfea4ed54cdc0b1b059dd9ad45898497a363ce90fee8dc157edb7 -> /var/run/containers/storage/overlay-containers/9a2ed3586cdbfea4ed54cdc0b1b059dd9ad45898497a363ce90fee8dc157edb7/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 a8cf7f04a85c1c52fdc8dc6f4dbe08911a3efcd1522d5a868d9eb374bb240f5d -> /var/run/containers/storage/overlay-containers/a8cf7f04a85c1c52fdc8dc6f4dbe08911a3efcd1522d5a868d9eb374bb240f5d/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 b1da0dd85ac28ab3c6d29618efbe7a2060d35f6e47fd5a443c38945fb367ca4b -> /var/run/containers/storage/overlay-containers/b1da0dd85ac28ab3c6d29618efbe7a2060d35f6e47fd5a443c38945fb367ca4b/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:43 b1e4b463bc480ab501ac71b55225f57e3164c0a15afd870dd56be4358b490de0 -> /var/run/containers/storage/overlay-containers/b1e4b463bc480ab501ac71b55225f57e3164c0a15afd870dd56be4358b490de0/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 c07a3148ee5d90cde0edfa85c27f373b8bd6f29d31a0fb5021575e5aa96a9dd7 -> /var/run/containers/storage/overlay-containers/c07a3148ee5d90cde0edfa85c27f373b8bd6f29d31a0fb5021575e5aa96a9dd7/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 c1c120cd8a6d1bcbca426e4c585653f98fcbb96cf5b9d6b93c929b56132bb404 -> /var/run/containers/storage/overlay-containers/c1c120cd8a6d1bcbca426e4c585653f98fcbb96cf5b9d6b93c929b56132bb404/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:43 c56ae36808ecaa9d9876b7bc34a1bbf7e0b1df246263313de20ea0cbd9582409 -> /var/run/containers/storage/overlay-containers/c56ae36808ecaa9d9876b7bc34a1bbf7e0b1df246263313de20ea0cbd9582409/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 c7a05e3b91678565e476f116e31becb98e40ea1293ab73af5540bd207d4b0b8e -> /var/run/containers/storage/overlay-containers/c7a05e3b91678565e476f116e31becb98e40ea1293ab73af5540bd207d4b0b8e/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:43 ca2611c057f2a3a28bab3ca3caed7b6aa66093de4252a6a65c48880f4847b0c1 -> /var/run/containers/storage/overlay-containers/ca2611c057f2a3a28bab3ca3caed7b6aa66093de4252a6a65c48880f4847b0c1/userdata
srwxr-xr-x. 1 root root 0 Jan 3 11:28 crio.sock
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 d16e72d6a6698e926c1b37589014e7322677f6d2db5c21ac419250c9475d236a -> /var/run/containers/storage/overlay-containers/d16e72d6a6698e926c1b37589014e7322677f6d2db5c21ac419250c9475d236a/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 d2da9e37ce0e150165f94acd36e5ecfadac775c8ff5e61a1159aca4718e8c3d6 -> /var/run/containers/storage/overlay-containers/d2da9e37ce0e150165f94acd36e5ecfadac775c8ff5e61a1159aca4718e8c3d6/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 d359ffb602c7997f9a1dca1a373347c88a62ffa8474c41bb0c6c301637251cd4 -> /var/run/containers/storage/overlay-containers/d359ffb602c7997f9a1dca1a373347c88a62ffa8474c41bb0c6c301637251cd4/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 11:47 ece70ddf6113111b7d0b4bd3b2507902ada78b2998ddf874d5fce539a5809c00 -> /var/run/containers/storage/overlay-containers/ece70ddf6113111b7d0b4bd3b2507902ada78b2998ddf874d5fce539a5809c00/userdata
drwxr-xr-x. 2 root root 380 Jan 3 11:46 exits
lrwxrwxrwx. 1 root root 120 Jan 3 10:43 f60d10b0d2c2f9c5719135a872133439efd241ed52ce0bb4eb8813b487cb96ed -> /var/run/containers/storage/overlay-containers/f60d10b0d2c2f9c5719135a872133439efd241ed52ce0bb4eb8813b487cb96ed/userdata
lrwxrwxrwx. 1 root root 120 Jan 3 10:44 fc2d952c006b215d8cec67f80b34ef13d3bfcb231aea379262a51cd8386de85a -> /var/run/containers/storage/overlay-containers/fc2d952c006b215d8cec67f80b34ef13d3bfcb231aea379262a51cd8386de85a/userdata
# ls -la /var/run/runc/0a1684351eaf86417b128f18c586151db29363d19bb5b3ffa066be937ed81a0e/
total 28
drwx------. 3 root root 180 Jan 3 10:44 .
drwx------. 3 root root 60 Jan 3 10:44 ..
srwx------. 1 root root 0 Jan 3 10:44 attach
-rw-r--r--. 1 root root 13546 Jan 3 10:44 config.json
prw-r--r--. 1 root root 0 Jan 3 10:44 ctl
-rw-r--r--. 1 root root 8 Jan 3 10:44 hostname
-rw-r--r--. 1 root root 5 Jan 3 10:44 pidfile
-rw-r--r--. 1 root root 61 Jan 3 10:44 resolv.conf
drwxrwxrwt. 2 root root 40 Jan 3 10:44 shm
#
DaemonSet
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftNewApp
  creationTimestamp: null
  generation: 5
  labels:
    app: skydive
    tier: agent
  name: skydive-agent
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: skydive
      tier: agent
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: skydive
        tier: agent
    spec:
      containers:
      - args:
        - agent
        env:
        - name: SKYDIVE_ANALYZERS
          value: $(SKYDIVE_ANALYZER_SERVICE_HOST):$(SKYDIVE_ANALYZER_SERVICE_PORT_API)
        envFrom:
        - configMapRef:
            name: skydive-agent-config
        image: skydive/skydive
        imagePullPolicy: Always
        name: skydive-agent
        ports:
        - containerPort: 8081
          hostPort: 8081
          protocol: TCP
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker
        - mountPath: /host/run
          name: run
        - mountPath: /var/run/runc
          name: crio
        - mountPath: /var/run/openvswitch/db.sock
          name: ovsdb
        - mountPath: /etc/skydive.yml
          name: agent-config
          subPath: skydive.yml
        - mountPath: /var/run/containers
          name: containers
      dnsPolicy: ClusterFirst
      hostNetwork: true
      hostPID: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /var/run/docker.sock
          type: ""
        name: docker
      - hostPath:
          path: /var/run/containers/
          type: ""
        name: containers
      - hostPath:
          path: /var/run/crio/
          type: ""
        name: crio
      - hostPath:
          path: /var/run/netns
          type: ""
        name: run
      - hostPath:
          path: /var/run/openvswitch/db.sock
          type: ""
        name: ovsdb
      - configMap:
          defaultMode: 420
          name: skydive-agent-config
        name: agent-config
  templateGeneration: 5
  updateStrategy:
    type: OnDelete
status:
  currentNumberScheduled: 4
  desiredNumberScheduled: 4
  numberAvailable: 4
  numberMisscheduled: 0
  numberReady: 4
  observedGeneration: 5
  updatedNumberScheduled: 4
It looks like the agent doesn't check the /var/run/runc/ directory in detail...
@rbo which distribution are you using?
[root@master0 ~]# cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
....
[root@master0 ~]# uname -a
Linux master0 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 15 17:36:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
@safchain if it helps, from one Red Hatter to another: I can get you access to my lab.
@rbo Thanks, that would be great to have access if needed.
Can you just test something first? It looks like /var/run/runc is not the folder where runc keeps its state. Can you try with the following folders: /run/runc-crts or maybe /run/runc? We recently merged a commit that adds them by default:
https://github.com/skydive-project/skydive/commit/9b9dfdbbb551159f3122e10930afd07eb74c92a2
but it is not present in 0.21.
Thanks
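(A quick host-side check for which of the candidate runc state directories actually exist, with paths taken from this thread:)

ls -d /run/runc /run/runc-ctrs /var/run/runc 2>/dev/null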
It works, thank you very much.
DaemonSet
[snipped]
        - mountPath: /var/run/runc
          name: crio
[snipped]
      - hostPath:
          path: /run/runc-ctrs/
          type: ""
        name: crio
[snipped]
Logs
[snipped]
2019-01-03T13:43:10.180Z DEBUG runc/runc.go:232 (*Probe).registerContainer master0: Register runc container 0268dc4e5ef7f324ad81d8956f3c78b03524bfefd1f9ee2ba3322b77e4cc55be and PID 114893
2019-01-03T13:43:10.181Z DEBUG runc/runc.go:232 (*Probe).registerContainer master0: Register runc container 0a1684351eaf86417b128f18c586151db29363d19bb5b3ffa066be937ed81a0e and PID 14703
[snipped]
I created a pull request with some changes to improve the installation on OpenShift: #1564
@rbo @SchSeba do you think this issue can be closed, since it was opened to introduce cri-o support? If we discover issues or want to improve things, we can open new issues.
From my point of view: yes. OpenShift works very well with my pull request. I don't know about plain k8s.
Yes, I think we can close this issue now. Thanks for the help!
Thanks @SchSeba and @rbo for helping us add cri-o/OpenShift support!
You're welcome, ping me if you need further help!