Open hyson007 opened 4 years ago
Updating self.command in class CEOS to the following resolved the issue for me:
['/sbin/init', 'systemd.setenv=INTFTYPE=eth', 'systemd.setenv=ETBA=1', 'systemd.setenv=SKIP_ZEROTOUCH_BARRIER_IN_SYSDBINIT=1', 'systemd.setenv=CEOS=1', 'systemd.setenv=EOS_PLATFORM=ceoslab', 'systemd.setenv=container=docker']
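In docker-topo terms, the change amounts to something like the following sketch (the surrounding class details are assumed from context; only the self.command assignment is the actual fix):

```python
# Sketch of the fix in docker-topo's CEOS device class (class internals
# are illustrative). The key point: the systemd.setenv arguments must be
# passed to /sbin/init as command arguments, not only as container
# environment variables.
class CEOS:
    def __init__(self):
        self.command = [
            '/sbin/init',
            'systemd.setenv=INTFTYPE=eth',
            'systemd.setenv=ETBA=1',
            'systemd.setenv=SKIP_ZEROTOUCH_BARRIER_IN_SYSDBINIT=1',
            'systemd.setenv=CEOS=1',
            'systemd.setenv=EOS_PLATFORM=ceoslab',
            'systemd.setenv=container=docker',
        ]
```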
sounds right. do you want to do a pull request?
It seems there are still some issues with dynamic routing protocols: I'm unable to bring up OSPF. I'm working with Arista TAC on case 199976 and will update here once I have more info.
sw-1(config-router-ospf)#end
sw-1#sh ip os nei
% Internal error
% To see the details of this error, run the command 'show error 1'
sw-1#sh ip os nei
! OSPF inactive
sw-1#sh ip os nei
! OSPF inactive
sw-1#sh ip os nei
% Internal error
% To see the details of this error, run the command 'show error 2'
i think this is because you need to have at least one ethernet interface in up/up state.
Nope, I do have L3 interfaces in up/up state and they can ping each other, but OSPF can't be brought up; 'show logging' says Rib is continuously crashing. TAC is able to reproduce the issue, and OSPF works when they use an SVI. They claim it's this bug causing the issue:
BUG397410 affects all EOS versions. Kernel interfaces are the interfaces on which the VMs are installed.
Our Engineering team is working on this bug fix. As of now the work around would be to create an SVI, and have Ospf neighborship on a SVI instead of an ethernet interface.
But there seems to be no such issue on the older version, 4.20.5F. I will update once I have more info.
(I did encounter the scenario you mentioned, where no ethernet interface shows up in cEOS; in that case I can't even enable 'ip routing'. This time it seems different: I can at least enable 'ip routing'.)
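The SVI workaround TAC described would look roughly like this in EOS config — the VLAN number, interface, and addressing here are illustrative assumptions, not from the case notes:

```
! Sketch: move the L3/OSPF config off the ethernet interface onto a VLAN SVI
ip routing
!
vlan 10
!
interface Ethernet1
   switchport
   switchport access vlan 10
!
interface Vlan10
   ip address 10.0.0.1/24
!
router ospf 1
   router-id 1.1.1.1
   network 10.0.0.0/24 area 0
```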
I updated self.command, but am still getting the issue with version 4.22.1F:
kubectl exec -it arista01-5f4dcbdf77-99h9x Cli
Defaulting container name to router.
Use 'kubectl describe pod/arista01-5f4dcbdf77-99h9x -n default' to see all of the containers in this pod.
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"Cli\": executable file not found in $PATH": unknown
command terminated with exit code 126
you can check that command over here https://github.com/networkop/docker-topo/blob/master/bin/docker-topo#L416
@networkop - Still getting this issue while running it in a k8s cluster. I have no issues when I launch it as a separate docker container. I tried different Arista cEOS images and all have the problem when launched in a k8s cluster. I could get to bash but not Cli. I ran "ps -ef" after logging in to bash and see no processes running; in the one I launched as a separate docker container, I could see all the processes running.
bash-4.3# ps -ef
UID  PID  PPID  C  STIME  TTY    TIME      CMD
root 1    0     0  10:38  ?      00:00:00  /sbin/init systemd.setenv=INTFTYPE=eth systemd.setenv=ETBA=1 systemd.setenv=SKIP_ZEROTOUCH_BARRIER_IN_S
root 6    0     0  10:39  pts/0  00:00:00  bash
root 14   6     0  10:40  pts/0  00:00:00  ps -ef
kubectl describe pod arista05-bb8dcbf6b-mkn7m
Name: arista05-bb8dcbf6b-mkn7m
Namespace: default
Priority: 0
Node: k8s-agentpool-24376997-vmss000004/10.240.0.125
Start Time: Thu, 02 Apr 2020 03:38:06 -0700
Labels: app=aristatopo03
device=arista05
pod-template-hash=bb8dcbf6b
Annotations: kubernetes.io/psp: privileged
Status: Running
IP: 10.240.0.127
IPs:
IP: 10.240.0.127
Controlled By: ReplicaSet/arista05-bb8dcbf6b
Containers:
router:
Container ID: docker://285f718f4a04add8a9ce74fce60ad2ebea26081eaede75578ce5e9dd24603b82
Image: ccevirtnetpperegistry.azurecr.io/ceosimage:4.21.10M
Image ID: docker-pullable://ccevirtnetpperegistry.azurecr.io/ceosimage@sha256:9c1867f3e5f2e539f2a521f4ba443906ec2e6b5972cb6dc1b1e6faa902efe977
Port:
Warning FailedScheduling
I can't see where the error is. @vparames86 can you try launching it as a standalone pod, i.e. outside of k8s-topo?
@networkop - Even a standalone pod doesn't seem to work for me. This is the yaml I used. I put all the vars under command, and also tried putting everything other than /sbin/init under args, but it doesn't seem to work.
apiVersion: v1
kind: Pod
metadata:
  name: arista101
  namespace: default
spec:
  containers:
Could you please share a pod.yaml that works for you?
this one worked for me
apiVersion: v1
kind: Pod
metadata:
name: ceos
spec:
containers:
- image: ceos:4.23.2F
name: ceos
securityContext:
privileged: true
capabilities:
add:
- NET_ADMIN
command:
- "/sbin/init"
args:
- "systemd.setenv=INTFTYPE=eth"
- "systemd.setenv=ETBA=1"
- "systemd.setenv=SKIP_ZEROTOUCH_BARRIER_IN_SYSDBINIT=1"
- "systemd.setenv=CEOS=1"
- "systemd.setenv=container=docker"
- "systemd.setenv=EOS_PLATFORM=ceoslab"
env:
- name: CEOS
value: "1"
- name: EOS_PLATFORM
value: "ceoslab"
- name: container
value: docker
- name: SKIP_ZEROTOUCH_BARRIER_IN_SYSDBINIT
value: "1"
- name: INTFTYPE
value: eth
Ah, sec_context = client.V1SecurityContext(privileged=True) is missing from the create_nsm function. This is most probably the issue.
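In manifest terms, the missing piece is the privileged securityContext on the container. A minimal sketch of the container spec expressed as the dict a Kubernetes client serializes to (the function name and image tag are illustrative, not from k8s-topo):

```python
# Sketch: the container spec k8s-topo should produce for cEOS.
# The fix corresponds to client.V1SecurityContext(privileged=True);
# without "privileged": True, systemd inside the container cannot
# start the EOS agents, so Cli is never available.
def ceos_container_spec(image="ceos:4.23.2F"):
    return {
        "name": "ceos",
        "image": image,
        "securityContext": {
            "privileged": True,  # the missing piece
            "capabilities": {"add": ["NET_ADMIN"]},
        },
        "command": ["/sbin/init"],
        "args": [
            "systemd.setenv=INTFTYPE=eth",
            "systemd.setenv=ETBA=1",
            "systemd.setenv=SKIP_ZEROTOUCH_BARRIER_IN_SYSDBINIT=1",
            "systemd.setenv=CEOS=1",
            "systemd.setenv=container=docker",
            "systemd.setenv=EOS_PLATFORM=ceoslab",
        ],
    }
```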
@networkop - This worked thanks for your help
Getting the below error when starting the pod:
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"Cli\": executable file not found in $PATH": unknown
From Arista's recent ceos-lab README, it seems you need to pass some systemd.setenv args to /sbin/init, but looking at class CEOS it seems only environment variables are passed. (I tried to concatenate them into self.command but it doesn't work.) The README creates the docker instance with the needed environment variables like this:

docker create --name=ceos1 --privileged -e INTFTYPE=eth -e ETBA=1 -e SKIP_ZEROTOUCH_BARRIER_IN_SYSDBINIT=1 -e CEOS=1 -e EOS_PLATFORM=ceoslab -e container=docker -i -t ceosimage:4.21.0F /sbin/init systemd.setenv=INTFTYPE=eth systemd.setenv=ETBA=1 systemd.setenv=SKIP_ZEROTOUCH_BARRIER_IN_SYSDBINIT=1 systemd.setenv=CEOS=1 systemd.setenv=EOS_PLATFORM=ceoslab systemd.setenv=container=docker