Mierdin opened this issue 5 years ago
The pings in stage 5 are also not working.
There is some problem in the vQFX image, although we can ping the PFE:
antidote@vqfx> ping 169.254.0.1
PING 169.254.0.1 (169.254.0.1): 56 data bytes
64 bytes from 169.254.0.1: icmp_seq=0 ttl=64 time=1.877 ms
64 bytes from 169.254.0.1: icmp_seq=1 ttl=64 time=2.022 ms
^C
--- 169.254.0.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.877/1.950/2.022/0.072 ms
But we can't see the PFE:
{master:0}
antidote@vqfx> show chassis fpc pic-status
So we don't have any xe-0/0/* interfaces:
{master:0}
antidote@vqfx> show interfaces terse xe*
As you can see in the configuration, all the IP addresses are configured on xe-0/0/x interfaces, which is why we can't ping them.
Here is the output of my vqfx-full image:
{master:0}
root@vqfx> show chassis fpc pic-status
Slot 0 Online QFX10002-36Q
PIC 0 Online 48x 10G-SFP+
{master:0}
root@vqfx> show interfaces terse xe*
Interface Admin Link Proto Local Remote
xe-0/0/0 up up
xe-0/0/0.16386 up up
xe-0/0/1 up up
xe-0/0/1.16386 up up
Hmm. @mwiget may have thoughts. I'm signing off for now, will be back online tomorrow am.
Further testing the container image: it boots Junos when the container starts, so it takes a very long time for the PFE to be detected. After waiting about 10 minutes, I can see the PFE in the show chassis fpc pic-status output. Despite that, I am still unable to ping the local IP in another VR. Checking the interface counters, there are output packets on xe-0/0/0 but no input packets on xe-0/0/1, so we may need to debug the tap interfaces in the hypervisor.
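Debugging the taps could start with a capture on the hypervisor side while the ping runs. This is only a diagnostic sketch; the tap names (`tap-xe0`, `tap-xe1`) are placeholders, so check the qemu `-netdev` arguments for the real ones:

```shell
# List tap devices on the hypervisor to find the ones backing the xe ports
ip -br link show type tun

# Capture ICMP on both taps (names below are assumptions, not real ones)
tcpdump -ni tap-xe0 icmp &
tcpdump -ni tap-xe1 icmp
# If echo requests appear on tap-xe0 but never on tap-xe1, the frames are
# being dropped between the taps (e.g. no bridge connecting them).
```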
FYI, for my vqfx-full image I use the previous approach: wait for the vQFX to fully boot with the PFE online, then take a snapshot. The container image simply boots from the snapshot, so the PFE comes online as soon as the container starts.
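For reference, that snapshot approach can be sketched with QEMU's monitor-level VM state save/resume. The image path, monitor port, and snapshot tag here are assumptions for illustration, not the actual build scripts:

```shell
# 1. Boot the vQFX normally and wait until the PFE reports Online
#    (show chassis fpc pic-status inside Junos).

# 2. Save the full VM state (disk + RAM) into the qcow2 image via the
#    QEMU monitor (assumed here to be listening on tcp 4000):
echo 'savevm pfe-online' | nc 127.0.0.1 4000

# 3. On subsequent container starts, resume from that snapshot so the
#    PFE is already online instead of cold-booting Junos:
qemu-system-x86_64 -M pc --enable-kvm -cpu host -smp 1 -m 2048 \
  -drive file=vqfx.qcow2,format=qcow2 \
  -loadvm pfe-online
```

Note that `savevm`/`loadvm` require the VM state to be stored in a qcow2 image; raw disk images won't work with this approach.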
Regarding the >10 minutes: I wonder if nested KVM support is still unavailable. Cold booting 4 vQFX 18.4 instances on an i7-8700 takes less than 80 seconds, including LLDP neighbors over the xe interfaces. As cosim runs natively in the container, it just waits for the connection from Junos and won't add additional delay. I'll need to bring up a sample lab myself on the cloud platform to get a sample topology going; so far I've only done it locally. Unfortunately I'm busy with other topics until Thursday afternoon.
I'm a little confused as to why we're expecting traffic to leave xe-0/0/0 and re-enter xe-0/0/1? Wouldn't we need to bridge those together on the hypervisor to get that to work?
Ah... my bad. I forgot that in my vqfx-full image I created a Linux bridge to connect xe-0/0/0 & xe-0/0/1, and another bridge to connect xe-0/0/2 & xe-0/0/3.
Any suggestions for this scenario... or should we bring up 2 vQFX instances to test the ping?
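For context, the bridging in the vqfx-full image amounts to something like the following on the hypervisor. The tap interface names are illustrative; the real names depend on how qemu was launched:

```shell
# Bridge hairpinning xe-0/0/0 <-> xe-0/0/1 (tap names are assumptions)
ip link add br-xe01 type bridge
ip link set tap-xe0 master br-xe01
ip link set tap-xe1 master br-xe01
ip link set br-xe01 up

# Second bridge hairpinning xe-0/0/2 <-> xe-0/0/3
ip link add br-xe23 type bridge
ip link set tap-xe2 master br-xe23
ip link set tap-xe3 master br-xe23
ip link set br-xe23 up
```

Without these bridges, traffic leaving xe-0/0/0 has nowhere to go, which matches the counters observed earlier (output packets on xe-0/0/0, no input on xe-0/0/1).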
Re: HW virtualization, we're running in GCP with the appropriate flags enabled to ensure this is passed through to the instance. We can see this makes it all the way to the container/pod:
kubectl exec -it -n=25-e9zw2gzoq1ujhup0-ns vqfx /bin/bash
root@vqfx:/# grep --color vmx /proc/cpuinfo
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch tpr_shadow flexpriority ept fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat arch_capabilities
(....repeats per core...)
Then we can see the --enable-kvm
flag is present in the qemu command:
root@vqfx:/# ps -aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 79.9 3.2 2762396 2011420 ? Ssl 17:56 15:50 qemu-system-x86_64 -M pc --enable-kvm -cpu host -smp 1 -m 2048 -no-user-config -no-shutdown -monitor tcp:0.0.0.0:4000,server,nowait -serial telnet
root 118 0.0 0.0 18364 1688 ? S 17:56 0:00 /bin/bash /root/pecosim/pecosim_autorun.sh
root 125 1.8 0.4 8544732 295240 ? S 17:56 0:21 /root/pecosim/pe_cosim -e -t inet -p 3000
root 195 0.0 0.0 18496 2024 ? Ss 18:11 0:00 /bin/bash
root 215 0.0 0.0 34388 1468 ? R+ 18:16 0:00 ps -aux
It's entirely possible that there's some other form of CPU contention going on. In fact, running top even after the vQFX has booted shows 100% utilization for a while, presumably because the cosim is doing something during that time (since the PFE doesn't show up for a few minutes even after that).
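As one more sanity check on the KVM question: the cpuinfo flags only prove the CPU advertises VT-x to the pod; a diagnostic sketch like this confirms the KVM device itself is usable (qemu would normally refuse to start with --enable-kvm otherwise, so this mostly rules out a silent fallback):

```shell
# Check whether the KVM device node made it into the container
if [ -c /dev/kvm ]; then
  echo "/dev/kvm present - qemu can use KVM acceleration"
else
  echo "/dev/kvm missing - qemu would fall back to slow TCG emulation"
fi
```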
@jnpr-raylam You can do that, or even just another Linux container will work, provided you set an IP address on the appropriate interface within the lesson guide.
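As an illustration, if the far end is a plain Linux container, the lesson guide only needs something like the following. The interface name and addresses are examples, not the lesson's actual values:

```shell
# Inside the Linux container acting as the ping target
ip addr add 192.168.1.2/24 dev eth1   # eth1 = the interface facing the vQFX
ip link set eth1 up

# Then from the vQFX, assuming xe-0/0/0 carries 192.168.1.1/24:
#   antidote@vqfx> ping 192.168.1.2
```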
I am more than likely going to keep the JET lesson and the OpenConfig lesson in PTR while we sort these images issues out. We have a lot of other stuff that needs to get published ASAP so we can move forward with those, and then once we get these images issues sorted, I will circle back and promote these lessons in a quick release once again.
What about using an lt (logical tunnel) interface?
Regards,
Tony Chan
vQFX doesn't support tunnel interfaces.
Looking at https://ptr.labs.networkreliability.engineering/labs/?lessonId=25&lessonStage=4, it seems the ping command is meant to ping a local IP address, but there are no responses to this yet.
Might be worth looking into
/cc @jnpr-raylam @valjeanchan