GNS3 / gns3-registry

GNS3 devices registry
GNU General Public License v3.0

Update Juniper VQFX #469

Closed - grossmj closed this 1 year ago

grossmj commented 5 years ago

It seems that the vmxnet3 adapter type is required to run correctly in Qemu:

https://gns3.com/community/discussion/issue-with-juniper-vqfx-after-gn

grossmj commented 5 years ago

Some good information there too: https://gns3.com/community/discussion/vmx-doesn-t-work-with-gsn3-2-2

adosztal commented 5 years ago

I'll check it on my box.

adosztal commented 5 years ago

I found a Vagrantfile for vQFX in a Juniper repo; it uses 82540EM as the NIC type, which is not available in GNS3. Is it possible to add this type?

adosztal commented 5 years ago

Never mind, I just learned that 82540EM is the e1000.

adosztal commented 5 years ago

I checked the latest vMX (19.3R1.8); it works with the existing configuration. Moving on to vSRX and vQFX.

root> show chassis fpc 0  
                     Temp  CPU Utilization (%)   CPU Utilization (%)  Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      1min   5min   15min  DRAM (MB) Heap     Buffer
  0  Online           Testing   4         0        3      3      2    511        32          0

root> show chassis fpc pic-status 
Slot 0   Online       Virtual FPC                                   
  PIC 0  Online       Virtual

Note: the data interfaces are available too (I connected a VPCS and pinged it), I just forgot to copy that output here.

adosztal commented 4 years ago

vSRX: I tried the latest version here too; it works with vmxnet3 but not with e1000. I changed the NIC type in 3862020.

The interfaces are up and communicating:

root> show interfaces fxp0 terse 
Interface               Admin Link Proto    Local                 Remote
fxp0                    up    up
fxp0.0                  up    up   inet     10.0.0.1/24     

root> show interfaces ge-0/0/0 terse                                          
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                up    up
ge-0/0/0.0              up    up   inet     192.168.1.1/24  

root> ping 10.0.0.10 
PING 10.0.0.10 (10.0.0.10): 56 data bytes
64 bytes from 10.0.0.10: icmp_seq=0 ttl=64 time=1.944 ms
64 bytes from 10.0.0.10: icmp_seq=1 ttl=64 time=1.894 ms
^C
--- 10.0.0.10 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.894/1.919/1.944/0.025 ms

root> ping 192.168.1.10
PING 192.168.1.10 (192.168.1.10): 56 data bytes
64 bytes from 192.168.1.10: icmp_seq=0 ttl=64 time=0.060 ms
64 bytes from 192.168.1.10: icmp_seq=1 ttl=64 time=0.055 ms
^C
--- 192.168.1.10 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.055/0.057/0.060/0.003 ms

Note: ping must be enabled on the interfaces, otherwise it will drop both incoming and outgoing requests. Example command: set security zones security-zone trust interfaces ge-0/0/0.0 host-inbound-traffic system-services ping

adosztal commented 4 years ago

vQFX: it's acting strange. I see the RE starting a TCP session to the PFE, then the PFE sends a TCP reset. This loops indefinitely. (screenshot attached)

jimealbe commented 4 years ago

I checked the latest vMX (19.3R1.8); it works with the existing configuration. Moving on to vSRX and vQFX.

root> show chassis fpc 0  
                     Temp  CPU Utilization (%)   CPU Utilization (%)  Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      1min   5min   15min  DRAM (MB) Heap     Buffer
  0  Online           Testing   4         0        3      3      2    511        32          0

root> show chassis fpc pic-status 
Slot 0   Online       Virtual FPC                                   
  PIC 0  Online       Virtual

Note: the data interfaces are available too (I connected a VPCS and pinged it), I just forgot to copy that output here.

Yeah, I got a 17.x vMX working without changing anything, but the issue is with the pre-release (14.x) versions. For those, all interfaces used to be e1000 prior to GNS3 2.2.1; now the first two are fine as e1000, but the rest have to be changed to virtio-net-pci.

I also got the vSRX working fine, but I can't seem to get the vQFX working.

adosztal commented 4 years ago

What's the reason for using the pre-release vMX versions? Our lives would be much easier, at least with the vMX, if those were not supported as an appliance, only through some guide.

adosztal commented 4 years ago

The RE supports virtio, it is recognized during boot:

em0: <VirtIO Networking Adapter> on virtio_pci0
virtio_pci0: host features: 0x79bf8064 <EventIdx,RingIndirect,0x8000000,NotifyOnEmpty,0x800000,0x200000,RxModeExtra,VLanFilter,RxMode,ControlVq,Status,MrgRxBuf,TxAllGSO,MacAddress,0x4>
virtio_pci0: negotiated features: 0x110f8020 <RingIndirect,NotifyOnEmpty,VLanFilter,RxMode,ControlVq,Status,MrgRxBuf,MacAddress>

The issue is likely with the PFE or with the communication (the link) to the PFE.

adosztal commented 4 years ago

I booted both with KVM but outside GNS3 (all NICs are in vmnet2, a host adapter for VMware Workstation). As you can see, they're communicating well: the PFE (e1000) is seen in the RE (virtio) as an FPC and the xe-0/0/x interfaces are present. (screenshots attached)

Were there any changes in ubridge between the 2 versions?

adosztal commented 4 years ago

It turned out we just have to wait a bit (when using e1000 for the PFE and virtio for the RE). :)

FPC status right after the boot:

root> show chassis fpc 0
                     Temp  CPU Utilization (%)   CPU Utilization (%)  Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      1min   5min   15min  DRAM (MB) Heap     Buffer
  0  Empty           

The following appears after ~2 minutes:

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: SCHED: Thread 28 (cmqfx_pseudo) ran for 1166 ms without yielding

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Scheduler Oinker

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 0: sp = 0x193cf6e0, pc = 0x8048b2a

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 1: sp = 0x193cf6f0, pc = 0x80567df

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 2: sp = 0x193cf7c0, pc = 0x8058548

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 3: sp = 0x193cf7f0, pc = 0x89848ec

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 4: sp = 0x193cf860, pc = 0x8997ed9

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 5: sp = 0x193cf8b0, pc = 0x8da7e8e

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 6: sp = 0x193cfb40, pc = 0x8d913b3

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 7: sp = 0x193cfb60, pc = 0x8d93e88

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 8: sp = 0x193cfbb0, pc = 0x8057ec0

Message from syslogd@ at Nov  9 18:40:24  ...
 rpio_tunnel_br[1876]: Frame 9: sp = 0x193cfbb8, pc = 0x0

The uptime after this is 00:02:01:

root> show system uptime
fpc0:
--------------------------------------------------------------------------
Current time: 2019-11-09 18:40:26 UTC
Time Source:  LOCAL CLOCK 
System booted: 2019-11-09 18:38:12 UTC (00:02:14 ago)
Protocols started: 2019-11-09 18:38:41 UTC (00:01:45 ago)
Last configured: 2019-11-09 18:38:38 UTC (00:01:48 ago) by root
 6:40PM  up 2 mins, 1 user, load averages: 0.84, 0.58, 0.25

The FPC is present:

root> show chassis fpc 0
                     Temp  CPU Utilization (%)   CPU Utilization (%)  Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      1min   5min   15min  DRAM (MB) Heap     Buffer
  0  Online           Testing   0         0        0      0      0      0         0          0

{master:0}
root> show chassis fpc pic-status 
Slot 0   Online       QFX10002-36Q                                  
  PIC 0  Online       48x 10G-SFP+

At least it works with the latest image when using the appliance file from #477. I encourage everyone to upgrade.

adosztal commented 4 years ago

The XE interfaces are also visible:

root> show interfaces terse | match xe | except inet 
xe-0/0/0                up    up
xe-0/0/1                up    up
xe-0/0/2                up    up
xe-0/0/3                up    up
xe-0/0/4                up    up
xe-0/0/5                up    up
xe-0/0/6                up    up
xe-0/0/7                up    up
xe-0/0/8                up    up
xe-0/0/9                up    up
xe-0/0/10               up    up
xe-0/0/11               up    up

I suggest closing this, maybe after some further testing, because all three Juniper appliances are working - again, at least their latest versions.

jimealbe commented 4 years ago

What's the reason for using the pre-release vMX versions? Our lives would be much easier, at least with the vMX, if those were not supported as an appliance, only through some guide.

The pre-release uses far fewer resources compared to the full version: the RE and PFE are part of the same image, it runs perfectly with only 512 MB of RAM (and can work with even less), and it supports most basic/intermediate features and even some advanced ones. For people who don't have a powerful computer to virtualize on, or as an introduction to Junos, the pre-release is one of the best options. At only 500-700 MB depending on the format (ova, img, qcow), it can be quickly and easily loaded into VirtualBox, VMware, or QEMU.

I have 5 different Junos appliances (Olive, vMX pre-release, vSRX, vQFX, and vMX full), and I can tell you that so far the one I have used the most is the pre-release. You can run several pre-release instances at the same time and use logical systems and routing instances to build really big topologies, but on a low/mid-range PC you may only be able to run 2 full vMX instances. I see more advantages in the pre-release than in the full version.

grossmj commented 4 years ago

Should we create 2 different appliance files, one for the vMX pre-release and one for the full vMX, to make a clear separation?

jimealbe commented 4 years ago

The XE interfaces are also visible:

root> show interfaces terse | match xe | except inet 
xe-0/0/0                up    up
xe-0/0/1                up    up
xe-0/0/2                up    up
xe-0/0/3                up    up
xe-0/0/4                up    up
xe-0/0/5                up    up
xe-0/0/6                up    up
xe-0/0/7                up    up
xe-0/0/8                up    up
xe-0/0/9                up    up
xe-0/0/10               up    up
xe-0/0/11               up    up

I suggest closing this, maybe after some further testing, because all three Juniper appliances are working - again, at least their latest versions.

Thanks for looking into this and providing a possible solution. Can you confirm ping also works? I was playing around with interface types and at some point I got the xe interfaces to show up, but ping was not working. I will be testing later on, thanks!

adosztal commented 4 years ago

Thanks for looking into this and providing a possible solution. Can you confirm ping also works? I was playing around with interface types and at some point I got the xe interfaces to show up, but ping was not working. I will be testing later on, thanks!

It does, look at this thread in the forum.

adosztal commented 4 years ago

Should we create 2 different appliance files, one for the vMX pre-release and one for the full vMX, to make a clear separation?

Yes, we should. There's just one problem though: the appliance schema currently does not support per-interface drivers. I'll create a separate issue for this.

jimealbe commented 4 years ago

Thanks for looking into this and providing a possible solution. Can you confirm ping also works? I was playing around with interface types and at some point I got the xe interfaces to show up, but ping was not working. I will be testing later on, thanks!

It does, look at this thread in the forum.

Yeah, I just confirmed that for the latest version as well as for version 15.1X53-D60.4.

adosztal commented 4 years ago

@jimealbe, could you share an image with me so I can create the appliance file? If that's not possible, I'll need the filename, its size in bytes, and the md5sum.
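
For example, the output of something like this on the machine holding the image would cover all of it (the filename below is just a placeholder, use the real one):

ls -l vmx-pre-release.img      # the size in bytes is in the ls output
md5sum vmx-pre-release.img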

adosztal commented 4 years ago

@jimealbe, can you provide the above?

adosztal commented 4 years ago

The existing appliances work, and there's no info about the pre-release versions. I think we can close this.

sliddjur commented 4 years ago

Hi @adosztal, I have gotten the latest 18.4R2-S2 PFE and 18.4R2-S2.3 RE from the Juniper support download site. I have a similar issue: after the first boot everything seemed fine, but on the second run I never get it working properly. It seems to be random whether the FPC starts up or not.

I have a server with 12 cores (Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz) and 48 GB of RAM. I run the GNS3 VM on an ESXi server. The PFE is using e1000 and the RE is using virtio-net-pci. Running GNS3 2.2.7.

{linecard:1}[edit]
root@vqfx# run show system uptime 
fpc1:
--------------------------------------------------------------------------
Current time: 2020-04-12 22:55:58 UTC
Time Source:  LOCAL CLOCK 
System booted: 2020-04-12 22:13:01 UTC (00:42:57 ago)
Last configured: 2020-04-12 22:18:21 UTC (00:37:37 ago) by root
10:55PM  up 43 mins, 1 user, load averages: 1.00, 0.98, 0.92

root@vqfx# run show chassis fpc 0    
Aborted! This command can only be used on the master routing engine.

root@vqfx# run show interfaces terse | grep xe-0 

root@vqfx# show interfaces | display set 
set interfaces xe-0/0/0 unit 0 family inet address 10.18.8.5/31
set interfaces em0 unit 0 family inet dhcp
set interfaces em1 unit 0 family inet address 169.254.0.2/24

sliddjur commented 4 years ago

And on another boot.

Message from syslogd@vqfx-re at Apr 12 23:23:21  ...
vqfx-re olive-ultimat.elf: Frame 4: sp = 0xd67e948, pc = 0x8057f40

Message from syslogd@vqfx-re at Apr 12 23:23:21  ...
vqfx-re olive-ultimat.elf: Frame 5: sp = 0xd67e950, pc = 0x0

{master:0}
root@vqfx-re> show chassis fpc         
                     Temp  CPU Utilization (%)   CPU Utilization (%)  Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      1min   5min   15min  DRAM (MB) Heap     Buffer
  0  Empty           
  1  Empty           
  2  Empty           
  3  Empty           
  4  Empty           
  5  Empty           
  6  Empty           
  7  Empty           
  8  Empty           
  9  Empty 

And still no xe-0/0/0 working on em3 in gns3.

sliddjur commented 4 years ago

To me, it seems to always work on the first boot, but I am having various issues on boots after that. If I set up a new pair of PFE and RE, it works until rebooted.

However, I can see there is a connection between the PFE and RE, and they can ping each other. They communicate on tcp/3000, which the PFE is listening on.

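For reference, the session can also be checked from the RE CLI itself; an ESTABLISHED entry towards the PFE's address should show up (addresses will differ per setup):

root@vqfx> show system connections | match 3000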

anubisg1 commented 4 years ago

I'm having a weird problem with the vQFX. On the first boot everything works just fine, but after I reboot the appliances (power off followed by power on), the RE complains that it isn't the master of a VC, and the PFE is moved from slot 0 to slot 1 (basically breaking all existing config, since it's now applied to different ports).

root@LEAF-3:LC:1% cli
warning: This chassis is operating in a non-master role as part of a virtual-chassis (VC) system.
warning: Use of interactive commands should be limited to debugging and VC Port operations.
warning: Full CLI access is provided by the Virtual Chassis Master (VC-M) chassis.
warning: The VC-M can be identified through the show virtual-chassis status command executed at this console.
warning: Please logout and log into the VC-M to use CLI.
{linecard:1}

root@LEAF-3> request virtual-chassis reactivate    

This member split from a virtual chassis. Please make sure that no active
switch belonging to this virtual chassis has conflicting configuration.

Do you want to continue ? [yes,no] (no) yes 
root@LEAF-3:RE:1% cli
{master:1}

root@LEAF-3> show chassis fpc 
                     Temp  CPU Utilization (%)   CPU Utilization (%)  Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      1min   5min   15min  DRAM (MB) Heap     Buffer
  0  Empty           
  1  Online           Testing  87        12        0      0      0    1920        0         42
  2  Empty           
  3  Empty           
  4  Empty           
  5  Empty           
  6  Empty           
  7  Empty           
  8  Empty           
  9  Empty           

{master:1}
root@LEAF-3> show interfaces terse 
Interface               Admin Link Proto    Local                 Remote
gr-0/0/0                up    up
pfe-1/0/0               up    up
pfe-1/0/0.16383         up    up   inet    
                                   inet6   
pfh-1/0/0               up    up
pfh-1/0/0.16383         up    up   inet    
pfh-1/0/0.16384         up    up   inet    
xe-1/0/0                up    up
xe-1/0/0.16386          up    up  
xe-1/0/1                up    up
xe-1/0/1.16386          up    up  
xe-1/0/2                up    up
xe-1/0/2.16386          up    up  
xe-1/0/3                up    up
xe-1/0/3.16386          up    up  
xe-1/0/4                up    up
xe-1/0/4.16386          up    up  
xe-1/0/5                up    up
xe-1/0/5.16386          up    up  
xe-1/0/6                up    up
xe-1/0/6.16386          up    up  
xe-1/0/7                up    up
---(more)---

sliddjur commented 4 years ago

I'm having a weird problem with the vQFX. On the first boot everything works just fine, but after I reboot the appliances (power off followed by power on), the RE complains that it isn't the master of a VC, and the PFE is moved from slot 0 to slot 1 (basically breaking all existing config, since it's now applied to different ports).

@anubisg1 this sounds like my problem as well. I didn't try the request virtual-chassis reactivate command though. What vQFX version are you using? I tried many different ones (18.4, 18.1, 19.4, and 17.4R1); only 17.4R1 seems to work properly after a reboot, even though all of them have the same settings in the GNS3 template.

I use these files:

gns3@gns3vm:/opt/gns3/images/QEMU$ md5sum cosim-18.4R1.8_20180212.qcow2
0372e9c1b7df3608099186ab8cbbf2ad  cosim-18.4R1.8_20180212.qcow2
gns3@gns3vm:/opt/gns3/images/QEMU$ md5sum jinstall-vqfx-10-f-17.4R1.16.img
dd83313b0f5beaf68488ed3d5e1e5240  jinstall-vqfx-10-f-17.4R1.16.img

anubisg1 commented 4 years ago

I'm having a weird problem with the vQFX. On the first boot everything works just fine, but after I reboot the appliances (power off followed by power on), the RE complains that it isn't the master of a VC, and the PFE is moved from slot 0 to slot 1 (basically breaking all existing config, since it's now applied to different ports).

@anubisg1 this sounds like my problem as well. I didn't try the request virtual-chassis reactivate command though. What vQFX version are you using? I tried many different ones (18.4, 18.1, 19.4, and 17.4R1); only 17.4R1 seems to work properly after a reboot, even though all of them have the same settings in the GNS3 template.

I use these files:

gns3@gns3vm:/opt/gns3/images/QEMU$ md5sum cosim-18.4R1.8_20180212.qcow2
0372e9c1b7df3608099186ab8cbbf2ad  cosim-18.4R1.8_20180212.qcow2
gns3@gns3vm:/opt/gns3/images/QEMU$ md5sum jinstall-vqfx-10-f-17.4R1.16.img
dd83313b0f5beaf68488ed3d5e1e5240  jinstall-vqfx-10-f-17.4R1.16.img

Hello. First of all, remember that on 19.4 your PFE image is no longer the "cosim". Regardless though, the way I made it work was:

1) request virtual-chassis reactivate
2) request system reboot

Then it gets back to normal.

sliddjur commented 4 years ago

Hello. First of all, remember that on 19.4 your PFE image is no longer the "cosim". Regardless though, the way I made it work was:

  1. request virtual-chassis reactivate
  2. request system reboot

Then it gets back to normal.

@anubisg1 yeah, I have these images for the other files. The PFE for 18 and 19 is actually the same on the Juniper website. Which vQFX versions are you using? When you reactivate and then request system reboot, will it break again after you power off in GNS3?

d8b68ba6b8c987717f5298ba9292d3a4  vqfx-18.4R2-S2-2019010209-pfe-qemu.qcow
437032f226a47e82e3a924ba7a4134c8  vqfx-18.4R2-S2.3-re-qemu.qcow2

d8b68ba6b8c987717f5298ba9292d3a4  vqfx-19.4R1-2019010209-pfe-qemu.qcow
42aa81054ad378cb5480c865ed42b7be  vqfx-19.4R1.10-re-qemu.qcow2

anubisg1 commented 4 years ago

@anubisg1 yeah, I have these images for the other files. The PFE for 18 and 19 is actually the same on the Juniper website. Which vQFX versions are you using? When you reactivate and then request system reboot, will it break again after you power off in GNS3?

Yes, it will. You need to do it every time you boot the VM. But the good thing is that you don't need to wait for the FPC to come online; you can do that as soon as the RE image comes up.

sliddjur commented 4 years ago

Yes, it will. You need to do it every time you boot the VM. But the good thing is that you don't need to wait for the FPC to come online; you can do that as soon as the RE image comes up.

Ok great, this will be good information for future web searches. However, I have no idea how to get around the reboot issue. Hopefully the GNS3 team can pinpoint something for us.

adam-kulagowski commented 4 years ago

Unfortunately, using virtio-net-pci with the vQFX RE (versions 18.4 and 19.4) does not work.

For example, LLDP is flaky (communication works in only one direction, or there is no neighbor adjacency at all). This is highly random: in roughly 25% of cases everything works fine.

However, when I start a manually crafted QFX RE on KVM with e1000 interfaces, everything works without an issue. What is more interesting is that when I copy the exact command line from ps auxww that was launched by the GNS3 machine and launch that RE by hand from bash, the FPC comes up fine. So this is not a KVM issue but a GNS3 one.
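
For completeness, something along these lines is enough to grab that command line (the [q] just keeps grep from matching itself):

ps auxww | grep [q]emu-system-x86_64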

If I should open a new issue for this behavior, please let me know.

adam-kulagowski commented 4 years ago

Found the issue. After the VM is booted up under GNS3, its links are brought down:

INFO qemu_vm.py:1109 Connected to QEMU monitor on 127.0.0.1:44587 after 0.0106 seconds
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-2 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-3 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-4 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-5 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-6 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-7 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-8 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-9 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-10 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-11 off
INFO qemu_vm.py:1174 Execute QEMU monitor command: set_link gns3-12 off

This breaks the vQFX with e1000.

Commenting out the lines in the for loop inside the _control_vm_commands function in qemu_vm.py fixes the issue.

grossmj commented 4 years ago

I think we should provide a way to deactivate that.

loco11011 commented 4 years ago

Yo @grossmj, may I know if there are any plans to fix this issue?

grossmj commented 4 years ago

Yes, we will try to fix this soon.

adam-kulagowski commented 4 years ago

@loco11011 if you don't want to comment out the code (or wait for the fix), you can use one of these workarounds. First find which TCP port is allocated for the QFX RE monitor:
1) run ps auxww on the GNS3 server
2) search for your RE VM
3) look for a string like -monitor tcp:127.0.0.1:39321,server,nowait (the port can be different)
4) bring all interfaces up using the snippet below (use the port from step 3):

cat <<END | nc -N 127.0.0.1 39321
set_link gns3-1 on
set_link gns3-2 on
set_link gns3-3 on
set_link gns3-4 on
set_link gns3-5 on
set_link gns3-6 on
set_link gns3-7 on
set_link gns3-8 on
set_link gns3-9 on
set_link gns3-10 on
set_link gns3-11 on
END

OR:

1) connect all vQFX interfaces to some dummy switch

In both cases the vQFX came up with the FPC. If you were too slow with the first solution, you can always restart the PFE using the restart chassis-control command on the RE.
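
If you don't want to eyeball the ps output by hand, something like this (GNU grep assumed) prints just the monitor ports of all running QEMU VMs:

ps auxww | grep -oE 'monitor tcp:127\.0\.0\.1:[0-9]+' | sort -u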

grossmj commented 4 years ago

For info, the fix is ready and will be available in our next release later this week: https://github.com/GNS3/gns3-server/issues/1767

loco11011 commented 4 years ago

@adam-kulagowski and @grossmj , appreciate it!

I tried it manually, but GNS3 told me that -N is an invalid option in "cat <<END | nc -N 127.0.0.1 39321", maybe because the GNS3 VM I use is not the latest (which I will update).

But the issue I am facing is that I get an endless TCP loop between the RE and PFE like @adosztal showed above. Does anyone know if this is related to the same qemu_vm.py issue?

Edit: by TCP loop I mean that the RE sends TCP resets to the initiating TCP connections from the PFE.

(screenshot of the TCP loop attached)

adam-kulagowski commented 4 years ago

@loco11011 I'm using Ubuntu 18.04 for the GNS3 VM. Try the second solution (attaching all vQFX interfaces to a dummy switch - this will force all interfaces to be in the UP state in GNS3).
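
By the way, if -N was the only problem with the snippet, the netcat shipped with recent Ubuntu should also accept -q 1 (close one second after EOF); I haven't tested this variant myself, e.g. with just the first two links:

cat <<END | nc -q 1 127.0.0.1 39321
set_link gns3-1 on
set_link gns3-2 on
END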

What you are seeing is this:
1) the RE initiates a connection to the daemon on the PFE VM on port 3000
2) during the information exchange the PFE daemon crashes
3) port 3000 becomes unavailable until a watchdog on the PFE VM restarts the crashed process (this takes some time) - this is most likely the part you captured in Wireshark
4) at some point the PFE restarts the daemon
5) go to step 1

This loop repeats a few times (<10). After that the RE writes in the log that packet-forwarding-engine keeps thrashing and stops restarting it. At this point you have to either reboot the RE VM or issue restart chassis-control.
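
You can usually confirm that state from the RE logs as well; the exact wording differs between releases, but something like:

root@vqfx-re> show log messages | match packet-forwarding-engine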

loco11011 commented 4 years ago

@adam-kulagowski, I am experiencing it the other way around: it is the RE that sends the TCP resets. I tried the second solution but it still did not work.

I am using version 18.1R1.9, and the weird thing is that I have to manually change the em1 IP address of one of the appliances (169.254.0.0/24) because by default they have been preconfigured with the exact same IP (169.254.0.2).
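
For reference, this is the kind of change I mean; the .3 address is just an example, anything unused in 169.254.0.0/24 will do:

root# delete interfaces em1 unit 0 family inet address 169.254.0.2/24
root# set interfaces em1 unit 0 family inet address 169.254.0.3/24
root# commit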

I am guessing that this version of the vQFX is not stable.

Edit: sometimes it is the RE that sends the TCP resets and other times it is the PFE; maybe this version is not stable.

adam-kulagowski commented 4 years ago

@loco11011 I recommend going with 19.4R1, available on the Juniper web page. I think (I never tried it myself) you can extract the disk images from the Vagrant boxes (which are freely available). However, I think we are going slightly off-topic.

loco11011 commented 4 years ago

@adam-kulagowski I appreciate it

ekdr commented 4 years ago

For info, the fix is ready and will be available in our next release later this week: GNS3/gns3-server#1767

It seems the original issue is somewhat back. A mate and I are having issues with the vQFX. The vMX seems to be OK, but the vQFX with the config from the appliance (e1000 and virtio interfaces for the RE/PFE) does not work properly: interfaces come up but there is no traffic. I believe the issue was originally introduced in 2.2 and seems to be back in 2.2.10. Can you or somebody else confirm whether the vQFX broke again? At least it did for us.

ekdr commented 4 years ago

@grossmj @adosztal I can provide you guys with the vMX pre-release image in case you are still willing to create the appliance for it; let me know.

grossmj commented 4 years ago

@grossmj @adosztal I can provide you guys with the vMX pre-release image in case you are still willing to create the appliance for it; let me know.

Do you mean to create a vMX appliance for the latest version just before they moved to the vCP/vFP model? It would be interesting indeed.

I believe the issue was originally introduced in 2.2 and seems to be back in 2.2.10. Can you or somebody else confirm whether the vQFX broke again? At least it did for us.

Did it work in v2.2.8 or v2.2.9?

ekdr commented 4 years ago

@grossmj @adosztal I can provide you guys with the vMX pre-release image in case you are still willing to create the appliance for it; let me know.

Do you mean to create a vMX appliance for the latest version just before they moved to the vCP/vFP model? It would be interesting indeed.

I believe the issue was originally introduced in 2.2 and seems to be back in 2.2.10. Can you or somebody else confirm whether the vQFX broke again? At least it did for us.

Did it work in v2.2.8 or v2.2.9?

Correct, the latest I know of right before vCP/vFP. I will upload it and share a Google Drive link; I guess that should work for you.

Unfortunately I don't know about .8 or .9, as I may have skipped those. I know it worked on a couple of versions after the original e1000/virtio issue was fixed; I'm not sure which versions I may have skipped, but I know for sure it worked on a couple of versions and then broke again just recently with .10, for me and a friend of mine.

grossmj commented 4 years ago

Correct, the latest I know of right before vCP/vFP. I will upload it and share a Google Drive link; I guess that should work for you.

Yep that would work for us. Thanks :)

Unfortunately I don't know about .8 or .9, as I may have skipped those. I know it worked on a couple of versions after the original e1000/virtio issue was fixed; I'm not sure which versions I may have skipped, but I know for sure it worked on a couple of versions and then broke again just recently with .10, for me and a friend of mine.

Ok, I am going to investigate this.

ekdr commented 4 years ago

Correct, the latest I know of right before vCP/vFP. I will upload it and share a Google Drive link; I guess that should work for you.

Yep that would work for us. Thanks :)

Unfortunately I don't know about .8 or .9, as I may have skipped those. I know it worked on a couple of versions after the original e1000/virtio issue was fixed; I'm not sure which versions I may have skipped, but I know for sure it worked on a couple of versions and then broke again just recently with .10, for me and a friend of mine.

Ok, I am going to investigate this.

Thank you guys for taking the time to do this. Please let me know where I should send the link; I guess we can't share images here. I have taken a screenshot of my settings for the pre-release version. I know it works with even less than 512 MB of RAM, but I like to use 1024 MB. Ethernet ports start from interface 2 (the first two are for internal purposes, as with the other images); these are ge interfaces, but you guys may know better than me. (screenshot attached)

Regarding the vQFX issue: interfaces come up and show fine as xe interfaces, but ping does not work, which makes me believe the issue is with the RE-to-PFE link.

grossmj commented 4 years ago

I tried the vMX pre-release jinstall-vmx-14.1R4.8-domestic.bin image and here are some of my findings.

Using the image with virtio-net-pci adapters

I get the following errors when the image boots:

em0error setting host MAC filter table
em1error setting host MAC filter table
em1error setting host MAC filter table
em1error setting host MAC filter table
em1error setting host MAC filter table
em10error setting host MAC filter table
em11error setting host MAC filter table
em2error setting host MAC filter table
em3error setting host MAC filter table
em4error setting host MAC filter table
em5error setting host MAC filter table
em6error setting host MAC filter table
em7error setting host MAC filter table
em8error setting host MAC filter table
em9error setting host MAC filter table

PIC 0 is online:

root> show chassis fpc pic-status    
Slot 0   Online       Virtual FPC
  PIC 0  Online       Virtual 10x1GE PIC

and I have 12 interfaces em0 to em11:

root> show interfaces terse em*  
Interface               Admin Link Proto    Local                 Remote
em0                     up    up
em1                     up    up
em1.0                   up    up   inet     172.16.0.1/16   
                                   inet6    fe80::eef:6fff:fe75:4201/64
em2                     up    up
em3                     up    up
em4                     up    up
em5                     up    up
em6                     up    up
em7                     up    up
em8                     up    up
em9                     up    up
em10                    up    up
em11                    up    up

Booting with or without the "replicate network connection states in Qemu" option doesn't seem to change anything.

Using the image with e1000 adapters without the "replicate network connection states in Qemu" option

I don't see any error like with virtio-net-pci adapters.

I have interfaces em1 to em11:

root> show interfaces terse em*    
Interface               Admin Link Proto    Local                 Remote
em1                     up    up
em1.0                   up    up   inet     172.16.0.1/16   
                                   inet6    fe80::eef:6fff:fe81:d001/64
em2                     up    up
em3                     up    up
em4                     up    up
em5                     up    up
em6                     up    up
em7                     up    up
em8                     up    up
em9                     up    up
em10                    up    up
em11                    up    up

and I don't see any em0 but an fxp0 interface instead:

root> show interfaces terse fxp* 
Interface               Admin Link Proto    Local                 Remote
fxp0                    up    up

PIC 0 is online:

root> show chassis fpc pic-status    
Slot 0   Online       Virtual FPC
  PIC 0  Online       Virtual 10x1GE PIC

Using the image with e1000 adapters with the "replicate network connection states in Qemu" option

The appliance boots and shows:

root> show chassis fpc pic-status    
Slot 0   Online       Virtual FPC

A few minutes later PIC 0 is offline, and then it keeps looping between online and offline states.

root> show chassis fpc pic-status    
Slot 0   Offline      Virtual FPC

In all situations, I could ping em2 from a VPCS node:

root# set interfaces em2 unit 0 family inet address 10.0.0.2/8
PC1> ping 10.0.0.2
84 bytes from 10.0.0.2 icmp_seq=1 ttl=64 time=2.016 ms
84 bytes from 10.0.0.2 icmp_seq=2 ttl=64 time=1.027 ms
84 bytes from 10.0.0.2 icmp_seq=3 ttl=64 time=0.884 ms
84 bytes from 10.0.0.2 icmp_seq=4 ttl=64 time=0.951 ms
84 bytes from 10.0.0.2 icmp_seq=5 ttl=64 time=0.828 ms

Questions

What is the difference between having an em0 and an fxp0 interface? Does it matter?

Having the "replicate network connection states in Qemu" option activated when using e1000 adapters definitely breaks the appliance. virtio-net-pci seems fine with that.

So what config should we use? e1000 adapters without replication or just virtio-net-pci?