acantril / learn-cantrill-io-labs

Standard and Advanced Demos for learn.cantrill.io courses
MIT License

Cannot start FRRouting service in AWS_HYBRID_AdvancedVPN Lab #5

Closed cuongtruonghcm closed 4 years ago

cuongtruonghcm commented 4 years ago

At stage 4, after running ./ffrouting-install.sh and rebooting the server, the frr service cannot start. I have tried to fix this error many times, but it still fails.

acantril commented 4 years ago

Do you have the latest version of the repo with the hotfix, i.e. the chmod 740 command that needs to be run right after the reboot?


cuongtruonghcm commented 4 years ago

I think I already have the latest version, because at STAGE 1 I provisioned the environment with your CloudFormation files, one for OnPrem and one for AWS. I'm following your guide step by step, and I haven't seen a chmod 740 command. Could you please explain in more detail, based on your guide?

acantril commented 4 years ago

If you don't see the chmod 740 thing in the instructions ... you need the latest copy of the repo, which includes updated instructions.


cuongtruonghcm commented 4 years ago

OK, thanks for the update. I can see your new guide. Yesterday I also tried to fix this issue myself (chmod 755 /var/run/frr); the service ran fine, but ping failed and the S2S VPN connection status in the AWS Console was still DOWN.

This afternoon I redid STAGE 4 of the lab following your new guide, but ping still fails and the status is still DOWN.
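For reference, the two permission modes mentioned in this thread (740 from the hotfix and 755 from the manual attempt) can be decoded with Python's standard `stat` module. This is only an illustrative sketch: the helper name `describe_dir_mode` is made up here, and the thread does not say which path the hotfix's chmod 740 is applied to.

```python
import stat


def describe_dir_mode(mode: int) -> str:
    """Return the ls-style permission string for a directory with this mode.

    Hypothetical helper for illustration; it only formats the bits and
    does not touch the filesystem.
    """
    return stat.filemode(stat.S_IFDIR | mode)


if __name__ == "__main__":
    for mode in (0o740, 0o755):
        # Print the octal mode next to its symbolic form.
        print(oct(mode), describe_dir_mode(mode))
```

Mode 740 gives the owner full access and the group read-only access, with nothing for others; 755 additionally lets group and others read and traverse the directory.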

root@ip-192-168-12-49:/var/snap/amazon-ssm-agent/1566# systemctl status frr
● frr.service - FRRouting
   Loaded: loaded (/etc/systemd/system/frr.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-10-26 07:59:01 UTC; 6min ago
     Docs: https://frrouting.readthedocs.io/en/latest/setup.html
  Process: 1102 ExecStop=/usr/lib/frr/frrinit.sh stop (code=exited, status=0/SUCCESS)
  Process: 1418 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
   Status: "FRR Operational"
    Tasks: 13 (limit: 2287)
   CGroup: /system.slice/frr.service
           ├─1435 /usr/lib/frr/watchfrr -d -F traditional zebra bgpd staticd
           ├─1456 /usr/lib/frr/zebra -d -F traditional -A 127.0.0.1 -s 90000000
           ├─1464 /usr/lib/frr/bgpd -d -F traditional -A 127.0.0.1 -M rpki
           └─1475 /usr/lib/frr/staticd -d -F traditional -A 127.0.0.1

Oct 26 07:59:00 ip-192-168-12-49 watchfrr.sh[1452]: Cannot stop zebra: pid file not found
Oct 26 07:59:00 ip-192-168-12-49 zebra[1456]: client 27 says hello and bids fair to announce only bgp routes vrf=0
Oct 26 07:59:00 ip-192-168-12-49 zebra[1456]: client 30 says hello and bids fair to announce only vnc routes vrf=0
Oct 26 07:59:01 ip-192-168-12-49 zebra[1456]: client 37 says hello and bids fair to announce only static routes vrf=0
Oct 26 07:59:01 ip-192-168-12-49 watchfrr[1435]: zebra state -> up : connect succeeded
Oct 26 07:59:01 ip-192-168-12-49 watchfrr[1435]: bgpd state -> up : connect succeeded
Oct 26 07:59:01 ip-192-168-12-49 watchfrr[1435]: staticd state -> up : connect succeeded
Oct 26 07:59:01 ip-192-168-12-49 watchfrr[1435]: all daemons up, doing startup-complete notify
Oct 26 07:59:01 ip-192-168-12-49 frrinit.sh[1418]: * Started watchfrr
Oct 26 07:59:01 ip-192-168-12-49 systemd[1]: Started FRRouting.

root@ip-192-168-12-49:/var/snap/amazon-ssm-agent/1566# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         ip-192-168-12-1 0.0.0.0         UG    100    0        0 ens5
169.254.153.48  0.0.0.0         255.255.255.252 U     0      0        0 vti1
169.254.237.172 0.0.0.0         255.255.255.252 U     0      0        0 vti2
192.168.10.0    0.0.0.0         255.255.255.0   U     0      0        0 ens6
192.168.12.0    0.0.0.0         255.255.255.0   U     0      0        0 ens5
ip-192-168-12-1 0.0.0.0         255.255.255.255 UH    100    0        0 ens5

root@ip-192-168-12-49:/var/snap/amazon-ssm-agent/1566# ifconfig
vti1: flags=209<UP,POINTOPOINT,RUNNING,NOARP> mtu 1436
        inet 169.254.153.50 netmask 255.255.255.252 destination 169.254.153.49
        inet6 fe80::5efe:c0a8:c31 prefixlen 64 scopeid 0x20
        tunnel txqueuelen 1000 (IPIP Tunnel)
        RX packets 8 bytes 546 (546.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 8 bytes 396 (396.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vti2: flags=209<UP,POINTOPOINT,RUNNING,NOARP> mtu 1436
        inet 169.254.237.174 netmask 255.255.255.252 destination 169.254.237.173
        inet6 fe80::5efe:c0a8:c31 prefixlen 64 scopeid 0x20
        tunnel txqueuelen 1000 (IPIP Tunnel)
        RX packets 12 bytes 815 (815.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 12 bytes 588 (588.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
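The routing table and the ifconfig output above describe the same /30 tunnel networks: each has exactly two usable addresses, and in this output the on-prem vti interface holds the higher one while the "destination" is the lower one. A minimal sketch of that arithmetic with Python's standard `ipaddress` module follows; the function name `tunnel_endpoints` is hypothetical and not part of the lab.

```python
import ipaddress


def tunnel_endpoints(inside_cidr: str) -> tuple[str, str]:
    """Return the two usable host addresses of a /30 tunnel network.

    Accepts any address in the block with its prefix, e.g.
    "169.254.153.50/30". Hypothetical helper for illustration only.
    """
    network = ipaddress.ip_interface(inside_cidr).network
    if network.prefixlen != 30:
        raise ValueError(f"expected a /30, got /{network.prefixlen}")
    lower, higher = (str(host) for host in network.hosts())
    return lower, higher


if __name__ == "__main__":
    # Inside addresses taken from the vti1 and vti2 output in this thread.
    for cidr in ("169.254.153.50/30", "169.254.237.174/30"):
        remote, local = tunnel_endpoints(cidr)
        print(f"{cidr}: remote (destination) {remote}, local (inet) {local}")
```

For vti1 this yields 169.254.153.49 and 169.254.153.50, which matches the destination and inet values shown by ifconfig.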

acantril commented 4 years ago

If the VPNs are down, you have a typo somewhere. It works 100% fine for me now; both tunnels on both connections are up.


cuongtruonghcm commented 4 years ago

It's working now. Your labs are awesome, thanks so much.

Earlier, I misread this part of the guide:
CONN1_TUNNEL1_ONPREM_INSIDE_IP (ensuring the /30 is at the end)
CONN1_TUNNEL1_AWS_INSIDE_IP (ensuring the /30 is at the end)

And in "DemoValueTemplate.md":
For CONN1_TUNNEL1_AWS_BGP_IP the value is the same as CONN1_TUNNEL1_AWS_INSIDE_IP
For CONN1_TUNNEL2_AWS_BGP_IP the value is the same as CONN1_TUNNEL2_AWS_INSIDE_IP

So in Stage 4, when using vtysh to configure the router:
...
router bgp 65016
neighbor CONN1_TUNNEL1_AWS_BGP_IP remote-as 64512
...

I carried the /30 suffix over into the BGP neighbor address and entered:
neighbor 169.254.162.157/30 remote-as 64512
instead of:
neighbor 169.254.162.157 remote-as 64512

Thanks again, Acantril.
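The mistake described in this comment, reusing the INSIDE_IP value with its /30 suffix as the BGP neighbor address, can be avoided by stripping the prefix length before building the neighbor line. The sketch below uses Python's standard `ipaddress` module; the helper names `bgp_neighbor_address` and `neighbor_line` are hypothetical and are not part of the lab scripts.

```python
import ipaddress


def bgp_neighbor_address(inside_ip: str) -> str:
    """Return the bare IP address, dropping any /prefix suffix.

    "169.254.162.157/30" and "169.254.162.157" both give
    "169.254.162.157". Hypothetical helper for illustration only.
    """
    return str(ipaddress.ip_interface(inside_ip).ip)


def neighbor_line(inside_ip: str, remote_as: int) -> str:
    """Build the neighbor statement in the form quoted in the comment above."""
    return f"neighbor {bgp_neighbor_address(inside_ip)} remote-as {remote_as}"


if __name__ == "__main__":
    # Values taken from the comment above.
    print(neighbor_line("169.254.162.157/30", 64512))
```

The INSIDE_IP values keep their /30 because they describe the tunnel network, while the BGP_IP values name a single peer and so take the address alone.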