gravitl / netmaker

Netmaker makes networks with WireGuard. Netmaker automates fast, secure, and distributed virtual networks.
https://netmaker.io

[Bug]: nm-upgrade-0-17-1-to-0-19-0.sh not working #2499

Open cryply opened 11 months ago

cryply commented 11 months ago

Contact Details

cryptomopher@gmail.com

What happened?

The latest upgrade script is not working.

I have a test Netmaker network: a server plus 2 client nodes, all on 0.17.1.

I am trying to upgrade my server using ./nm-upgrade-0-17-1-to-0-19-0.sh from the latest release (0.20.5). When the containers start, I get these errors:

...starting containers
ERROR: The Compose file '/root/docker-compose.yml' is invalid because:
Unsupported config option for services.volumes: '.turn_server'
services.turn.volumes contains an invalid type, it should be an array
services.turn.environment.DEBUG_MODE contains false, which is an invalid type, it should be a string, number, or a null

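The reason is that the generated docker-compose.yml contains empty and malformed fields: a null volumes entry under turn, an unquoted DEBUG_MODE value, and a volumes key nested under services.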

  turn:
    container_name: turn
    image: gravitl/turnserver:v1.0.0
    network_mode: host
    volumes: null
    environment:
      DEBUG_MODE: off
      VERBOSITY: "1"
      TURN_PORT: "3479"
      TURN_API_PORT: "8089"
      CORS_ALLOWED_ORIGIN: '*'
      TURN_SERVER_HOST: turn.nm.myniceip.nip.io
      TURN_USERNAME: netmaker
      TURN_PASSWORD: igeu4mgjrwerqwerwertIqKc1jgYbDrQ
  volumes:
    .turn_server: '{}'
volumes:
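
The three complaints in the error message map onto three simple structural checks. Below is a minimal Python sketch of those checks, assuming the compose file has already been parsed into plain dicts by a YAML 1.1 parser (which is why the unquoted `off` shows up as `False`). The function name `find_compose_problems` and the sample data are hypothetical; this is not how docker-compose validates files internally, and it covers only the problems seen in this issue.

```python
def find_compose_problems(compose):
    """Return human-readable problems found in a parsed compose mapping.

    Illustrative only: it covers the three problems reported in this issue.
    """
    problems = []
    services = compose.get("services") or {}
    for name, svc in services.items():
        if not isinstance(svc, dict):
            continue
        # An entry under `services` without image/build is probably a
        # top-level key (such as `volumes`) that was nested by mistake.
        if "image" not in svc and "build" not in svc:
            problems.append(f"services.{name}: no image or build, misplaced key?")
        # If a service has a `volumes` key, it must be a list; null is rejected.
        if "volumes" in svc and not isinstance(svc["volumes"], list):
            problems.append(f"services.{name}.volumes should be an array")
        # Environment values must be strings, numbers, or null, not booleans.
        env = svc.get("environment")
        if isinstance(env, dict):
            for key, value in env.items():
                if isinstance(value, bool):
                    problems.append(
                        f"services.{name}.environment.{key} is a boolean; quote it"
                    )
    return problems


# Hypothetical parsed form of the generated file shown above.
broken = {
    "services": {
        "turn": {
            "image": "gravitl/turnserver:v1.0.0",
            "volumes": None,
            "environment": {"DEBUG_MODE": False, "VERBOSITY": "1"},
        },
        "volumes": {".turn_server": "{}"},
    },
    "volumes": None,
}

for problem in find_compose_problems(broken):
    print(problem)
```

Running it against the sample reports one problem per line of the original error, which suggests all three come from the script's `yq` edits rather than from the original compose file.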

Version

v0.20.5

What OS are you using?

Debian 12

Relevant log output

No response

cryply commented 11 months ago

I made some small fixes to the upgrade script, and it then ran to the end. I commented out the following lines:

#yq ".services.turn.volumes += {\"turn_server:/etc/config\"}" -i $INSTALL_PATH/docker-compose.yml  
#yq ".services.turn.environment += {\"DEBUG_MODE\": \"off\"}" -i $INSTALL_PATH/docker-compose.yml  
#yq ".services.volumes += {\".turn_server\": \"{}\"}" -i $INSTALL_PATH/docker-compose.yml  

After the upgrade, all client nodes disappeared. The reason: the nodes table in the SQLite DB still has all the old records, but their JSON format is now completely different. It seems no migration is run on the SQLite DB.

cryply commented 11 months ago

Later I installed 0.20.5 clients on my node hosts. They automatically joined the new network! Wow

But my nodes got new IPs:

Before upgrade:

netmaker-1    10.88.255.254
v2            10.88.0.1
vultrguest    10.88.0.2

After upgrade:

v1            10.88.0.1    (node with server; was not part of the network in 0.17.1)
v2            10.88.0.3
vultrguest    10.88.0.2

DNS sometimes does not work, so keeping the same IPs (not just the same network) across an upgrade would be a good idea.
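
For anyone who wants to check their own upgrade for this, here is a small Python sketch that compares host-to-address maps taken before and after. The helper name `changed_addresses` is hypothetical, and the two dicts are just the values listed above typed in by hand; nothing here talks to Netmaker.

```python
def changed_addresses(before, after):
    """Map each host whose address differs to an (old, new) pair.

    A missing host is reported with None on the side where it is absent.
    """
    changes = {}
    for host in sorted(set(before) | set(after)):
        old, new = before.get(host), after.get(host)
        if old != new:
            changes[host] = (old, new)
    return changes


before = {"netmaker-1": "10.88.255.254", "v2": "10.88.0.1", "vultrguest": "10.88.0.2"}
after = {"v1": "10.88.0.1", "v2": "10.88.0.3", "vultrguest": "10.88.0.2"}

for host, (old, new) in changed_addresses(before, after).items():
    print(f"{host}: {old} -> {new}")
```

With these values, only vultrguest kept its address; v2 moved, netmaker-1 is gone, and v1 is new.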

mattkasun commented 10 months ago

Version v0.20.6 has a new upgrade process.

cryply commented 10 months ago

Nothing was changed. Same issues.

ERROR: The Compose file '/root/docker-compose.yml' is invalid because:
Unsupported config option for services.volumes: '.turn_server'
services.turn.volumes contains an invalid type, it should be an array
services.turn.environment.DEBUG_MODE contains false, which is an invalid type, it should be a string, number, or a null

abhishek9686 commented 10 months ago

> Nothing was changed. Same issues.
>
> ERROR: The Compose file '/root/docker-compose.yml' is invalid because: Unsupported config option for services.volumes: '.turn_server' services.turn.volumes contains an invalid type, it should be an array services.turn.environment.DEBUG_MODE contains false, which is an invalid type, it should be a string, number, or a null

@cryply, did you follow the latest upgrade steps at https://docs.netmaker.io/upgrades.html#id2 to upgrade from v0.17.1 to the latest version?

cryply commented 10 months ago

Hi Abhishek,

I managed to upgrade my 0.17.1 network to the latest version using the link you provided: https://docs.netmaker.io/upgrades.html#id2

Could you update the non-working script, or even remove it from the source tree?

https://github.com/gravitl/netmaker/blob/master/scripts/nm-upgrade-0-17-1-to-0-19-0.sh

It does not work and is misleading.

cryply commented 10 months ago

Another thing about the latest upgrade script.

While it works in general, it has an annoying side effect.

Before the upgrade I had a test network with 3 real nodes plus the default netmaker host:

v1.test          10.88.0.3        (running on the same host as netmaker)
v2.test          10.88.0.1
vultr.test       10.88.0.2
netmaker.test    10.88.255.254    (default netmaker host)

After the upgrade to the latest 0.20.6:

v1.test          10.88.255.254/16
v2.test          10.88.0.1/16
vultr.test       10.88.0.2/16

netmaker.test disappeared.

Q: Why has v1 changed its IP?

cryply commented 10 months ago

More updates. The updated clients worked for one day. Now they show an Unknown health state. netclient pull did not help:

root@vultr:~# netclient pull
2023/08/22 12:00:12 INFO migration func
2023/08/22 12:00:12 INFO stopping daemon
2023/08/22 12:00:12 WARN netclient daemon not installed
2023/08/22 12:00:12 INFO Sending signal=terminated "to PID"="&{Pid:24854 handle:0 isdone:{:{} v:0} sigMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{:{} v:0} readerWait:{_:{} v:0}}}"
2023/08/22 12:00:12 INFO networks to be migrated test
2023/08/22 12:00:12 INFO migrating test
2023/08/22 12:00:12 processing /etc/netclient/config/netconfig-test
[netclient] 2023-08-22 12:00:12 error running command: /sbin/ip link del nm-test
[netclient] 2023-08-22 12:00:12 Cannot find device "nm-test"
[netclient] 2023-08-22 12:00:12 error removing interface nm-test exit status 1
2023/08/22 12:00:12 INFO server migration server=nm.xxx-yyy-178-75.nip.io
2023/08/22 12:00:12 ERROR migration response error="httpclient: http.Status not OK"
2023/08/22 12:00:12 ERROR status error code=400 message="legacy node not found no result found"
2023/08/22 12:00:12 INFO Sending signal=hangup "to PID"="&{Pid:24854 handle:0 isdone:{:{} v:0} sigMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{:{} v:0} readerWait:{_:{} v:0}}}"
completed pull for server nm.xxx-yyy-178-75.nip.io
[netclient] 2023-08-22 12:00:13 failed to pull hangup failed -- os: process already finished

wg show returns nothing.

Refresh Host Keys does not help either.

cryply commented 10 months ago

Ultimately I deleted these non-working nodes in the Netmaker UI and ran netclient join ... again, but they still did not work. Only after systemctl restart netclient did they start working and appear in the Netmaker UI, with new IPs of course.

mattkasun commented 9 months ago

#2606 fixes this.