Closed yaronkaikov closed 1 year ago
I think we should add --no-ntp-setup to the end of the run('/opt/scylladb/scripts/scylla_setup ...') line, not to sysconfig_opt, since we disable it on all clouds.
Yes, you are right, fixed
Verified with https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/next-machine-image/78/
Azure image configuration:
azureuser@yaron-test2:~$ chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0 0 3 377 10 +8087ns[ +16us] +/- 1526ns
GCP image configuration:
yaronkaikov@instance-1:~$ chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* metadata.google.internal 2 6 37 47 -13us[-5849us] +/- 318us
AWS image configuration:
scyllaadm@ip-10-99-17-28:~$ chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123 3 4 37 10 -154ns[-1612ns] +/- 484us
I thought that the AMI base image (Ubuntu 22.04 minimal) did not have the new NTP setting (169.254.169.123), but I was wrong. /etc/chrony/conf.d/00-cpc.conf is the file that overrides the default configuration and changes the server address to 169.254.169.123. I could verify that chrony is referencing 169.254.169.123 (tested on the Ubuntu 22.04 minimal AMI):
$ chronyc tracking
Reference ID : A9FEA97B (169.254.169.123)
Stratum : 4
Ref time (UTC) : Thu Apr 13 14:21:31 2023
System time : 0.000001611 seconds fast of NTP time
Last offset : +0.000002690 seconds
RMS offset : 0.000010544 seconds
Frequency : 35.703 ppm fast
Residual freq : +0.003 ppm
Skew : 0.129 ppm
Root delay : 0.000513758 seconds
Root dispersion : 0.000270585 seconds
Update interval : 16.2 seconds
Leap status : Normal
So I think we actually don't need a patch to change the NTP server address.
But passing --no-ntp-setup to scylla_setup is still correct, since the base image already has optimal NTP settings and we don't need to change them.
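For anyone who wants to script the check above instead of reading the output by eye, here is a minimal sketch. It assumes the `chronyc tracking` layout shown in this comment; the helper names (`parse_tracking`, `refid_to_ipv4`, `reference_matches`) are made up for illustration and are not part of any Scylla script. Decoding the Reference ID as an IPv4 address only makes sense when the selected source is an IPv4 server, as it is here.

```python
import ipaddress

# Abbreviated copy of the `chronyc tracking` output quoted above.
SAMPLE = """\
Reference ID    : A9FEA97B (169.254.169.123)
Stratum         : 4
Ref time (UTC)  : Thu Apr 13 14:21:31 2023
Leap status     : Normal
"""


def parse_tracking(text):
    """Parse `chronyc tracking` output into a {field: value} dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ':' only, since values such as the
        # reference time contain colons themselves.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def refid_to_ipv4(refid_hex):
    """Decode a hex Reference ID as an IPv4 address (IPv4 sources only)."""
    return str(ipaddress.IPv4Address(int(refid_hex, 16)))


def reference_matches(text, expected_ip):
    """Return True if the tracked reference decodes to expected_ip."""
    refid = parse_tracking(text)["Reference ID"].split()[0]
    return refid_to_ipv4(refid) == expected_ip


if __name__ == "__main__":
    print(reference_matches(SAMPLE, "169.254.169.123"))
```

In a real check the text would come from running `chronyc tracking` on the instance rather than from the embedded sample.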
@syuu1228 by passing only --no-ntp-setup, I get the following output:
scyllaadm@ip-10-99-17-184:~$ chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123 3 4 37 14 +14us[ -219us] +/- 619us
^- prod-ntp-5.ntp4.ps5.cano> 2 6 17 30 +543us[ +287us] +/- 41ms
^- prod-ntp-3.ntp1.ps5.cano> 2 6 17 30 +558us[ +324us] +/- 41ms
^- prod-ntp-4.ntp4.ps5.cano> 2 6 17 31 +287us[ +31us] +/- 37ms
^- pugot.canonical.com 2 6 17 31 +712us[ +456us] +/- 73ms
^- c-24-4-159-115.hsd1.ca.c> 1 6 17 31 +7198us[+6942us] +/- 49ms
^- 208.67.72.50 3 6 17 31 +4113us[+3858us] +/- 109ms
^- c-73-61-36-59.hsd1.nh.co> 3 6 17 41 +1991us[+1506us] +/- 37ms
^- time.richiemcintosh.com 2 6 17 40 +1542us[+1057us] +/- 50ms
So yes, the default is good, but we have a lot of other sources that we don't need. (https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/next-machine-image/81/)
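To make the "selected source vs. extra sources" distinction checkable in a test job, a rough sketch like the following could parse the `chronyc sources` table. It assumes the human-readable layout shown above (a mode character, a state character, then the name); the function names are hypothetical and the sample is a shortened copy of the output in this comment.

```python
# Shortened copy of the `chronyc sources` output quoted above.
SAMPLE = """\
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123               3   4    37    14    +14us[ -219us] +/-  619us
^- pugot.canonical.com           2   6    17    31   +712us[ +456us] +/-   73ms
^- 208.67.72.50                  3   6    17    31  +4113us[+3858us] +/-  109ms
"""

MODES = "^=#"      # server, peer, local reference clock
STATES = "*+-?x~"  # '*' marks the currently selected source


def parse_sources(text):
    """Return (mode, state, name) for each source row, skipping headers."""
    sources = []
    for line in text.splitlines():
        # The '====' separator starts with '=', so the state column
        # must be checked as well to tell it apart from a peer row.
        if len(line) < 3 or line[0] not in MODES or line[1] not in STATES:
            continue
        sources.append((line[0], line[1], line.split()[1]))
    return sources


def selected_source(text):
    """Name of the source chrony is currently synchronized to, or None."""
    for _mode, state, name in parse_sources(text):
        if state == "*":
            return name
    return None


def extra_sources(text):
    """Names of all sources other than the selected one."""
    return [name for _mode, state, name in parse_sources(text) if state != "*"]


if __name__ == "__main__":
    print(selected_source(SAMPLE))
    print(extra_sources(SAMPLE))
```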
Okay, but the time sync guide on AWS doesn't say to remove the existing pool entries; it just says to add server 169.254.169.123.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-time.html
Also, a Red Hat document says:
It is NOT recommended to use only two NTP servers. If more than one NTP server is required, four NTP servers is the recommended minimum. Four servers protect against one incorrect timesource, or "falseticker".
https://access.redhat.com/solutions/58025
So a single-server configuration is probably not good.
However, I found that the AWS document also says they have their own NTP pool, which can be used by adding the following entry:
pool time.aws.com iburst
So if we don't want to use the default public NTP pools, we can probably switch to 169.254.169.123 and time.aws.com.
I commented out all pool entries in chrony.conf and added pool time.aws.com iburst; it works like this:
$ chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123 3 4 177 2 -5090ns[ -12us] +/- 525us
^- ec2-34-201-171-241.compu> 4 6 17 50 +22us[ +26us] +/- 884us
^- ec2-18-212-60-76.compute> 4 6 17 51 -24us[ -16us] +/- 641us
^- ec2-34-229-185-123.compu> 4 6 17 51 +11us[ +19us] +/- 870us
^- ec2-3-85-98-105.compute-> 4 6 17 50 -7617ns[-3629ns] +/- 614us
And modifying chrony.conf can be done with something like this:
import re

# Comment out the default "pool ..." entries in chrony.conf.
with open('/etc/chrony/chrony.conf') as f:
    chrony_conf = f.read()
chrony_conf = re.sub(r'^(pool .*)$', r'# \1', chrony_conf, flags=re.MULTILINE)
with open('/etc/chrony/chrony.conf', 'w') as f:
    f.write(chrony_conf)
# Add the AWS NTP pool as a separate sources file.
with open('/etc/chrony/sources.d/ntp-pool.sources', 'w') as f:
    f.write('pool time.aws.com iburst\n')
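The substitution can be sanity-checked without touching /etc by running it on an in-memory string. This is only a sketch: `comment_out_pools` is a hypothetical helper name, and the sample text is an illustrative excerpt with placeholder hostnames, not the contents of a real image's chrony.conf.

```python
import re


def comment_out_pools(conf_text):
    """Prefix every line that starts with 'pool ' with '# '."""
    return re.sub(r'^(pool .*)$', r'# \1', conf_text, flags=re.MULTILINE)


# Illustrative input only; hostnames are placeholders.
SAMPLE_CONF = """\
confdir /etc/chrony/conf.d
pool ntp.example.org iburst maxsources 4
pool 0.pool.example.org iburst maxsources 1
keyfile /etc/chrony/chrony.keys
"""

if __name__ == "__main__":
    print(comment_out_pools(SAMPLE_CONF), end="")
```

Because already-commented lines start with `# pool` rather than `pool`, running the substitution a second time leaves the text unchanged, so the step should be safe to repeat during image builds.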
@syuu1228 https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/next-machine-image/93/ verified after the latest changes
scyllaadm@ip-10-99-17-8:~$ chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123 3 4 377 13 +18us[ +21us] +/- 543us
^- ec2-54-234-209-141.compu> 4 6 77 12 +26us[ +26us] +/- 743us
^- ec2-34-229-185-123.compu> 4 6 77 13 +65us[ +68us] +/- 1065us
^- ec2-34-201-171-241.compu> 4 6 77 13 +26us[ +26us] +/- 1051us
^- ec2-3-85-98-105.compute-> 4 6 77 12 +10us[ +10us] +/- 672us
scyllaadm@ip-10-99-17-8:~$ cat /etc/chrony/
chrony.conf chrony.keys conf.d/ sources.d/
scyllaadm@ip-10-99-17-8:~$ cat /etc/chrony/sources.d/
README ntp-pool.sources
scyllaadm@ip-10-99-17-8:~$ cat /etc/chrony/sources.d/ntp-pool.sources
pool time.aws.com iburst
rebased
Disable NTP configuration during image creation so that we use the default cloud-recommended configuration
Closes: https://github.com/scylladb/scylladb/issues/13344