maxadamo closed this 5 years ago
This is my configuration -> https://pastebin.com/yS6RCMpL
Verbose mode shows a few more logs:
# syncflux -config /etc/influxdb/syncflux.toml -action copy -vvv
INFO[2019-07-22 11:45:47] CFG :&{General:{InstanceID: LogDir:/var/log/influxdb-srelay HomeDir: DataDir: LogLevel:debug SyncMode:onlyslave CheckInterval:10s MinSyncInterval:20s MasterDB:prod-sensu01 SlaveDB:prod-sensu03 InitialReplication:none MonitorRetryInterval:1m0s DataChunkDuration:5m0s MaxRetentionInterval:8760h0m0s RWMaxRetries:5 RWRetryDelay:10s NumWorkers:4 MaxPointsOnSingleWrite:20000} HTTP:{BindAddr:0.0.0.0:4090 AdminUser:admin AdminPassword:admin CookieID:mysupercokie} InfluxArray:[0xc42001f320 0xc42001f440 0xc42001f560]}
INFO[2019-07-22 11:45:47] Set Master DB prod-sensu01 from Command Line parameters
INFO[2019-07-22 11:45:47] Set Slave DB prod-sensu03 from Command Line parameters
INFO[2019-07-22 11:45:47] Set Default directories :
- Exec: /etc/influxdb
- Config: /etc/influxdb
-Logs: /etc/influxdb/log
INFO[2019-07-22 11:45:47] Initializing cluster
INFO[2019-07-22 11:45:47] Found MasterDB[prod-sensu01] in config File &{Release:1x Name:prod-sensu01 Location:http://prod-sensu01.geant.org:8086/ AdminUser: AdminPasswd: Timeout:10s}
TRAC[2019-07-22 11:45:47] SHOW DATABASES On InitPint: [{Series:[{Name:databases Tags:map[] Columns:[name] Values:[[_internal] [sensu] [nmaas]] Partial:false}] Messages:[] Err:}]
INFO[2019-07-22 11:45:47] Found SlaveDB[prod-sensu03] in config File &{Release:1x Name:prod-sensu03 Location:http://prod-sensu03.geant.org:8086/ AdminUser: AdminPasswd: Timeout:10s}
TRAC[2019-07-22 11:45:47] SHOW DATABASES On InitPint: [{Series:[{Name:databases Tags:map[] Columns:[name] Values:[[_internal] [sensu] [nmaas]] Partial:false}] Messages:[] Err:}]
DEBU[2019-07-22 11:45:47] discovered database 0: [_internal]
DEBU[2019-07-22 11:45:47] discovered database 1: [sensu]
DEBU[2019-07-22 11:45:47] discovered database 2: [nmaas]
DEBU[2019-07-22 11:45:47] discovered retention Policies 0: 5 : []interface {}{"autogen", "0s", "168h0m0s", "1", false}
DEBU[2019-07-22 11:45:47] discovered retention Policies 1: 5 : []interface {}{"30days", "720h0m0s", "24h0m0s", "1", true}
DEBU[2019-07-22 11:45:47] discovered measurement &{23755prod-galera01 map[]} on DB: sensu-RP:autogen
panic: runtime error: index out of range
goroutine 1 [running]:
github.com/toni-moreno/syncflux/pkg/agent.GetFields(0xa03e20, 0xc4201fa0c0, 0xc420099950, 0x5, 0xc42009b760, 0x12, 0xc420099ab7, 0x7, 0xc420248e00)
/home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/agent/client.go:254 +0x6d2
github.com/toni-moreno/syncflux/pkg/agent.(*HACluster).GetSchema(0xc4201e0c30, 0x995885, 0x2, 0x995885, 0x2, 0x995885, 0x2, 0x4e372a, 0x9bcc40, 0xc4201add90, ...)
/home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/agent/hacluster.go:147 +0x548
github.com/toni-moreno/syncflux/pkg/agent.Copy(0xc420022530, 0xc, 0xc420022630, 0xc, 0x995885, 0x2, 0x0, 0x0, 0x995885, 0x2, ...)
/home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/agent/agent.go:214 +0xd8
main.main()
/home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/main.go:291 +0x356
Is it possible that this happens because I am replicating between 3 nodes instead of 2?
Sorry @maxadamo, I was away enjoying my holidays. I will review this issue ASAP.
I have an update from a few minutes ago: in my acceptance environment, with 3 influx servers, it is working. Hence the problem is not the number of nodes.
@toni-moreno I finally understood where the problem is: the hyphen in the name of my measurements.
If I create a measurement with a hyphen in the name, syncflux won't start.
In my understanding, if syncflux is already running and I create the measurement with the hyphen, then it does not crash.
In my case every measurement is a hostname, hence I have hyphens in each and every measurement.
I tried replacing hyphens (-) with underscores (_) and it started working, but I think it will be quite common to see measurements with such characters.
Unfortunately the above PR does not suffice, as I see the same issue in another line:
cmd := "ALTER RETENTION POLICY \"" + rp.Name + "\" ON " + db + " DEFAULT"
This change must probably be done by you, because you know your code well.
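For illustration, here is a minimal sketch of how the statement quoted above could be built with both identifiers quoted, so that a database name containing a hyphen is parsed as a single identifier. The helper names `quoteIdent` and `alterDefaultRP` are hypothetical and not part of syncflux. The escaping rule (backslash-escape backslashes and double quotes inside a double-quoted identifier) is my reading of InfluxQL and should be checked against the InfluxDB version in use.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent is a hypothetical helper (not syncflux code). It wraps an
// InfluxQL identifier in double quotes and escapes backslashes and
// embedded double quotes.
func quoteIdent(name string) string {
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return `"` + r.Replace(name) + `"`
}

// alterDefaultRP builds the ALTER RETENTION POLICY statement with the
// retention policy name and the database name both quoted.
func alterDefaultRP(rp, db string) string {
	return "ALTER RETENTION POLICY " + quoteIdent(rp) + " ON " + quoteIdent(db) + " DEFAULT"
}

func main() {
	// "my-db" is a made-up database name containing a hyphen.
	fmt.Println(alterDefaultRP("30days", "my-db"))
}
```

Routing every identifier through one helper keeps the quoting rule in a single place, so the same fix would cover measurement names as well as database and retention policy names.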
Hi @maxadamo, I suggest you make a PR, and @sbengo and I will review and complete the code.
Thank you for your contribution!
Resolved by https://github.com/toni-moreno/syncflux/pull/24
@maxadamo I've released version 0.6.5 with your PR. Could you update and test whether everything is OK?
@toni-moreno I tested it right now, and it is NOT crashing :+1:
Whichever action I choose (hamonitor, copy, replicaschema), I am not able to start the application: