toni-moreno / syncflux

SyncFlux is an Open Source InfluxDB Data synchronization and replication tool for migration purposes or HA clusters
MIT License
154 stars, 34 forks

panic: runtime error: index out of range. Caused by hyphens in measurement name #22

Closed maxadamo closed 5 years ago

maxadamo commented 5 years ago

Whichever action I choose (hamonitor, copy, replicaschema), I am not able to start the application:

INFO[2019-07-19 13:33:19] CFG :&{General:{InstanceID: LogDir:/var/log/syncflux HomeDir: DataDir: LogLevel:trace SyncMode:onlyslave CheckInterval:10s MinSyncInterval:20s MasterDB:sensu02 SlaveDB:sensu01 InitialReplication:none MonitorRetryInterval:1m0s DataChunkDuration:5m0s MaxRetentionInterval:8760h0m0s RWMaxRetries:5 RWRetryDelay:10s NumWorkers:4 MaxPointsOnSingleWrite:20000} HTTP:{BindAddr:83.97.94.46:4090 AdminUser:admin AdminPassword:admin CookieID:mysupercokie} InfluxArray:[0xc4201ed260 0xc4201ed320 0xc4201ed3e0]} 
panic: runtime error: index out of range

goroutine 1 [running]:
github.com/toni-moreno/syncflux/pkg/agent.GetFields(0xa03e20, 0xc4201f2180, 0xc4202371a0, 0x5, 0xc420237600, 0x10, 0xc4202372f7, 0x7, 0xc420250e00)
    /home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/agent/client.go:254 +0x6d2
github.com/toni-moreno/syncflux/pkg/agent.(*HACluster).GetSchema(0xc42029a410, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc420195d90, 0x42bdf4, 0x9bcc40, ...)
    /home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/agent/hacluster.go:147 +0x548
github.com/toni-moreno/syncflux/pkg/agent.HAMonitorStart(0xc4201e2720, 0x7, 0xc4201e27c0, 0x7)
    /home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/agent/agent.go:246 +0x9c
main.main()
    /home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/main.go:288 +0x4c1

maxadamo commented 5 years ago

This is my configuration -> https://pastebin.com/yS6RCMpL

maxadamo commented 5 years ago

Verbose mode shows a few more logs:

# syncflux -config /etc/influxdb/syncflux.toml -action copy -vvv
INFO[2019-07-22 11:45:47] CFG :&{General:{InstanceID: LogDir:/var/log/influxdb-srelay HomeDir: DataDir: LogLevel:debug SyncMode:onlyslave CheckInterval:10s MinSyncInterval:20s MasterDB:prod-sensu01 SlaveDB:prod-sensu03 InitialReplication:none MonitorRetryInterval:1m0s DataChunkDuration:5m0s MaxRetentionInterval:8760h0m0s RWMaxRetries:5 RWRetryDelay:10s NumWorkers:4 MaxPointsOnSingleWrite:20000} HTTP:{BindAddr:0.0.0.0:4090 AdminUser:admin AdminPassword:admin CookieID:mysupercokie} InfluxArray:[0xc42001f320 0xc42001f440 0xc42001f560]} 
INFO[2019-07-22 11:45:47] Set Master DB prod-sensu01 from Command Line parameters 
INFO[2019-07-22 11:45:47] Set Slave DB prod-sensu03 from Command Line parameters 
INFO[2019-07-22 11:45:47] Set Default directories : 
   - Exec: /etc/influxdb
   - Config: /etc/influxdb
   -Logs: /etc/influxdb/log 
INFO[2019-07-22 11:45:47] Initializing cluster                         
INFO[2019-07-22 11:45:47] Found MasterDB[prod-sensu01] in config File &{Release:1x Name:prod-sensu01 Location:http://prod-sensu01.geant.org:8086/ AdminUser: AdminPasswd: Timeout:10s} 
TRAC[2019-07-22 11:45:47] SHOW DATABASES On InitPint: [{Series:[{Name:databases Tags:map[] Columns:[name] Values:[[_internal] [sensu] [nmaas]] Partial:false}] Messages:[] Err:}] 
INFO[2019-07-22 11:45:47] Found SlaveDB[prod-sensu03] in config File &{Release:1x Name:prod-sensu03 Location:http://prod-sensu03.geant.org:8086/ AdminUser: AdminPasswd: Timeout:10s} 
TRAC[2019-07-22 11:45:47] SHOW DATABASES On InitPint: [{Series:[{Name:databases Tags:map[] Columns:[name] Values:[[_internal] [sensu] [nmaas]] Partial:false}] Messages:[] Err:}] 
DEBU[2019-07-22 11:45:47] discovered database 0: [_internal]           
DEBU[2019-07-22 11:45:47] discovered database 1: [sensu]               
DEBU[2019-07-22 11:45:47] discovered database 2: [nmaas]               
DEBU[2019-07-22 11:45:47] discovered retention Policies 0:  5 : []interface {}{"autogen", "0s", "168h0m0s", "1", false} 
DEBU[2019-07-22 11:45:47] discovered retention Policies 1:  5 : []interface {}{"30days", "720h0m0s", "24h0m0s", "1", true} 
DEBU[2019-07-22 11:45:47] discovered measurement  &{23755prod-galera01 map[]} on DB: sensu-RP:autogen 
panic: runtime error: index out of range

goroutine 1 [running]:
github.com/toni-moreno/syncflux/pkg/agent.GetFields(0xa03e20, 0xc4201fa0c0, 0xc420099950, 0x5, 0xc42009b760, 0x12, 0xc420099ab7, 0x7, 0xc420248e00)
    /home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/agent/client.go:254 +0x6d2
github.com/toni-moreno/syncflux/pkg/agent.(*HACluster).GetSchema(0xc4201e0c30, 0x995885, 0x2, 0x995885, 0x2, 0x995885, 0x2, 0x4e372a, 0x9bcc40, 0xc4201add90, ...)
    /home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/agent/hacluster.go:147 +0x548
github.com/toni-moreno/syncflux/pkg/agent.Copy(0xc420022530, 0xc, 0xc420022630, 0xc, 0x995885, 0x2, 0x0, 0x0, 0x995885, 0x2, ...)
    /home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/agent/agent.go:214 +0xd8
main.main()
    /home/maxadamo/go/src/github.com/toni-moreno/syncflux/pkg/main.go:291 +0x356

maxadamo commented 5 years ago

Is it possible that this arises because I am replicating between 3 nodes instead of 2?

toni-moreno commented 5 years ago

Sorry @maxadamo, I was away enjoying my holidays. I will review this issue ASAP.

maxadamo commented 5 years ago

I have an update from a few minutes ago: in my acceptance environment, with 3 influx servers, it's working. Hence the problem is not the number of nodes.

maxadamo commented 5 years ago

@toni-moreno I finally understood where the problem is: the hyphen in the names of my measurements. If I create a measurement with a hyphen in its name, syncflux won't start. As far as I can tell, if syncflux is already running and I then create a measurement with a hyphen, it does not crash. In my case every measurement is a hostname, so I have hyphens in each and every measurement. I tried replacing - with _ and it started working, but I think measurements with such characters will be quite common.
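
One plausible reading of the stack traces, and only an assumption since client.go is not shown in this thread: a query built with an unquoted hyphenated name comes back with an error or with no series, and the caller then indexes into an empty slice. A minimal sketch of a guard follows. The Row and Result types are stand-ins shaped like the values printed in the TRAC log lines, and fieldKeys is a hypothetical name, not the actual GetFields code.

```go
package main

import (
	"errors"
	"fmt"
)

// Row and Result are simplified stand-ins (assumptions) for the query
// result structures whose shape appears in the TRAC log output.
type Row struct {
	Name    string
	Columns []string
	Values  [][]interface{}
}

type Result struct {
	Series []Row
	Err    string
}

// fieldKeys is a hypothetical helper. It returns the first column of the
// first series, and reports an error instead of panicking when the
// response is empty or carries an error.
func fieldKeys(results []Result) ([]string, error) {
	if len(results) == 0 {
		return nil, errors.New("no results returned")
	}
	if results[0].Err != "" {
		return nil, fmt.Errorf("query failed: %s", results[0].Err)
	}
	if len(results[0].Series) == 0 {
		return nil, errors.New("query returned no series")
	}
	var keys []string
	for _, v := range results[0].Series[0].Values {
		if len(v) == 0 {
			continue // skip malformed rows rather than indexing v[0]
		}
		if s, ok := v[0].(string); ok {
			keys = append(keys, s)
		}
	}
	return keys, nil
}

func main() {
	// A result with no series yields an error, not a panic.
	_, err := fieldKeys([]Result{{}})
	fmt.Println(err) // prints "query returned no series"
}
```

A guard like this would not make the query itself succeed, but it would turn the crash into a readable error that names the failing step.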

maxadamo commented 5 years ago

Unfortunately the above PR does not suffice, as I see the same issue in another line:

cmd := "ALTER RETENTION POLICY \"" + rp.Name + "\" ON " + db + " DEFAULT"

This change should probably be made by you, since you know your code well.
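
In the line quoted above, rp.Name is wrapped in double quotes but db is not, so a database name containing a hyphen would presumably hit the same problem. A minimal sketch of one way to quote both identifiers is shown here. quoteIdent and alterDefaultRP are hypothetical helpers, not syncflux functions, and the escaping rules (backslash and double quote) are an assumption about what InfluxQL expects inside a double-quoted identifier.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent is a hypothetical helper: it wraps an identifier in double
// quotes and escapes backslashes and embedded double quotes.
func quoteIdent(name string) string {
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return `"` + r.Replace(name) + `"`
}

// alterDefaultRP builds the statement from the comment above with both
// the retention policy and the database name quoted.
func alterDefaultRP(db, rp string) string {
	return "ALTER RETENTION POLICY " + quoteIdent(rp) + " ON " + quoteIdent(db) + " DEFAULT"
}

func main() {
	// "my-db" is an illustrative database name, not one from the logs.
	fmt.Println(alterDefaultRP("my-db", "autogen"))
	// prints: ALTER RETENTION POLICY "autogen" ON "my-db" DEFAULT
}
```

Routing every identifier through one helper keeps the quoting consistent across all generated statements, so a missed spot like this one is less likely.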

toni-moreno commented 5 years ago

Hi @maxadamo, I suggest you make a PR, and @sbengo and I will review and complete the code.

Thank you for your contribution!

maxadamo commented 5 years ago

resolved by https://github.com/toni-moreno/syncflux/pull/24

toni-moreno commented 5 years ago

@maxadamo I've released version 0.6.5 with your PR. Could you update and test whether everything is OK?

maxadamo commented 5 years ago

@toni-moreno I tested it just now, and it is NOT crashing :+1: