kylezh closed this issue 9 years ago
Please show all curl commands required to reproduce this issue, including creation of the database and retention policy.
On Feb 28, 2015, at 2:15 AM, "Kyle Zhang(zelin.io)" notifications@github.com wrote:
Way to reproduce the problem:
1. Start a cluster with nodes A, B, C.
2. Write one record.
3. Kill influxdb in nodeA. C becomes leader.
4. Restart influxdb in nodeA. It rejoins the cluster as follower.
5. Write a second record.
The second record could only be selected from nodeB and nodeC, but not nodeA. Query result from nodeA only contains the first record.
I don't know if my problem is the same as @kylezh's.
I'm running a single influxdb instance within a docker container, with no volume and a static hostname:
docker run -d -p 80:80 -p 8083:8083 -p 8084:8084 -p 8086:8086 -p 9022:22 --name="influxdb" --hostname="influxdb" grafana_influxdb /usr/bin/supervisord
config.toml
bind-address = "0.0.0.0"
reporting-disabled = false
[initialization]
join-urls = ""
[authentication]
enabled = false
[admin]
enabled = true
port = 8083
[api]
[[graphite]]
enabled = false
[collectd]
enabled = false
[udp]
enabled = false
[broker]
dir = "/tmp/influxdb/development/raft"
port = 8086
[data]
dir = "/tmp/influxdb/development/db"
port = 8086
retention-check-enabled = false
retention-check-period = "10m"
[cluster]
dir = "/tmp/influxdb/development/state"
[logging]
file = "/var/log/influxdb/influxd.log"
Initially everything was fine. I created the database metrics and the retention policy p1, and data could be written into influxdb.
root@influxdb:/opt/influxdb# ./influx
InfluxDB shell 0.9.0-rc7
Connected to http://localhost:8086 version 0.9.0-rc7
> show databases
name tags name
---- ---- ----
metrics
> use metrics
Using database metrics
> select Count(Value) from metric1.1
name tags time Count
---- ---- ---- -----
metric1.1 1970-01-01T00:00:00Z 46
>
2015/03/10 17:33:57 [1970-01-01T00:00:00Z 38] # NOTICE, results of query: "select Count(Value) from metric1.1"
2015/03/10 17:33:57 Fast-PING: response_time: 4ms, influxdb version: 0.9.0-rc7, queue length: 0, buf size: 100
2015/03/10 17:33:58 Fast-PING: response_time: 2ms, influxdb version: 0.9.0-rc7, queue length: 0, buf size: 100
<- tick ----------------------
2015/03/10 17:33:58 * md len:100 [influxdb] consuming: boltQ len: 0 , mdCh len: 0, buf size: 0
2015/03/10 17:33:58 Fast-PING: response_time: 5ms, influxdb version: 0.9.0-rc7, queue length: 0, buf size: 100
2015/03/10 17:33:59 Fast-PING: response_time: 2ms, influxdb version: 0.9.0-rc7, queue length: 0, buf size: 100
<- tick ----------------------
2015/03/10 17:33:59 Fast-PING: response_time: 2ms, influxdb version: 0.9.0-rc7, queue length: 1, buf size: 0
2015/03/10 17:34:00 Fast-PING: response_time: 1ms, influxdb version: 0.9.0-rc7, queue length: 1, buf size: 0
<- tick ----------------------
2015/03/10 17:34:00 - md len:200 [influxdb] backfilling:, boltQ len: 0
2015/03/10 17:34:00 [1970-01-01T00:00:00Z 41]
2015/03/10 17:34:00 Fast-PING: response_time: 5ms, influxdb version: 0.9.0-rc7, queue length: 0, buf size: 100
2015/03/10 17:34:01 Fast-PING: response_time: 2ms, influxdb version: 0.9.0-rc7, queue length: 0, buf size: 100
<- tick ----------------------
2015/03/10 17:34:01 - md len:100 [influxdb] backfilling:, boltQ len: 0
2015/03/10 17:34:01 [1970-01-01T00:00:00Z 43] #NOTICE: increasing.
But after I _kill the container and start it again_, everything seems to work except that I _cannot write data into influxdb anymore_.
2015/03/10 17:35:51 [1970-01-01T00:00:00Z 46] # NOTICE: this number won't increase.
2015/03/10 17:35:51 Fast-PING: response_time: 4ms, influxdb version: 0.9.0-rc7, queue length: 8, buf size: 0
2015/03/10 17:35:52 Fast-PING: response_time: 1ms, influxdb version: 0.9.0-rc7, queue length: 8, buf size: 0
<- tick ----------------------
2015/03/10 17:35:52 - md len:100 [influxdb] backfilling:, boltQ len: 7
2015/03/10 17:35:52 [1970-01-01T00:00:00Z 46]
2015/03/10 17:35:52 Fast-PING: response_time: 11ms, influxdb version: 0.9.0-rc7, queue length: 7, buf size: 100
2015/03/10 17:35:53 Fast-PING: response_time: 2ms, influxdb version: 0.9.0-rc7, queue length: 7, buf size: 100
<- tick ----------------------
2015/03/10 17:35:53 * md len:100 [influxdb] consuming: boltQ len: 7 , mdCh len: 0, buf size: 0
2015/03/10 17:35:53 - md len:200 [influxdb] backfilling:, boltQ len: 6
2015/03/10 17:35:53 [1970-01-01T00:00:00Z 46]
2015/03/10 17:35:53 Fast-PING: response_time: 6ms, influxdb version: 0.9.0-rc7, queue length: 6, buf size: 100
<- tick ----------------------
2015/03/10 17:35:54 Fast-PING: response_time: 7ms, influxdb version: 0.9.0-rc7, queue length: 5, buf size: 121
2015/03/10 17:35:54 - md len:200 [influxdb] backfilling:, boltQ len: 5
2015/03/10 17:35:54 [1970-01-01T00:00:00Z 46]
What's more, _no error is raised_:
func writeToInfluxdb(XXXX) error {
	// ...
	res, err := w.cli.Write(write)
	if err != nil {
		// The transport-level call itself failed.
		// TODO: remove this log.Println
		log.Println(" -E- writeMD failed: ", err)
		return err
	}
	if res != nil && res.Err != nil {
		// The call succeeded but the server reported an error in the response.
		log.Println(" -E- writeMD failed: res.Err: ", res.Err)
		return fmt.Errorf("res.Err: %s", res.Err)
	}
	// Neither branch fires after the restart, so nil is returned here.
	return nil
}
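Since neither error path fires, one way to surface this kind of silent failure is to compare the point count before and after a write. The following is only a sketch under assumptions: `verifiedWrite`, `ErrSilentWriteFailure`, and the two callbacks are hypothetical names, not part of the InfluxDB client. `write` would wrap the `w.cli.Write` call above, and `count` would run a query such as `select Count(Value) from metric1.1`. It also assumes a successful write is visible to the next query, which may not hold if writes are applied asynchronously.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrSilentWriteFailure reports a write that returned no error but did not
// change the observed point count.
var ErrSilentWriteFailure = errors.New("write returned no error but count did not increase")

// verifiedWrite calls write and compares the point count before and after.
// Both callbacks are hypothetical and supplied by the caller.
func verifiedWrite(write func() error, count func() (int64, error)) error {
	before, err := count()
	if err != nil {
		return fmt.Errorf("count before write: %w", err)
	}
	if err := write(); err != nil {
		return fmt.Errorf("write: %w", err)
	}
	after, err := count()
	if err != nil {
		return fmt.Errorf("count after write: %w", err)
	}
	if after <= before {
		// The write reported success, yet nothing new is queryable.
		return ErrSilentWriteFailure
	}
	return nil
}
```

With the logs above, a check like this would have returned `ErrSilentWriteFailure` once the count stayed at 46.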
Writing data with the web GUI doesn't work either.
Please provide a repro case with the latest RC.
Way to reproduce the problem:
1. Start a cluster with nodes A, B, C.
2. Write one record.
3. Kill influxdb in nodeA. C becomes leader.
4. Restart influxdb in nodeA. It rejoins the cluster as follower.
5. Write a second record.
The second record could only be selected from nodeB and nodeC, but not nodeA. Query result from nodeA only contains the first record.
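The divergence in the steps above can be checked mechanically once each node has been queried. This is a minimal sketch, assuming the per-node record counts have already been collected by running the same count query against every node. `laggingNodes` and the node names used as map keys are hypothetical and not part of InfluxDB.

```go
package main

import "sort"

// laggingNodes returns the names of nodes whose record count is lower than
// the highest count seen on any node, sorted alphabetically.
func laggingNodes(counts map[string]int) []string {
	highest := 0
	for _, c := range counts {
		if c > highest {
			highest = c
		}
	}
	var lagging []string
	for node, c := range counts {
		if c < highest {
			lagging = append(lagging, node)
		}
	}
	sort.Strings(lagging)
	return lagging
}
```

For the scenario described, passing `map[string]int{"nodeA": 1, "nodeB": 2, "nodeC": 2}` would flag only nodeA.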