edgexfoundry / performance-test

Owner: QA-Testing WG

InfluxDB may lose most of the JMeter request data #31

Closed: cherrycl closed this issue 5 years ago

cherrycl commented 5 years ago

When we queried the JMeter requests from Grafana, we found that a lot of data was lost. We couldn't find any related errors in the InfluxDB log. We will ask LF for the InfluxDB configuration and try to reproduce the issue in our environment.
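One way to quantify the loss is to export the request timestamps and look for stretches with no samples. A minimal sketch, assuming the timestamps are already available as Python `datetime` objects; the function name `find_gaps` and the `max_interval` threshold are illustrative and not part of the test suite:

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_interval):
    """Return (start, end) pairs where two consecutive samples are
    further apart than max_interval.

    `timestamps` is any iterable of datetime objects (hypothetical export
    of the JMeter request times); `max_interval` is a timedelta chosen
    by the caller.
    """
    ordered = sorted(timestamps)
    gaps = []
    # Compare each sample with the one that follows it.
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > max_interval:
            gaps.append((prev, curr))
    return gaps
```

An empty result means no interval exceeded the threshold; each returned pair marks the last sample before a gap and the first sample after it.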

cloudxxx8 commented 5 years ago

We have checked the data files on the LF InfluxDB and confirmed that the data is missing. We also found that no data was missing when the performance test ran on June 1st, so this issue doesn't always happen.

After investigating the log, we found some suspicious errors that we would like to look into:

We are asking for more logs to confirm whether those errors happened during the data loss period.

cherrycl commented 5 years ago

We compared the 2 runs that had data loss issues, 5/29 and 6/15, and found the same behavior in both:

  1. The "File corrupt" message appears after the "InfluxDB starting" message. "InfluxDB starting" is always displayed when the InfluxDB service starts up.
  2. Several "fatal error" messages appear in the log. I guess these fatal errors cause InfluxDB to crash.

The fatal error message from the 5/29 run:

unexpected fault address 0x7f700fedc78c
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7f700fedc78c pc=0xf63238]

The fatal error message from the 6/12 run:

fatal error: runtime: out of memory
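The pattern described above ("File corrupt" appearing after "InfluxDB starting", plus "fatal error" lines) could be checked across several logs with a small script. A minimal sketch, assuming the log is plain text with one message per line; the function name `summarize_log` is hypothetical, and the marker strings are taken from the messages quoted in this thread, so they may need adjusting to the real log wording:

```python
from collections import Counter

# Marker substrings quoted in this thread; the actual log lines may differ.
MARKERS = ("InfluxDB starting", "File corrupt", "fatal error")


def summarize_log(lines):
    """Count marker occurrences and locate 'File corrupt' lines after a startup.

    `lines` is any iterable of log lines (for example, an open file).
    Returns (counts, corrupt_after_start), where `corrupt_after_start`
    holds the 1-based line numbers of 'File corrupt' messages seen after
    at least one 'InfluxDB starting' message.
    """
    counts = Counter()
    corrupt_after_start = []
    started = False
    for lineno, line in enumerate(lines, start=1):
        for marker in MARKERS:
            if marker in line:
                counts[marker] += 1
        if "InfluxDB starting" in line:
            started = True
        elif "File corrupt" in line and started:
            corrupt_after_start.append(lineno)
    return counts, corrupt_after_start
```

Passing an open log file to `summarize_log` gives the number of startups (a rough proxy for restarts) alongside the count of fatal errors, which makes it easier to compare a run with data loss against a clean one.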

cherrycl commented 5 years ago

There were 2 runs a couple of days ago; one contains full data and the other has data missing. We requested the logs from the LF help desk and are still waiting for a response.

cherrycl commented 5 years ago

We received the InfluxDB logs today, but found no "File corrupt" message. We would like to create an identical VM to reproduce this issue. We requested detailed information about the TIG server in ticket IT-16584 and are waiting for a response.

cloudxxx8 commented 5 years ago

Only a little data was lost in the latest performance test run last weekend. The issue remains intermittent.

cherrycl commented 5 years ago

After filing LF ticket IT-16630 to request instructions for building the TIG server, we received the server configuration files.

cherrycl commented 5 years ago

The last 5 runs showed no data loss after the TIG infrastructure was updated, so we are closing this issue. If the problem occurs again, we will open a new issue.