apache / skywalking

APM, Application Performance Monitoring System
https://skywalking.apache.org/
Apache License 2.0

SkyWalking 5.0.0-beta e2e test #1146

Closed ascrutae closed 6 years ago

ascrutae commented 6 years ago

Report @ 2018/05/03 [Fixed]

The prefixes used in the issue descriptions:

prefix description
[AGENT] This prefix indicates an agent issue
[COLLECTOR] This prefix indicates a collector issue
[UI] This prefix indicates a UI issue
[UNKNOWN] This prefix indicates that everyone should pay attention to this issue
[APP_STANDBY] This prefix indicates that the issue happened after the server started
[APP_SERVICE_PROVIDED] This prefix indicates that the issue happened after calling a service

Issue

Topology

Screen snapshot:

topology

request url:

http://localhost:8080/api/topology

request parameter:

{"variables":{"duration":{"start":"2018-05-02 13","end":"2018-05-03 13","step":"HOUR"}},"query":"\n    query Topology($duration: Duration!) {\n      getClusterTopology(duration: $duration) {\n        nodes {\n          id\n          name\n          type\n          ... on ApplicationNode {\n            sla\n            cpm\n            avgResponseTime\n            apdex\n            isAlarm\n            numOfServer\n            numOfServerAlarm\n            numOfServiceAlarm\n          }\n        }\n        calls {\n          source\n          target\n          isAlert\n          callType\n          cpm\n          avgResponseTime\n        }\n      }\n    }\n  "}

response body:

{"data":{"getClusterTopology":{"nodes":[{"id":"-1","name":"persistence-service","type":"Dubbo","sla":100,"cpm":0,"avgResponseTime":1772,"apdex":100,"isAlarm":false,"numOfServer":1,"numOfServerAlarm":0,"numOfServiceAlarm":0},{"id":"2","name":"cache-service","type":"Motan","sla":100,"cpm":0,"avgResponseTime":323,"apdex":100,"isAlarm":false,"numOfServer":1,"numOfServerAlarm":0,"numOfServiceAlarm":0},{"id":"3","name":"portal-service","type":"SpringMVC","sla":100,"cpm":0,"avgResponseTime":4947,"apdex":0,"isAlarm":false,"numOfServer":1,"numOfServerAlarm":0,"numOfServiceAlarm":0},{"id":"-5","name":"localhost:27017","type":"MongoDB"},{"id":"-3","name":"127.0.0.1:6379","type":"Redis"},{"id":"-2","name":"localhost:-1","type":"H2"},{"id":"4","name":"127.0.0.1:3307","type":"Mysql"},{"id":"1","name":"User","type":"USER"}],"calls":[{"source":"2","target":"-5","isAlert":false,"callType":"MongoDB","cpm":0,"avgResponseTime":60},{"source":"2","target":"-3","isAlert":false,"callType":"Redis","cpm":0,"avgResponseTime":1},{"source":"2","target":"-2","isAlert":false,"callType":"H2","cpm":0,"avgResponseTime":1},{"source":"3","target":"2","isAlert":false,"callType":"Motan","cpm":0,"avgResponseTime":390},{"source":"3","target":"-1","isAlert":false,"callType":"Dubbo","cpm":0,"avgResponseTime":3046},{"source":"-1","target":"4","isAlert":false,"callType":"Mysql","cpm":0,"avgResponseTime":3},{"source":"1","target":"3","isAlert":false,"callType":"","cpm":0,"avgResponseTime":4947}]}}}
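To replay the request above outside the UI, a minimal sketch could look like the following. It assumes the endpoint accepts a JSON POST with a `Content-Type: application/json` header, and the helper names `build_payload` and `post_graphql` are made up for illustration. The duration formats are copied from the payloads in this report (`HOUR` here, `MINUTE` in the later requests), and the query is trimmed to a subset of the fields.

```python
import json
import urllib.request
from datetime import datetime

# Duration string formats as they appear in the request payloads of this report.
STEP_FORMATS = {
    "HOUR": "%Y-%m-%d %H",
    "MINUTE": "%Y-%m-%d %H%M",
}

# Trimmed version of the Topology query; only a subset of the fields is requested.
TOPOLOGY_QUERY = """
query Topology($duration: Duration!) {
  getClusterTopology(duration: $duration) {
    nodes { id name type }
    calls { source target isAlert callType cpm avgResponseTime }
  }
}
"""


def build_payload(start, end, step):
    """Build the GraphQL request body for the given time window (hypothetical helper)."""
    fmt = STEP_FORMATS[step]
    return {
        "variables": {
            "duration": {
                "start": start.strftime(fmt),
                "end": end.strftime(fmt),
                "step": step,
            }
        },
        "query": TOPOLOGY_QUERY,
    }


def post_graphql(url, payload):
    """POST the payload as JSON and return the decoded response (assumed transport)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage against a locally running UI backend, as in this report:
# payload = build_payload(datetime(2018, 5, 2, 13), datetime(2018, 5, 3, 13), "HOUR")
# print(json.dumps(post_graphql("http://localhost:8080/api/topology", payload), indent=2))
```

Keeping the payload builder separate from the HTTP call makes it possible to check the duration strings without a running collector.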

Notice request url:

http://localhost:8080/api/notice

request parameter:

{"query":"\n  query Notice($duration:Duration!){\n    applicationAlarmList: loadAlarmList(alarmType: APPLICATION, duration: $duration, paging: { pageNum: 1, pageSize: 5, needTotal: true }) {\n      items {\n        title\n        startTime\n        causeType\n      }\n      total\n    }\n    serverAlarmList: loadAlarmList(alarmType: SERVER, duration: $duration, paging: { pageNum: 1, pageSize: 5, needTotal: true }) {\n      items {\n        title\n        startTime\n        causeType\n      }\n      total\n    }\n  }\n","variables":{"duration":{"start":"2018-05-02 13","end":"2018-05-03 13","step":"HOUR"}}}

response body:

{"data":{"applicationAlarmList":{"items":[{"title":"Application portal-service response time alarm.","startTime":"2018-05-03 13:22","causeType":"SLOW_RESPONSE"},{"title":"Application 192.168.5.38:20880 response time alarm.","startTime":"2018-05-03 13:22","causeType":"SLOW_RESPONSE"}],"total":2},"serverAlarmList":{"items":[{"title":"Server ascrutae of Application portal-service response time alarm.","startTime":"2018-05-03 13:22","causeType":"SLOW_RESPONSE"}],"total":1}}}

Server

Screen snapshot:

server

Request URL:

http://localhost:8080/api/server

request parameter:

{"variables":{"duration":{"start":"2018-05-03 1317","end":"2018-05-03 1332","step":"MINUTE"},"serverId":"2"},"query":"\nquery Application($serverId: ID!, $duration: Duration!) {\n  getServerResponseTimeTrend(serverId: $serverId, duration: $duration) {\n    trendList\n  }\n  getServerThroughputTrend(serverId: $serverId, duration: $duration) {\n    trendList\n  }\n  getCPUTrend(serverId: $serverId, duration: $duration) {\n    cost\n  }\n  getGCTrend(serverId: $serverId, duration: $duration) {\n    youngGCCount\n    oldGCount\n    youngGCTime\n    oldGCTime\n  }\n  getMemoryTrend(serverId: $serverId, duration: $duration) {\n    heap\n    maxHeap\n    noheap\n    maxNoheap\n  }\n}\n"}

response body:

{"data":{"getServerResponseTimeTrend":{"trendList":[0,0,0,0,0,1772,0,0,0,0,0,0,0,0,0,0]},"getServerThroughputTrend":{"trendList":[0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0]},"getCPUTrend":{"cost":[0,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0]},"getGCTrend":{"youngGCCount":[0,0,14,14,14,14,16,16,16,16,16,16,16,16,16,16],"oldGCount":[0,0,-14,-14,-14,-14,-16,-16,-16,-16,-16,-16,-16,-16,-16,-16],"youngGCTime":[0,0,4,-16,-16,-9,9,9,9,9,9,9,9,9,9,9],"oldGCTime":[0,0,16,16,16,9,-9,-9,-9,-9,-9,-9,-9,-9,-9,-9]},"getMemoryTrend":{"heap":[0,0,166238547,171449144,175401897,190184972,234289253,239443707,242472644,244873871,247501716,250321448,252839238,255130268,257479694,259835858],"maxHeap":[0,0,545409316,63631086,63631086,63631086,63631086,63631086,63631086,63631086,63631086,63631086,63631086,63631086,63631086,141402415],"noheap":[0,0,62165760,62366768,62889268,66356143,75470034,75332149,75692865,75985991,76410386,76940723,77078423,77324614,77285820,77353330],"maxNoheap":[0,0,62165760,62366768,62889268,66356143,75470034,75332149,75692865,75985991,76410386,76940723,77078423,77324614,77285820,77353330]}}}
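The response above contains negative samples in `oldGCount`, `youngGCTime` and `oldGCTime`, which GC counts and times would not normally produce. The report does not say whether that is the defect shown in the screenshot, so the following is only a sketch of how such values could be flagged in a saved response. The function name `find_negative_points` is hypothetical, and the sample uses the first three points of three series from the response.

```python
def find_negative_points(data):
    """Return {"query.field": [(index, value), ...]} for every negative sample.

    `data` is the "data" object of the server response: a mapping from query
    name to a mapping from field name to a list of numbers.
    """
    findings = {}
    for query_name, fields in data.items():
        for field_name, series in fields.items():
            negatives = [(i, v) for i, v in enumerate(series) if v < 0]
            if negatives:
                findings[f"{query_name}.{field_name}"] = negatives
    return findings


# First three points of a few series, copied from the response body above.
sample = {
    "getGCTrend": {"youngGCCount": [0, 0, 14], "oldGCount": [0, 0, -14]},
    "getCPUTrend": {"cost": [0, 0, 6]},
}
print(find_negative_points(sample))  # {'getGCTrend.oldGCount': [(2, -14)]}
```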

Application

Screen snapshot:

application

request url:

http://localhost:8080/api/application

request parameter:

{"variables":{"duration":{"start":"2018-05-03 1311","end":"2018-05-03 1326","step":"MINUTE"},"applicationId":"2"},"query":"\n  query Application($applicationId: ID!, $duration: Duration!) {\n    getSlowService(applicationId: $applicationId, duration: $duration, topN: 10) {\n      key: id\n      label: name\n      value: avgResponseTime\n    }\n    getServerThroughput(applicationId: $applicationId, duration: $duration, topN: 999999) {\n      key: id\n      osName\n      host\n      pid\n      ipv4\n      value: cpm\n    }\n    getApplicationTopology(applicationId: $applicationId, duration: $duration) {\n      nodes {\n        id\n        name\n        type\n        ... on ApplicationNode {\n          sla\n          cpm\n          avgResponseTime\n          apdex\n          isAlarm\n          numOfServer\n          numOfServerAlarm\n          numOfServiceAlarm\n        }\n      }\n      calls {\n        source\n        target\n        isAlert\n        callType\n        cpm\n        avgResponseTime\n      }\n    }\n  }\n"}

response body:

{"data":{"getSlowService":[{"key":"-7","label":"com.a.eye.skywalking.test.cache.CacheService.updateCache(java.lang.String,java.lang.String)","value":550},{"key":"-3","label":"com.a.eye.skywalking.test.cache.CacheService.findCache(java.lang.String)","value":0}],"getServerThroughput":[{"key":"3","osName":"Mac OS X","host":"ascrutae","pid":13452,"ipv4":["192.168.5.38"],"value":0}],"getApplicationTopology":{"nodes":[{"id":"2","name":"cache-service","type":"Motan","sla":100,"cpm":0,"avgResponseTime":323,"apdex":100,"isAlarm":true,"numOfServer":1,"numOfServerAlarm":1,"numOfServiceAlarm":2},{"id":"3","name":"portal-service","type":"SpringMVC","sla":100,"cpm":0,"avgResponseTime":4947,"apdex":0,"isAlarm":true,"numOfServer":1,"numOfServerAlarm":1,"numOfServiceAlarm":2},{"id":"-2","name":"localhost:-1","type":"H2"},{"id":"-5","name":"localhost:27017","type":"MongoDB"},{"id":"-3","name":"127.0.0.1:6379","type":"Redis"}],"calls":[{"source":"2","target":"-2","isAlert":false,"callType":"H2","cpm":1,"avgResponseTime":1},{"source":"2","target":"-5","isAlert":false,"callType":"MongoDB","cpm":0,"avgResponseTime":60},{"source":"2","target":"-3","isAlert":false,"callType":"Redis","cpm":0,"avgResponseTime":1},{"source":"3","target":"2","isAlert":false,"callType":"Motan","cpm":0,"avgResponseTime":390}]}}}

ES Data: data.zip

wu-sheng commented 6 years ago

FYI @liuhaoyang @candyleer we have started the test for the beta release. You are welcome to join me. @ascrutae has reported some issues.

wu-sheng commented 6 years ago

@peng-yongsheng @hanahmily @ascrutae The topology issue is caused by a complex scenario: the alarm was not transformed through a mapping. I will talk with @hanahmily and @peng-yongsheng about a new design for this sub page.

candyleer commented 6 years ago

got it

liuhaoyang commented 6 years ago

Report @ 2018/05/03 [Fixed]

Issue

Topology

Solution

https://github.com/apache/incubator-skywalking-ui/pull/166

hanahmily commented 6 years ago

Fix UI issues in https://github.com/apache/incubator-skywalking/pull/1155

wu-sheng commented 6 years ago

@ascrutae @liuhaoyang All known and reported issues have been declared fixed. Please run the test.

FYI @peng-yongsheng has one or two features to be done.

candyleer commented 6 years ago

Report @ 2018/05/03 [Fixed]

In the dashboard, when I switch from 15 minutes to 30 minutes

candyleer commented 6 years ago

Report @ 2018/05/03 [Fixed]

I am trying to pinpoint this issue

Fixed in #1164

wu-sheng commented 6 years ago

@candyleer Maybe you could pinpoint the issue, or provide the ES data like @ascrutae did in his report?

liuhaoyang commented 6 years ago

Report @ 2018/05/03

Issue

Topology

wu-sheng commented 6 years ago

@hanahmily Maybe you should take care of this. I remember mentioning it before.

hanahmily commented 6 years ago

I have submitted an issue for this problem. https://github.com/apache/incubator-skywalking/issues/1160

candyleer commented 6 years ago

Report @ 2018/05/04 [Fixed]

Issue

How to fix?

Remove the judgement in ApplicationComponentSpanListener.

Fixed in #1163

wu-sheng commented 6 years ago

@candyleer Your blocker should be resolved by now.

wu-sheng commented 6 years ago

@liuhaoyang #1165 should fix your server type issue, but we need your test case to recheck it.

ascrutae commented 6 years ago

Report @ 2018/05/04 [Ignored]

Issue

Topology

Screen snapshot:

topology-server

Request URL: http://localhost:8080/api/topology

Request param:

{"variables":{"duration":{"start":"2018-05-04 1713","end":"2018-05-04 1728","step":"MINUTE"}},"query":"\n    query Topology($duration: Duration!) {\n      getClusterTopology(duration: $duration) {\n        nodes {\n          id\n          name\n          type\n          ... on ApplicationNode {\n            sla\n            cpm\n            avgResponseTime\n            apdex\n            isAlarm\n            numOfServer\n            numOfServerAlarm\n            numOfServiceAlarm\n          }\n        }\n        calls {\n          source\n          target\n          isAlert\n          callType\n          cpm\n          avgResponseTime\n        }\n      }\n    }\n  "}

Response Data:

{"data":{"getClusterTopology":{"nodes":[{"id":"-2","name":"portal-service","type":"Tomcat","sla":99,"cpm":150,"avgResponseTime":3514,"apdex":88,"isAlarm":true,"numOfServer":1,"numOfServerAlarm":2,"numOfServiceAlarm":5},{"id":"-1","name":"persistence-service","type":"Dubbo","sla":100,"cpm":147,"avgResponseTime":1,"apdex":100,"isAlarm":true,"numOfServer":1,"numOfServerAlarm":2,"numOfServiceAlarm":5},{"id":"2","name":"cache-service","type":"Motan","sla":100,"cpm":299,"avgResponseTime":556,"apdex":98,"isAlarm":true,"numOfServer":1,"numOfServerAlarm":2,"numOfServiceAlarm":5},{"id":"-5","name":"localhost:27017","type":"MongoDB"},{"id":"-3","name":"127.0.0.1:6379","type":"Redis"},{"id":"3","name":"localhost:-1","type":"H2"},{"id":"4","name":"127.0.0.1:3307","type":"Mysql"},{"id":"1","name":"User","type":"USER"}],"calls":[{"source":"2","target":"-5","isAlert":false,"callType":"MongoDB","cpm":296,"avgResponseTime":117},{"source":"2","target":"-3","isAlert":false,"callType":"Redis","cpm":298,"avgResponseTime":1},{"source":"2","target":"3","isAlert":false,"callType":"H2","cpm":597,"avgResponseTime":0},{"source":"-2","target":"2","isAlert":false,"callType":"Motan","cpm":300,"avgResponseTime":550},{"source":"-2","target":"-1","isAlert":false,"callType":"Dubbo","cpm":150,"avgResponseTime":257},{"source":"-1","target":"4","isAlert":false,"callType":"Mysql","cpm":295,"avgResponseTime":0},{"source":"1","target":"-2","isAlert":false,"callType":"","cpm":149,"avgResponseTime":3511}]}}}
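In the response above, every application node reports `numOfServer` 1 but `numOfServerAlarm` 2. Whether that counts as a defect depends on how `numOfServerAlarm` is defined (alarm records versus alarmed servers), which this thread does not state. As a sketch only, a saved response could be scanned for such nodes like this; the function name is hypothetical and the sample holds two nodes from the response with a reduced set of fields.

```python
def nodes_with_excess_server_alarms(topology):
    """Return (name, numOfServer, numOfServerAlarm) for nodes whose alarm count
    exceeds their server count. Nodes without these fields (databases, caches,
    the User node) are skipped.
    """
    flagged = []
    for node in topology["nodes"]:
        servers = node.get("numOfServer")
        alarms = node.get("numOfServerAlarm")
        if servers is not None and alarms is not None and alarms > servers:
            flagged.append((node["name"], servers, alarms))
    return flagged


# Two nodes taken from the response body above, with only the relevant fields.
sample_topology = {
    "nodes": [
        {"id": "-2", "name": "portal-service", "type": "Tomcat",
         "numOfServer": 1, "numOfServerAlarm": 2},
        {"id": "-5", "name": "localhost:27017", "type": "MongoDB"},
    ]
}
print(nodes_with_excess_server_alarms(sample_topology))  # [('portal-service', 1, 2)]
```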

request URL: http://localhost:8080/api/notice

request param:

{"query":"\n  query Notice($duration:Duration!){\n    applicationAlarmList: loadAlarmList(alarmType: APPLICATION, duration: $duration, paging: { pageNum: 1, pageSize: 5, needTotal: true }) {\n      items {\n        title\n        startTime\n        causeType\n      }\n      total\n    }\n    serverAlarmList: loadAlarmList(alarmType: SERVER, duration: $duration, paging: { pageNum: 1, pageSize: 5, needTotal: true }) {\n      items {\n        title\n        startTime\n        causeType\n      }\n      total\n    }\n  }\n","variables":{"duration":{"start":"2018-05-04 1713","end":"2018-05-04 1728","step":"MINUTE"}}}

Response body:

{"data":{"applicationAlarmList":{"items":[{"title":"Application cache-service response time alarm.","startTime":"2018-05-04 17:21","causeType":"SLOW_RESPONSE"},{"title":"Application cache-service response time alarm.","startTime":"2018-05-04 17:21","causeType":"SLOW_RESPONSE"},{"title":"Application portal-service response time alarm.","startTime":"2018-05-04 17:23","causeType":"SLOW_RESPONSE"}],"total":3},"serverAlarmList":{"items":[{"title":"Server ascrutae of Application cache-service response time alarm.","startTime":"2018-05-04 17:21","causeType":"SLOW_RESPONSE"},{"title":"Server ascrutae of Application portal-service response time alarm.","startTime":"2018-05-04 17:23","causeType":"SLOW_RESPONSE"}],"total":2}}}

ES Data: data.zip

wu-sheng commented 6 years ago

@ascrutae @liuhaoyang @candyleer Maybe we should run a second round of testing.

@ascrutae If your local test is OK, please consider deploying this to the demo environment so we can do a design review.

ascrutae commented 6 years ago

@candyleer @liuhaoyang 5.0.0-beta has been tested, and we are going to send the vote mail for the 5.0-beta release next week. I will close this issue. Please open a new issue if anybody finds a problem.

wu-sheng commented 6 years ago

Thanks, everyone. Before and during the release vote, feel free to raise questions, especially @liuhaoyang about .NET Core related features. Native release support is very important.

The PMC will consider whether an issue blocks end users.