Closed pg-yang closed 1 year ago
It's better that updating docker-compose.yml
in e2e-test case rather than base-compose.yml
Please don't do that. e2e only has 3U8G for the whole env, another server(redis) would impact the stability.
An optional plugin is able to activate. Is there any issue to do so? Showcase also includes optional plugin activation, such as Spring Gateway.
OK, I would study how to activate optional plugin in e2e
I wrote TopNCacheReadCommand.java like TopNDatabaseStatement
, It extend TopN
too
BTW TopN extend Record
.
I use this command to get topN record
swctl --display yaml --base-url=http://127.0.0.1:12800/graphql metrics top --start=-60m --name=top_n_cache_read_command --service-id=R3VhdmFDYWNoZS1sb2NhbA==.0
top_n_cache_read_command
is index(table)name , defined in TopNCacheReadCommand
And OAP cause exception
Caused by: org.h2.jdbc.JdbcSQLSyntaxErrorException: Column "ENTITY_ID" not found; SQL statement:
select * from (select avg(latency) result,entity_id from top_n_cache_read_command where time_bucket >= ? and time_bucket <= ? and service_id=? group by entity_id) as T order by result desc limit 5 [42122-212]
I have confirm TopN class don't define ENTITY_ID
, But Metrics class defined . I could'n find other example about TopN metrics .
So , I'm not sure the error is caused by swctl command or my code
The complete error stack in OAP
Caused by: org.h2.jdbc.JdbcSQLSyntaxErrorException: Column "ENTITY_ID" not found; SQL statement:
select * from (select avg(latency) result,entity_id from top_n_cache_read_command where time_bucket >= ? and time_bucket <= ? and service_id=? group by entity_id) as T order by result desc limit 5 [42122-212]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:502) ~[h2-2.1.212.jar:2.1.212]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:477) ~[h2-2.1.212.jar:2.1.212]
at org.h2.message.DbException.get(DbException.java:223) ~[h2-2.1.212.jar:2.1.212]
at org.h2.message.DbException.get(DbException.java:199) ~[h2-2.1.212.jar:2.1.212]
at org.h2.expression.ExpressionColumn.getColumnException(ExpressionColumn.java:244) ~[h2-2.1.212.jar:2.1.212]
at org.h2.expression.ExpressionColumn.optimizeOther(ExpressionColumn.java:226) ~[h2-2.1.212.jar:2.1.212]
at org.h2.expression.ExpressionColumn.optimize(ExpressionColumn.java:213) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.query.Select.prepareExpressions(Select.java:1170) ~[h2-2.1.212.jar:2.1.212]
at org.h2.table.DerivedTable.<init>(DerivedTable.java:41) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.query.Query.toTable(Query.java:1004) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.readDerivedTableWithCorrelation(Parser.java:1936) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.readTablePrimary(Parser.java:1816) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.readTableReference(Parser.java:2359) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parseSelectFromPart(Parser.java:2810) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parseSelect(Parser.java:2917) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parseQueryPrimary(Parser.java:2800) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parseQueryTerm(Parser.java:2667) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parseQueryExpressionBody(Parser.java:2646) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parseQueryExpressionBodyAndEndOfQuery(Parser.java:2639) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parseQueryExpression(Parser.java:2632) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parseQuery(Parser.java:2593) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parsePrepared(Parser.java:724) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parse(Parser.java:691) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.parse(Parser.java:661) ~[h2-2.1.212.jar:2.1.212]
at org.h2.command.Parser.prepareCommand(Parser.java:568) ~[h2-2.1.212.jar:2.1.212]
at org.h2.engine.SessionLocal.prepareLocal(SessionLocal.java:631) ~[h2-2.1.212.jar:2.1.212]
at org.h2.engine.SessionLocal.prepareCommand(SessionLocal.java:554) ~[h2-2.1.212.jar:2.1.212]
at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1116) ~[h2-2.1.212.jar:2.1.212]
at org.h2.jdbc.JdbcPreparedStatement.<init>(JdbcPreparedStatement.java:92) ~[h2-2.1.212.jar:2.1.212]
at org.h2.jdbc.JdbcConnection.prepareStatement(JdbcConnection.java:288) ~[h2-2.1.212.jar:2.1.212]
at com.zaxxer.hikari.pool.ProxyConnection.prepareStatement(ProxyConnection.java:318) ~[HikariCP-3.1.0.jar:?]
at com.zaxxer.hikari.pool.HikariProxyConnection.prepareStatement(HikariProxyConnection.java) ~[HikariCP-3.1.0.jar:?]
at org.apache.skywalking.oap.server.library.client.jdbc.hikaricp.JDBCHikariCPClient.executeQuery(JDBCHikariCPClient.java:119) ~[classes/:?]
at org.apache.skywalking.oap.server.storage.plugin.jdbc.h2.dao.H2AggregationQueryDAO.sortMetrics(H2AggregationQueryDAO.java:70) ~[classes/:?]
at org.apache.skywalking.oap.server.core.query.AggregationQueryService.sortMetrics(AggregationQueryService.java:63) ~[classes/:?]
at org.apache.skywalking.oap.query.graphql.resolver.MetricsQuery.sortMetrics(MetricsQuery.java:151) ~[classes/:?]
at sun.reflect.GeneratedMethodAccessor95.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_321]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_321]
at graphql.kickstart.tools.resolver.MethodFieldResolverDataFetcher.get(MethodFieldResolver.kt:280) ~[graphql-java-tools-12.0.2.jar:?]
The http info
.b...B&.POST /graphql HTTP/1.1
Host: 127.0.0.1:12800
User-Agent: Go-http-client/1.1
Content-Length: 1241
Accept: application/json; charset=utf-8
Content-Type: application/json; charset=utf-8
Accept-Encoding: gzip
{"query":"# Licensed to Apache Software Foundation (ASF) under one or more contributor\n# license agreements. See the NOTICE file distributed with\n# this work for additional information regarding copyright\n# ownership. Apache Software Foundation (ASF) licenses this file to you under\n# the Apache License, Version 2.0 (the \"License\"); you may\n# not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing,\n# software distributed under the License is distributed on an\n# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n# KIND, either express or implied. See the License for the\n# specific language governing permissions and limitations\n# under the License.\n\nquery ($condition:TopNCondition!, $duration: Duration!) {\n result: sortMetrics(condition: $condition, duration: $duration) {\n name\n value\n }\n}\n","variables":{"condition":{"name":"top_n_cache_read_command","parentService":"GuavaCache-local","normal":false,"scope":"Service","topN":5,"order":"DES"},"duration":{"start":"2022-09-13 1853","end":"2022-09-13 1923","step":"MINUTE"}}}
19:53:18.719900 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52, bad cksum 0 (->3cc2)!)
localhost.12800 > localhost.51771: Flags [.], cksum 0xfe28 (incorrect -> 0xe9c6), seq 221, ack 2608, win 6338, options [nop,nop,TS val 2269259431 ecr 3479306895], length 0
E..4..@.@...........2..;.u....XH.....(.....
.B&..b..
19:53:18.738493 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 200, bad cksum 0 (->3c2e)!)
localhost.12800 > localhost.51771: Flags [P.], cksum 0xfebc (incorrect -> 0xe882), seq 221:369, ack 2608, win 6338, options [nop,nop,TS val 2269259450 ecr 3479306895], length 148
E.....@.@...........2..;.u....XH...........
.B&..b..HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
content-length: 196
server: Armeria/1.18.0
date: Tue, 13 Sep 2022 11:53:18 GMT
UI invoke readSampledRecords
method in OAP . it's different with swctl
Have you checked the UI about what the slow SQL statement actually requests?
There are several concepts you missed.
TopN
is not a metric type, it is a record.TopNWorker
mechanism, it is reported perioedly, which is controlled through SW_CORE_TOPN_REPORT_PERIOD
.(10min default) You most likely face timeout if you want to test this in the e2e actually. We prefer you do manually testing locally.Got it . my shell is incorrect . I misunderstood the subcommand top
swctl --display yaml --base-url=http://127.0.0.1:12800/graphql metrics -h
top query the top <n> entities sorted by the specified metrics
This top don't mean TopN record
. emmm
If set SW_CORE_TOPN_REPORT_PERIOD
small enough , Doesn't e2e work correctly ?
Actually , In order to test TopN . I modified the code about SW_CORE_TOPN_REPORT_PERIOD
on my PC .
If set
SW_CORE_TOPN_REPORT_PERIOD
small enough , Doesn't e2e work correctly ?
It could be, just be careful about the settings, if you generate too many traffics, and this period is too small, the storage could be overload easily.
I have finished the most work of this issue , I think .
Did you prepare virtual-cache
doc ?
https://github.com/apache/skywalking-java/pull/303#issuecomment-1240112282
Did you prepare virtual-cache doc ? https://github.com/apache/skywalking-java/pull/303#issuecomment-1240112282
What do you mean I prepare
? Do you want me to write that up?
Sorry , I thought you want prepare it .
I would rename virtual-database.md
to virtual-service.md
, and add content for virtual cache .
rename slow-db-statement.md
to slow-virtual-service-statement
, add virtual cache content too
Sorry , I thought you want prepare it .
If you face any trouble to write one, I would help. If not, I think it should be good to be written by you. After all, this is your PR :)
No trouble , I misunderstood your meaning . Haha .
BTW, I think CLI doesn't support querying the sampled record. You and @yswdqz (Slow SQL - MySQL server statistic https://github.com/apache/skywalking/pull/9575) noticed this issue today.
@yswdqz told me he may want to try this, but he is not familiar with golang much. How about your golang skill? @pg-yang
not familiar with golang either . I read https://github.com/apache/skywalking-infra-e2e simply . The code is not a lot. I debugged this project for my test case today .
not familiar with golang either . I read https://github.com/apache/skywalking-infra-e2e simply . The code is not a lot. I debugged this project for my test case today .
swctl is actually from https://github.com/apache/skywalking-cli. infra-e2e is just a scheduler to push the tests forward step by step. SkyWalking has a lot of sub-projects, sorry.
@mrproliu @kezhenxu94 If you have time to help on the CLI part, it would be easier.
SkyWalking has a lot of sub-projects
I realized it .
e2e for trigger , verify ,etc skywalking-cli for interaction with OAP .
I would like support this feature , but i'm golang newer too . It may take some time to familiarize these projects
@mrproliu @kezhenxu94 If you have time to help on the CLI part, it would be easier.
Sure, I can handle it. I will add metrics sorted
and metrics sampled-record
commands to split the queries.
@pg-yang @yswdqz https://github.com/apache/skywalking-cli/pull/163 This has been done. You could update the commit ID of the swctl.
Ask for help . my e2e test case failure .
error message
WARNING failed to verify case[swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics sampled-record --name=top_n_cache_read_command --service-id=R3VhdmFDYWNoZS1sb2NhbA==.0 | yq e 'to_entries | with(.[] ; .value=(.value | to_entries))' -
], retried 30 time(s):
DETAILS failed to verify the output: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics sampled-record --name=top_n_cache_read_command --service-id=R3VhdmFDYWNoZS1sb2NhbA==.0 | yq e 'to_entries | with(.[] ; .value=(.value | to_entries))' -
, error:
failed to execute template: template: test:21:23: executing "test" at <.value>: wrong type for value; expected string; got interface {}
SUMMARY 10 passed
1 failed
8 skipped
actual data
- key: 0
value:
- key: name
value: getIfPresent user_1
- key: id
value: ""
- key: value
value: "5"
- key: refid
value: null
- key: 1
value:
- key: name
value: getIfPresent user_1
- key: id
value: ""
- key: value
value: "1"
- key: refid
value: null
- key: 2
value:
- key: name
value: getIfPresent user_1
- key: id
value: ""
- key: value
value: "1"
- key: refid
value: null
- key: 3
value:
- key: name
value: getIfPresent user_1
- key: id
value: ""
- key: value
value: "1"
- key: refid
value: null
- key: 4
value:
- key: name
value: getIfPresent user_1
- key: id
value: ""
- key: value
value: "0"
- key: refid
value: null
expect data :
{{- contains . }}
- key: 0
value:
{{- contains .value }}
- key: name
value: {{ notEmpty .value}}
- key: id
value:
- key: value
value: {{ notEmpty .value}}
- key: refid
value: null
{{- end }}
- key: 1
value:
{{- contains .value }}
- key: name
value: {{ notEmpty .value}}
- key: id
value:
- key: value
value: {{ notEmpty .value}}
- key: refid
value: null
{{- end }}
- key: 2
value:
{{- contains .value }}
- key: name
value: {{ notEmpty .value}}
- key: id
value:
- key: value
value: {{ notEmpty .value}}
- key: refid
value: null
{{- end }}
- key: 3
value:
{{- contains .value }}
- key: name
value: {{ notEmpty .value}}
- key: id
value:
- key: value
value: {{ notEmpty .value}}
- key: refid
value: null
{{- end }}
- key: 4
value:
{{- contains .value }}
- key: name
value: {{ notEmpty .value}}
- key: id
value:
- key: value
value: {{ notEmpty .value}}
- key: refid
value: null
{{- end }}
{{- end }}
e2e golang stack
this command is used for querying actual data
swctl --display yaml --base-url=http://127.0.0.1:53083/graphql metrics sampled-record --name=top_n_cache_read_command --service-id=R3VhdmFDYWNoZS1sb2NhbA==.0 | yq e 'to_entries | with(.[] ; .value=(.value | to_entries))' -
Actually , I don't know the grammar such as {{- contains .value }}
, {{- contains . }}
,etc .
Could you tell me what's the grammar ?
Check this as a good example, https://github.com/apache/skywalking/pull/9575. And e2e tool has a bug fixed.
I pulled e2e , and rebuild it . It's right now . Thanks .
The similar error caused by latest e2e.
map[interface{}]interface{}{
- string("codesignature"): string("test.apache.skywalking.e2e.profile.ProfileController.createAuthor:-1"),
+ string("codesignature"): string("java.lang.Thread.sleep:-2"),
string("count"): int(1),
- string("duration"): string("<wanted gt 0, but was <interface {} Value>>"),
+ string("duration"): int(0),
string("durationchildexcluded"): int(0),
string("id"): string("66"),
string("parentid"): string("65"),
},
https://github.com/pg-yang/skywalking/actions/runs/3065040570/jobs/4948863749
I think profiling has no relative to your change?
yes , it's not related with my code . but I update e2e and java agent commit id to lasted
It should be passed after rerun. e2e tool has been upgraded in master branch due to latest merged PR, all passed.
The last step has been submitted, https://github.com/apache/skywalking-showcase/pull/77.
Final work . It feels like a long time
Other PR also failed , on similar error like
- string("logs"): nil,
+ string("logs"): []interface{}{},
Which PR? That error usually means logs are not collected. Maybe some tests are not stable.
Final work . It feels like a long time
It wouldn't be easy to add a core feature at level one menu. Thanks for your patience.
Which PR? That error usually means logs are not collected. Maybe some tests are not stable.
https://github.com/apache/skywalking/pull/9632 CI was rerun by you or anyone else after it failed one time . Right ?
Yes, I did.
There was two PR need to rerun at least , so it should be common error , maybe because some tests are not stable . Anyway , it's uncomfortable experience
I don't know whether there is an error, but considering GHA provides 3u8g resources(max), keeping in stable is very hard.
Search before asking
Description
Virtual Cache will analysis span which is cache-related . It will generate SLA, response time or some else metric In a word , it's similar to Virtual Database Reference virtual-database skywalking demo
Works should to be done :
Agent (focus on Java agent)
Span’s layer
,tag
if necessary .The known cache-related plugin(need suggestion : which plugin is cache-related):
Backend
Use case
No response
Related issues
No response
Are you willing to submit a PR?
Code of Conduct