Open cachedout opened 4 years ago
After looking into this more, this seems as though it is almost certainly a problem with the way that Maven is handling the two sets of tests which are being passed to it. I cannot detect any duplication being introduced by the CI or by the test scripts which call Maven.
I don't think this is a problem with the CI or with automation. Instead, I think this needs to be investigated further by somebody on the apm-agent-java team who understands how Maven determines which tests should be executed given a single set of inputs.
Therefore, I am going to remove the automation label from this and also remove my assignment in the hopes that somebody from the dev team can pick it up.
Always happy to discuss further, though, so feel free to grab me with questions. Thanks!
@kuisathaverat can you have a look? Specifically - https://github.com/elastic/apm-agent-java/blame/master/Jenkinsfile#L126-L185, https://github.com/elastic/apm-agent-java/blob/master/scripts/jenkins/smoketests-01.sh#L9 and https://github.com/elastic/apm-agent-java/blob/master/scripts/jenkins/smoketests-02.sh#L9
Ah, I didn't realize that our team had originally put the shell scripts in to divide those up. I can keep looking at this. I'll re-add my ownership. Sorry for the noise.
Here is the repeated test:
curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[].cases[].name'|sort|uniq -c|sort
testStaticFile is one test that is repeated 8 times, but if we take a look, the name of each case is different:
curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[]|select(.cases[].name=="testStaticFile")'
Checking the names of the cases
curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[]|select(.cases[].name=="testStaticFile")|.name'|sort
I have checked a few and all the cases are the same: they are not really repeated tests. The name is the same, but they are in different classes.
I'm not sure I completely understand.
Take curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[]|select(.cases[].name=="testStaticFile")' as the example.
There, I look at each case in turn. Looking at co.elastic.apm.spring.boot.SpringBoot1_5IT, for example, I see two cases where this is the name of the case. In the first, we see:
"enclosingBlockNames": [
"Smoke Tests 02",
"Tests"
],
And in the second:
"enclosingBlockNames": [
"Smoke Tests 01",
"Tests"
],
In the first example (Smoke Tests 02), the duration is listed as "duration": 13.866 and in the second (Smoke Tests 01), the duration is listed as "duration": 13.967.
Am I looking at the correct key?
If you filter by name and group the data by test case, you can see that it is not the same test, ~the SpringBoot flavor is different.~ there are 4 SpringBoot flavors, and the tests are repeated, as you showed, in the smoke test 1 and smoke test 2 groups:
curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[]|{name: .name, block: .enclosingBlockNames,caseName: .cases[].name}|select(.caseName=="testStaticFile")'
{
"name": "co.elastic.apm.spring.boot.SpringBoot1_5IT",
"block": [
"Smoke Tests 02",
"Tests"
],
"caseName": "testStaticFile"
}
{
"name": "co.elastic.apm.spring.boot.SpringBootJettyIT",
"block": [
"Smoke Tests 02",
"Tests"
],
"caseName": "testStaticFile"
}
{
"name": "co.elastic.apm.spring.boot.SpringBootTomcatIT",
"block": [
"Smoke Tests 02",
"Tests"
],
"caseName": "testStaticFile"
}
{
"name": "co.elastic.apm.spring.boot.SpringBootUndertowIT",
"block": [
"Smoke Tests 02",
"Tests"
],
"caseName": "testStaticFile"
}
{
"name": "co.elastic.apm.spring.boot.SpringBoot1_5IT",
"block": [
"Smoke Tests 01",
"Tests"
],
"caseName": "testStaticFile"
}
{
"name": "co.elastic.apm.spring.boot.SpringBootJettyIT",
"block": [
"Smoke Tests 01",
"Tests"
],
"caseName": "testStaticFile"
}
{
"name": "co.elastic.apm.spring.boot.SpringBootTomcatIT",
"block": [
"Smoke Tests 01",
"Tests"
],
"caseName": "testStaticFile"
}
{
"name": "co.elastic.apm.spring.boot.SpringBootUndertowIT",
"block": [
"Smoke Tests 01",
"Tests"
],
"caseName": "testStaticFile"
}
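To make the listing above easier to read, here is a minimal sketch that tallies the same eight rows offline with standard tools. The rows are copied from the output above; the `|`-separated layout and the two helper names are assumptions made for illustration and are not part of the CI scripts.

```shell
#!/bin/sh
# Rows copied from the jq output above, flattened by hand to
# "<suite class>|<enclosing block>|<case name>" (layout chosen for this sketch).
rows='co.elastic.apm.spring.boot.SpringBoot1_5IT|Smoke Tests 02|testStaticFile
co.elastic.apm.spring.boot.SpringBootJettyIT|Smoke Tests 02|testStaticFile
co.elastic.apm.spring.boot.SpringBootTomcatIT|Smoke Tests 02|testStaticFile
co.elastic.apm.spring.boot.SpringBootUndertowIT|Smoke Tests 02|testStaticFile
co.elastic.apm.spring.boot.SpringBoot1_5IT|Smoke Tests 01|testStaticFile
co.elastic.apm.spring.boot.SpringBootJettyIT|Smoke Tests 01|testStaticFile
co.elastic.apm.spring.boot.SpringBootTomcatIT|Smoke Tests 01|testStaticFile
co.elastic.apm.spring.boot.SpringBootUndertowIT|Smoke Tests 01|testStaticFile'

# Count occurrences of each case name, ignoring the class (hypothetical helper).
count_by_case() {
  printf '%s\n' "$rows" | awk -F'|' '{ print $3 }' | sort | uniq -c | awk '{ print $1, $2 }'
}

# Count occurrences of each (class, case name) pair across blocks (hypothetical helper).
count_by_class_and_case() {
  printf '%s\n' "$rows" | awk -F'|' '{ print $1, $3 }' | sort | uniq -c | awk '{ print $1, $2, $3 }'
}

count_by_case
count_by_class_and_case
```

Counting by case name alone lumps four different classes together, while counting by class and case name shows each pair reported twice, once per smoke test block.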
There are 109 repeated tests out of 1090. I don't know exactly why; maybe they are repeated in some way. The smoke groups look for tests in two different folders, integration-tests and apm-agent-plugins, and try to run only the integration-test-only profile. I remember that getting the filter right was a really big headache.
curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[]|{name: .cases[].name,class: .name}|(.name + " - " + .class)'|sort|uniq -c|sort| grep -v '^\s*1\s'
2 "co.elastic.apm.agent.kafka.KafkaIT - co.elastic.apm.agent.kafka.KafkaIT"
2 "co.elastic.apm.agent.kafka.KafkaLegacyBrokerIT - co.elastic.apm.agent.kafka.KafkaLegacyBrokerIT"
2 "co.elastic.apm.agent.mongoclient.MongoClientAsyncInstrumentationIT - co.elastic.apm.agent.mongoclient.MongoClientAsyncInstrumentationIT"
2 "co.elastic.apm.servlet.WebLogicIT - co.elastic.apm.servlet.WebLogicIT"
2 "greetingShouldReturnDefaultMessage - co.elastic.apm.spring.boot.SpringBoot1_5IT"
2 "greetingShouldReturnDefaultMessage - co.elastic.apm.spring.boot.SpringBootJettyIT"
2 "greetingShouldReturnDefaultMessage - co.elastic.apm.spring.boot.SpringBootTomcatIT"
2 "greetingShouldReturnDefaultMessage - co.elastic.apm.spring.boot.SpringBootUndertowIT"
2 "testAllScenarios[JBoss jboss-eap-6/eap64-openshift] - co.elastic.apm.servlet.JBossIT"
2 "testAllScenarios[JBoss jboss-eap-7/eap70-openshift] - co.elastic.apm.servlet.JBossIT"
2 "testAllScenarios[JBoss jboss-eap-7/eap71-openshift] - co.elastic.apm.servlet.JBossIT"
2 "testAllScenarios[JBoss jboss-eap-7/eap72-openshift] - co.elastic.apm.servlet.JBossIT"
2 "testAllScenarios[Jetty 9.2] - co.elastic.apm.servlet.JettyIT"
2 "testAllScenarios[Jetty 9.3] - co.elastic.apm.servlet.JettyIT"
2 "testAllScenarios[Jetty 9.4] - co.elastic.apm.servlet.JettyIT"
2 "testAllScenarios[Payara 4.181] - co.elastic.apm.servlet.PayaraIT"
2 "testAllScenarios[Payara 5.182] - co.elastic.apm.servlet.PayaraIT"
2 "testAllScenarios[Tomcat 7-jre7-slim] - co.elastic.apm.servlet.TomcatIT"
2 "testAllScenarios[Tomcat 8.5-jre8-slim] - co.elastic.apm.servlet.TomcatIT"
2 "testAllScenarios[Tomcat 8.5.0-jre8] - co.elastic.apm.servlet.TomcatIT"
2 "testAllScenarios[Tomcat 9-jre10-slim] - co.elastic.apm.servlet.TomcatIT"
2 "testAllScenarios[Tomcat 9-jre11-slim] - co.elastic.apm.servlet.TomcatIT"
2 "testAllScenarios[Tomcat 9-jre9-slim] - co.elastic.apm.servlet.TomcatIT"
2 "testAllScenarios[WebSphere 8.5.5] - co.elastic.apm.servlet.WebSphereIT"
2 "testAllScenarios[WebSphere webProfile7] - co.elastic.apm.servlet.WebSphereIT"
2 "testAllScenarios[Wildfly 10.0.0.Final] - co.elastic.apm.servlet.WildFlyIT"
2 "testAllScenarios[Wildfly 11.0.0.Final] - co.elastic.apm.servlet.WildFlyIT"
2 "testAllScenarios[Wildfly 12.0.0.Final] - co.elastic.apm.servlet.WildFlyIT"
2 "testAllScenarios[Wildfly 13.0.0.Final] - co.elastic.apm.servlet.WildFlyIT"
2 "testAllScenarios[Wildfly 14.0.0.Final] - co.elastic.apm.servlet.WildFlyIT"
2 "testAllScenarios[Wildfly 15.0.0.Final] - co.elastic.apm.servlet.WildFlyIT"
2 "testAllScenarios[Wildfly 16.0.0.Final] - co.elastic.apm.servlet.WildFlyIT"
2 "testAllScenarios[Wildfly 8.2.1.Final] - co.elastic.apm.servlet.WildFlyIT"
2 "testAllScenarios[Wildfly 9.0.0.Final] - co.elastic.apm.servlet.WildFlyIT"
2 "testBodyCaptureEnabled - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testCountCollection - co.elastic.apm.agent.mongoclient.MongoClientSyncInstrumentationIT"
2 "testCreateAndDeleteIndex[Async=false] - co.elastic.apm.agent.es.restclient.v5_6.ElasticsearchRestClientInstrumentationIT"
2 "testCreateAndDeleteIndex[Async=false] - co.elastic.apm.agent.es.restclient.v6_4.ElasticsearchRestClientInstrumentationIT"
2 "testCreateAndDeleteIndex[Async=true] - co.elastic.apm.agent.es.restclient.v5_6.ElasticsearchRestClientInstrumentationIT"
2 "testCreateAndDeleteIndex[Async=true] - co.elastic.apm.agent.es.restclient.v6_4.ElasticsearchRestClientInstrumentationIT"
2 "testCreateCollection - co.elastic.apm.agent.mongoclient.MongoClientSyncInstrumentationIT"
2 "testCreateDocument - co.elastic.apm.agent.mongoclient.MongoClientSyncInstrumentationIT"
2 "testDeleteDocument - co.elastic.apm.agent.mongoclient.MongoClientSyncInstrumentationIT"
2 "testDestinationAddressCollectionDisabled - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testDocumentScenario[Async=false] - co.elastic.apm.agent.es.restclient.v5_6.ElasticsearchRestClientInstrumentationIT"
2 "testDocumentScenario[Async=false] - co.elastic.apm.agent.es.restclient.v6_4.ElasticsearchRestClientInstrumentationIT"
2 "testDocumentScenario[Async=true] - co.elastic.apm.agent.es.restclient.v5_6.ElasticsearchRestClientInstrumentationIT"
2 "testDocumentScenario[Async=true] - co.elastic.apm.agent.es.restclient.v6_4.ElasticsearchRestClientInstrumentationIT"
2 "testFindDocument - co.elastic.apm.agent.mongoclient.MongoClientSyncInstrumentationIT"
2 "testHeaderCaptureDisabled - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testHeaderSanitation - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testIgnoreTopic - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testLettuce[[biz.paluch.redis:lettuce:4.0.2.Final, io.netty:netty-all:4.0.30.Final]] - co.elastic.apm.agent.redis.lettuce.Lettuce4VersionsIT"
2 "testLettuce[[biz.paluch.redis:lettuce:4.1.2.Final, io.netty:netty-all:4.0.34.Final, org.latencyutils:LatencyUtils:2.0.3]] - co.elastic.apm.agent.redis.lettuce.Lettuce4VersionsIT"
2 "testLettuce[[biz.paluch.redis:lettuce:4.2.2.Final, io.netty:netty-all:4.0.40.Final]] - co.elastic.apm.agent.redis.lettuce.Lettuce4VersionsIT"
2 "testLettuce[[biz.paluch.redis:lettuce:4.3.3.Final, io.netty:netty-all:4.1.13.Final]] - co.elastic.apm.agent.redis.lettuce.Lettuce4VersionsIT"
2 "testLettuce[[biz.paluch.redis:lettuce:4.4.6.Final, io.netty:netty-all:4.1.24.Final]] - co.elastic.apm.agent.redis.lettuce.Lettuce4VersionsIT"
2 "testLettuce[[biz.paluch.redis:lettuce:4.5.0.Final, io.netty:netty-all:4.1.29.Final]] - co.elastic.apm.agent.redis.lettuce.Lettuce4VersionsIT"
2 "testLettuce[[io.lettuce:lettuce-core:5.0.5.RELEASE, io.netty:netty-all:4.1.28.Final]] - co.elastic.apm.agent.redis.lettuce.Lettuce5VersionsIT"
2 "testLettuce[[io.lettuce:lettuce-core:5.1.8.RELEASE, io.netty:netty-all:4.1.38.Final]] - co.elastic.apm.agent.redis.lettuce.Lettuce5VersionsIT"
2 "testLettuce[[io.lettuce:lettuce-core:5.2.1.RELEASE, io.netty:netty-all:4.1.43.Final]] - co.elastic.apm.agent.redis.lettuce.Lettuce5VersionsIT"
2 "testListCollections - co.elastic.apm.agent.mongoclient.MongoClientSyncInstrumentationIT"
2 "testScenarioAsBulkRequest[Async=false] - co.elastic.apm.agent.es.restclient.v5_6.ElasticsearchRestClientInstrumentationIT"
2 "testScenarioAsBulkRequest[Async=false] - co.elastic.apm.agent.es.restclient.v6_4.ElasticsearchRestClientInstrumentationIT"
2 "testScenarioAsBulkRequest[Async=true] - co.elastic.apm.agent.es.restclient.v5_6.ElasticsearchRestClientInstrumentationIT"
2 "testScenarioAsBulkRequest[Async=true] - co.elastic.apm.agent.es.restclient.v6_4.ElasticsearchRestClientInstrumentationIT"
2 "testSendTwoRecords_IterableFor - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testSendTwoRecords_IterableForEach - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testSendTwoRecords_IterableSpliterator - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testSendTwoRecords_PartiallyIterate - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testSendTwoRecords_RecordListIterableFor - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testSendTwoRecords_RecordListIterableForEach - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testSendTwoRecords_RecordListSubList - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testSendTwoRecords_RecordsIterable - co.elastic.apm.agent.kafka.KafkaLegacyClientIT"
2 "testStaticFile - co.elastic.apm.spring.boot.SpringBoot1_5IT"
2 "testStaticFile - co.elastic.apm.spring.boot.SpringBootJettyIT"
2 "testStaticFile - co.elastic.apm.spring.boot.SpringBootTomcatIT"
2 "testStaticFile - co.elastic.apm.spring.boot.SpringBootUndertowIT"
2 "testTryToDeleteNonExistingIndex[Async=false] - co.elastic.apm.agent.es.restclient.v5_6.ElasticsearchRestClientInstrumentationIT"
2 "testTryToDeleteNonExistingIndex[Async=false] - co.elastic.apm.agent.es.restclient.v6_4.ElasticsearchRestClientInstrumentationIT"
2 "testTryToDeleteNonExistingIndex[Async=true] - co.elastic.apm.agent.es.restclient.v5_6.ElasticsearchRestClientInstrumentationIT"
2 "testTryToDeleteNonExistingIndex[Async=true] - co.elastic.apm.agent.es.restclient.v6_4.ElasticsearchRestClientInstrumentationIT"
2 "testUpdateDocument - co.elastic.apm.agent.mongoclient.MongoClientSyncInstrumentationIT"
2 "testVersions[2.4.0] - co.elastic.apm.agent.kafka.KafkaClientVersionsIT"
2 "testVersions[3.0.4] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.1.1] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.10.2] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.11.1] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.2.2] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.3.0] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.4.3] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.5.0] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.6.4] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.7.1] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "testVersions[3.8.2] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
2 "test[c3p0] - co.elastic.apm.agent.jdbc.DataSourceIT"
2 "test[dbcp2] - co.elastic.apm.agent.jdbc.DataSourceIT"
2 "test[dbcp] - co.elastic.apm.agent.jdbc.DataSourceIT"
2 "test[druid] - co.elastic.apm.agent.jdbc.DataSourceIT"
2 "test[hikari] - co.elastic.apm.agent.jdbc.DataSourceIT"
2 "test[jdbc:tc:db2:11.5.0.0a://hostname/databasename db2] - co.elastic.apm.agent.jdbc.JdbcDbIT"
2 "test[jdbc:tc:mariadb:10://hostname/databasename mariadb] - co.elastic.apm.agent.jdbc.JdbcDbIT"
2 "test[jdbc:tc:mysql:5://hostname/databasename mysql] - co.elastic.apm.agent.jdbc.JdbcDbIT"
2 "test[jdbc:tc:postgresql:10://hostname/databasename postgresql] - co.elastic.apm.agent.jdbc.JdbcDbIT"
2 "test[jdbc:tc:postgresql:9://hostname/databasename postgresql] - co.elastic.apm.agent.jdbc.JdbcDbIT"
2 "test[jdbc:tc:sqlserver:2017-CU12://hostname/databasename sqlserver] - co.elastic.apm.agent.jdbc.JdbcDbIT"
2 "test[tomcat] - co.elastic.apm.agent.jdbc.DataSourceIT"
2 "test[vibur] - co.elastic.apm.agent.jdbc.DataSourceIT"
4 "testVersions[3.9.0] - co.elastic.apm.agent.mongoclient.MongoClientSyncVersionIT"
There are 154 tests in the Smoke 01 group, 144 in the Smoke 02 group, and 792 in the unit test group:
curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[]|select(.enclosingBlockNames[0]=="Smoke Tests 01")|.cases[].name'|sort|wc -l
curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[]|select(.enclosingBlockNames[0]=="Smoke Tests 02")|.cases[].name'|sort|wc -l
curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[]|select(.enclosingBlockNames[0]=="Unit Tests")|.cases[].name'|sort|wc -l
curl https://apm-ci.elastic.co/job/apm-agent-java/job/apm-agent-java-mbp/job/master/218/testReport/api/json |jq '.suites[]|select(.enclosingBlockNames[0]!="Smoke Tests 02" and .enclosingBlockNames[0]!="Smoke Tests 01" and .enclosingBlockNames[0]!="Unit Tests")|.cases[].name'|sort|wc -l
> There are 109 test repeated of 1090
IMHO, this number is low enough that it may not be worth the time to investigate further why this is happening.
@eyalkoren Does this have any impact on the team's development process other than seeing potentially duplicated tests in smoketest results? Totally happy to have us keep digging here but I want to make sure we're using our time wisely. :)
It is not a low number, as all the repeated ones come from the smoke tests, meaning that instead of running 189 of those, we run 298. More importantly, those are the ones that actually take most of the time, as they are all slow integration tests.
Regarding the impact on the team: the shorter it runs, the better for us, so if we reduce from ~45 minutes to ~25 minutes, this is meaningful. Otherwise, we don't need smokeTests1 and smokeTests2, as there is no parallelization; we can just have one smokeTests to run them all...
If a bit more investigation does not find out anything, I think we can run all smoke tests as one phase.
> Regarding the impact on the team- the shortest it runs, the better for us, so if we reduce from ~45 minutes to ~25 minutes, this is meaningful. Otherwise, we don't need smokeTests1 and smokeTests2 as there is no parallelization, we can just have one smokeTests to run them all...
There are 154 tests in the Smoke 01 group, 144 in the Smoke 02 group, and 792 in the unit test group. We are running 109 tests twice in parallel, so we run (154 - 109 = 45) + (144 - 109 = 35) = 80 tests in parallel that would not run in parallel if we ran the smoke tests directly.
The solution is to categorize the tests with annotations so that we have a proper filter by using profiles.
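The counts quoted in this thread can be cross-checked with a few lines of shell arithmetic. This is only a sketch under one simplifying assumption: each of the 109 repeated tests is counted once in Smoke 01 and once in Smoke 02 (the listing above shows at least one entry with a count of 4, so the real picture may differ slightly). The variable names are made up for this example.

```shell
#!/bin/sh
# Figures quoted in this thread for build 218.
smoke01=154
smoke02=144
unit=792
repeated=109   # assumed here to appear once in each smoke group

total=$((smoke01 + smoke02 + unit))        # all reported test cases
smoke_runs=$((smoke01 + smoke02))          # smoke test executions today
smoke_unique=$((smoke_runs - repeated))    # executions if nothing ran twice
only01=$((smoke01 - repeated))             # tests seen only in Smoke 01
only02=$((smoke02 - repeated))             # tests seen only in Smoke 02

echo "total=$total smoke_runs=$smoke_runs smoke_unique=$smoke_unique"
echo "only01=$only01 only02=$only02 non_shared=$((only01 + only02))"
```

Under that assumption the numbers line up with both readings in the thread: 298 smoke executions versus 189 distinct smoke tests, and 45 + 35 = 80 tests that belong to only one group.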
Let me just update this issue really quickly before I close up shop for this evening. :)
I tried to remove the parallelism for the smoke tests over in https://github.com/elastic/apm-agent-java/pull/1014. This ended up not producing any interesting results because we didn't have enough logging enabled to be able to tell when tests are actually being executed. I have removed the -q flag from Maven and I'm starting another test run.
Related to an earlier point, the first time I tried to run the smoke tests sequentially, the entire test suite timed out after an hour which leads me to believe, at least anecdotally, that removing the current configuration in favor of a single smoke test execution might increase rather than decrease the execution time overall.
More news to follow. :)
> Related to an earlier point, the first time I tried to run the smoke tests sequentially, the entire test suite timed out after an hour which leads me to believe, at least anecdotally, that removing the current configuration in favor of a single smoke test execution might increase rather than decrease the execution time overall.
I am sure that running a single smoke test execution is not going to reduce time spent on testing. I may be wrong, but I think that the bulk of smoke tests 1 and 2 is spent on the same tests. (BTW, thinking of 80 tests running in parallel makes it seem effective, but another way to look at it is that if we cancel parallelism and run in one step, it would add 35 tests to smoke tests 1.) What I tried to say is that we spend a lot (maybe most) of one Jenkins node's time on duplicated tests. I DO really want this to be parallelised, but made more efficient, if possible.
I am still not sure whether this duplication is only in the reports or in the actual run, though, but it seems you think there is real duplication. The fact that the tests have similar but different durations supports that.
These are the selection criteria:
Smoke tests 1: MOD=$(find apm-agent-plugins -maxdepth 1 -mindepth 1 -type d|grep -v "target"|tr "\n" ",")
Smoke tests 2: MOD=$(find integration-tests -maxdepth 1 -mindepth 1 -type d|grep -v "target"|tr "\n" ",")
The idea is that tests are picked based on our maven modules apm-agent-plugins (where you would see technologies-related tests, like KafkaIT, JDBC..IT, Mongo..IT etc.) and integration-tests (where you would see servlet container tests, like TomcatIT, JBossIT etc., as well as Spring...IT).
For some reason, it seems like they are not selected as intended. When running these find commands on the project root, the selection seems to work as expected.
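As a rough way to reproduce that check outside the repository, the sketch below runs the same `find | grep | tr` pipeline against a throwaway directory tree. The module names (`plugin-a`, `plugin-b`, `app-a`) and the `select_modules` helper are hypothetical placeholders; only the pipeline itself comes from the selection criteria above.

```shell
#!/bin/sh
# Build a throwaway tree that mimics the two top-level folders.
# The module names below are hypothetical placeholders.
root=$(mktemp -d)
mkdir -p "$root/apm-agent-plugins/plugin-a" "$root/apm-agent-plugins/plugin-b" \
         "$root/apm-agent-plugins/target" \
         "$root/integration-tests/app-a" "$root/integration-tests/target"

# Same pipeline as the selection criteria above, run from the mock project root.
select_modules() {
  ( cd "$root" && find "$1" -maxdepth 1 -mindepth 1 -type d | grep -v "target" | tr "\n" "," )
}

MOD_01=$(select_modules apm-agent-plugins)
MOD_02=$(select_modules integration-tests)
echo "Smoke tests 1: $MOD_01"
echo "Smoke tests 2: $MOD_02"
rm -rf "$root"
```

On this mock tree the two lists do not overlap and the `target` directories are filtered out, which fits the observation that the `find` step on its own separates the modules. Note that `tr` leaves a trailing comma on each list.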
The line that runs the test is: ./mvnw -q -Dmaven.javadoc.skip=true -am -amd -pl ${MOD} -P integration-test-only verify.
I suspect that the -am/-amd flags are messing these selections up due to module dependencies, so maybe try to look into that.
I don't know much about the usage of maven profiles (as in -P integration-test-only), but I can't see how this may be the cause.
I think that what is happening is that the -am flag is resolving to the same dependencies in both sets of tests. Compare the following, both adapted from what we do for the smoketests-02.sh script.
Here is the output of ./mvnw dependency:tree -am -pl $(find integration-tests -maxdepth 1 -mindepth 1 -type d|grep -v "target"|tr "\n" ",") with the -am flag: https://gist.github.com/cachedout/e8654c395f255ddb08fa91b114f8310e
Here is the output of the same command with the -am flag removed: https://gist.github.com/cachedout/935e1fcbf4e583de371092fa77da0a3a
The second gist is what I would expect correct isolation to look like. As one can see in the first example, dependencies resolve all the way into the various apm agent plugins, which correlates with what we see in the test results. For an easy example, search for mongodb in both and note its resolution in the first output but not the second.
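The effect described above can be illustrated with a toy model. Maven's `-am` (also-make) adds the projects required by the modules passed to `-pl`, so the build set is the transitive closure over module dependencies. The sketch below walks such a closure over a small, entirely hypothetical dependency graph; the module names, the intermediate `apm-agent` edge, and the `also_make` helper are assumptions for illustration and do not reproduce the real reactor.

```shell
#!/bin/sh
# Hypothetical "module:required-module" edges; the real reactor graph differs.
edges='integration-tests/app-a:apm-agent
apm-agent:apm-agent-plugins/plugin-a
apm-agent:apm-agent-plugins/plugin-b'

# Print the selected module plus everything it transitively requires,
# which is roughly the set that -pl <module> -am adds to the build.
also_make() {
  printf '%s\n' "$edges" | awk -F: -v start="$1" '
    { deps[$1] = deps[$1] " " $2 }          # adjacency list per module
    END {
      n = 1; queue[1] = start; seen[start] = 1
      for (i = 1; i <= n; i++) {            # breadth-first walk
        print queue[i]
        m = split(deps[queue[i]], d, " ")
        for (j = 1; j <= m; j++)
          if (!(d[j] in seen)) { seen[d[j]] = 1; queue[++n] = d[j] }
      }
    }' | sort
}

also_make integration-tests/app-a
```

In this toy graph, selecting a single integration-tests module pulls both plugin modules into the build, so a `verify` run would also execute their integration tests. That is one plausible reading of why plugin tests show up in both smoke groups.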
Unfortunately, removing the -am flag actually causes test failures: https://apm-ci.elastic.co/blue/organizations/jenkins/apm-agent-java%2Fapm-agent-java-mbp/detail/PR-1018/3/tests
I agree with @kuisathaverat that the best long-term solution here would likely be to annotate the tests into groups. In the meantime, @eyalkoren, I can pull out the parallelization and just run everything as a single phase as you suggested.
Please let me know if you would like me to go ahead and do that. :)
Thanks!
@cachedout thanks for the investigation, this is very useful! The good news is that we can considerably reduce the build time with the same number of Jenkins nodes/jobs.
I think there is no need to do anything now, except deciding how we annotate and moving forward in this direction. Do you have a proposal for that? Technically, how do you propose to go about it: through the poms?
This is @mdelapenya's call. Long story short, we have to categorize the tests to be able to run them in parallel by category.
> we have to categorize the tests to can run them in parallel by category.
OK, let's do that. Let us know what we should do.
> Let us know what we should do.
I think that broadly speaking, the steps are going to be as follows. (I've adapted these from poking around a bit in the hbase repo, which seems to have done this pretty well.)
1. Find a place to store test annotations. Here are some examples from hbase.
2. Create these annotations, corresponding to the original groups or new groups, if you like.
3. Apply annotations to test classes using @Category. Example, with pseudo-code:
import org.junit.experimental.categories.Category;
import my.path.to.annotations.SmokeTests01;
@Category(SmokeTests01.class)
public class MongoClientSyncInstrumentationIT extends AbstractMongoClientInstrumentationTest {
<snip>
4. Configure the groups in the <configuration> block of the Surefire plugin. Here is an arbitrary example of the needed elements:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <excludedGroups>my.path.to.annotations.SmokeTests01</excludedGroups>
  </configuration>
</plugin>
5. Select a group at run time by passing the -Dgroups=my.path.to.annotations.SmokeTests01 flag to Maven.
We may need to adjust course depending on what we find, but hopefully this should give you a clearer idea of how I think this should probably be done.
... and will support the parallelisation of the tests in Jenkins, as @kuisathaverat mentioned!
In this sense, we are here ready to help in the definition of the test suites/categories! Please be our guests!
Oops. I didn't mean to close this one entirely.
Credit
Credit goes to @eyalkoren for the discovery of this issue.
Overview
It appears that certain tests are duplicated across sets of smoke tests. For example, take the greetingShouldReturnDefaultMessage test. This test is located in the integrations/test directory, and can be seen here.
If one examines a test run by opening a run and going to the Blue Ocean page, they can see that the same test appears to have run as a part of both the Smoke Tests 01 group and the Smoke Tests 02 group.
Details
Tests are segmented through the use of simple bash scripts. The scripts which provide this segmentation are below:
Smoke Tests 01
Smoke Tests 02 (https://github.com/elastic/apm-agent-java/blob/master/scripts/jenkins/smoketests-02.sh)
Each script contains a preparatory step to generate the modules to be tested. Below is the command from each:
smoketest01
find apm-agent-plugins -maxdepth 1 -mindepth 1 -type d|grep -v "target"
smoketest02
find integration-tests -maxdepth 1 -mindepth 1 -type d|grep -v "target"
Reasoning
It appears as though the preparatory step correctly separates the two sets of tests but that Maven may be running the same test(s) in each instance. Maven is executed with the following parameters:
./mvnw -q -Dmaven.javadoc.skip=true -am -amd -pl ${MOD} -P integration-test-only verify
(where $MOD is the set of modules from each preparatory step.)
Could, perhaps, the -am or the -amd flag inadvertently be triggering a larger set of tests? Or is something in each set of tests linking to the other set through a dependency of some kind?