thelastpickle / cassandra-reaper

Automated Repair Awesomeness for Apache Cassandra
http://cassandra-reaper.io/
Apache License 2.0
487 stars 217 forks source link

Get upgrade integration tests to run using `DropwizardTestSupport` inside classloaders and reflection #630

Open michaelsembwever opened 5 years ago

michaelsembwever commented 5 years ago

Project board link

It should be possible to run:

mvn -Pintegration-upgrade-tests -Dtest=ReaperCassandraIT

but trying to start the jetty/jersey server via reflection ( https://github.com/thelastpickle/cassandra-reaper/blob/master/src/server/src/test/java/io/cassandrareaper/acceptance/ReaperTestJettyRunner.java#L171 ) throws a class cast exception (around jboss' AutoDiscoverable).

When this is fixed then the upgrade integration tests can be uncommented and run again in both travis and circleCI.

The requirement to the upgrade integration tests is that the previous versions of ReaperApplication (inside a DropwizardTestSupport) is run via separated classloaders. The classloaders need to be child-first, as opposed to parent-first, for this to work. The upgrade integration tests run the cucumber scenarios against the past version of a ReaperApplication test server, and then mid-scenario replaces the ReaperApplication test server with once from the test classpath. The upgrade integration tests ensure all normal Reaper operations complete successfully while Reaper is upgraded.

┆Issue is synchronized with this Jira Story by Unito

gitcoinbot commented 5 years ago

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 0.998 ETH (137.94 USD @ $138.21/ETH) attached to it.

aahutsal commented 5 years ago

@michaelsembwever I've managed to build successfully (the only thing bower been needed to be installed globally through npm i -g bower. That should be stated somewhere in docs).

Seems your issue explained here: https://stackoverflow.com/questions/37145127/java-lang-classcastexception-org-glassfish-jersey-jackson-internal-jacksonautod

so, going to play with it once @gitcoinbot allow me to bid on

michaelsembwever commented 5 years ago

@michaelsembwever I've managed to build successfully (the only thing bower been needed to be installed globally through npm i -g bower. That should be stated somewhere in docs).

Indeed. We've all been using bower for so long we're forgotten this is an external install. This shortcoming in the docs comes from when we made building the UI a mandatory part of the build in https://github.com/thelastpickle/cassandra-reaper/pull/415

Seems your issue explained here: https://stackoverflow.com/questions/37145127/java-lang-classcastexception-org-glassfish-jersey-jackson-internal-jacksonautod

Nice find. Note this might not be the only issue. Did you manage to get upgrade integration tests running against 1.2.2, 1.3.0, and 1.4.1 ( https://github.com/thelastpickle/cassandra-reaper/blob/master/src/server/pom.xml#L41-L43 ). The output of mvn -Pintegration-upgrade-tests -Dtest=ReaperCassandraIT should show each scenario run against each of these versions. And did you have any trouble setting up ccm so that the jmx calls from the embedded Reaper server were connecting to it? (this is a typical stumbling block for many trying to run the integration tests locally)

so, going to play with it once @gitcoinbot allow me to bid on

this is the first time for me using gitcoin. to my best understanding you have to Request to Work and that will send me an email notification to which I have to approve ??

aahutsal commented 5 years ago

@michaelsembwever this is the first time I"m using gitcoin too. Yeah, I've to Request to Work it seems.

gitcoinbot commented 5 years ago

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work has been started.

These users each claimed they can complete the work by 1 week from now. Please review their action plans below:

1) gutsal-arsen has been approved to start work.

@michaelsembwever I've managed to build successfully (the only thing bower been needed to be installed globally through npm i -g bower. That should be stated somewhere in docs).

Seems your issue explained here: https://stackoverflow.com/questions/37145127/java-lang-classcastexception-org-glassfish-jersey-jackson-internal-jacksonautod

so, going to play with it once @gitcoinbot allow me to bid on

Learn more on the Gitcoin Issue Details page.

aahutsal commented 5 years ago

@michaelsembwever I've managed to build and run tests, but it seems fails: image

What I've done is:

  1. pull cassandra:latest via docker pull cassandra
  2. run and bind cassandra to localhost:9042 via: docker run --name some-cassandra -ti --rm -p9042:9042 cassandra:latest
  3. run server via cd src/server; java -jar target/cassandra-reaper-1.5.0-SNAPSHOT.jar server target/test-classes/cassandra-reaper.yaml
  4. run ReaperCassandraIT test via mvn test -Pintegration-upgrade-tests -Dtest=ReaperCassandraIT (by the way you missed "test" goal in your mvn -Pintegration-upgrade-tests -Dtest=ReaperCassandraIT in your message above)

Got the error above. I assume it's some misconfiguration. Since there's no much documentation on how to run the stuff, I need your help.

michaelsembwever commented 5 years ago

@gutsal-arsen correct that's a misconfiguration/incorrect setup of the Cassandra service.

Offline (as I'll be asleep the next 8hrs), the easiest way to help you would be to, look into how both travis and circleCi setups up cassandra using CCM. you can also use a local Cassandra server. but it's important that jmx AND cql connections are established. and taking the ccm route will give you better parity with the CI builds.

You don't need to run the reaper service (what you did with java -jar target/cassandra-reaper-1.5.0-SNAPSHOT.jar server target/test-classes/cassandra-reaper.yaml), as that happens inside the integration tests (see ReaperTestJettyRunner)

aahutsal commented 5 years ago

@michaelsembwever I've build Docker container for tests and trying to reproduce your issue. In the meantime could you explain the reason JMX connection fails on numerous tests attempting to connect to port 7300,7400 at 127.0.0.{2,3,4}? JMX is not running on these IPs, is that a bug or some misconfiguration from at my side again?

Causing: java.lang.RuntimeException: io.cassandrareaper.ReaperException: Failure when establishing JMX connection to 127.0.0.3:7300
at io.cassandrareaper.jmx.JmxConnectionFactory$JmxConnectionProvider.apply(JmxConnectionFactory.java:242)                                      
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)                                                         
at io.cassandrareaper.jmx.JmxConnectionFactory.connectImpl(JmxConnectionFactory.java:103) 
michaelsembwever commented 5 years ago

For the non-127.0.0.1 nodes you need to set environment variable as LOCAL_JMX=no This is otherwise defined in Cassandra's cassandra-env.sh. Also see https://github.com/thelastpickle/cassandra-reaper/blob/master/src/server/src/test/resources/README.md

gitcoinbot commented 5 years ago

@gutsal-arsen Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

gitcoinbot commented 5 years ago

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


@gutsal-arsen due to inactivity, we have escalated this issue to Gitcoin's moderation team. Let us know if you believe this has been done in error!

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

aahutsal commented 5 years ago

@gitcoinbot I do have a question. It still shows me 14 days I need to complete that issue. Am I right?@michaelsembwever? If needed urgent solution, why time to complete been set to 30 days? Both, please, clarify.

aahutsal commented 5 years ago

@michaelsembwever I'm assured I can complete. Just stuck with initial setup and snoozed to later time.

michaelsembwever commented 5 years ago

fixed (snoozed) @gutsal-arsen . if you're making progress i'm happy. i don't know how to change the "Time Left" field on the gitcoin page, will ask support.

aahutsal commented 5 years ago

@michaelsembwever looks like it's hard to mimic circleci logic fully, so I'll probably run circleci --local every time testing source changes. Since env preparation takes too much time, may I use Docker layer caching?

aahutsal commented 5 years ago

@michaelsembwever could you please help with that?

image

This happens on circleci local execute --config .circleci/config.yml

michaelsembwever commented 5 years ago

Since env preparation takes too much time, may I use Docker layer caching?

I don't know enough about Docker layer caching. Is it a premium feature? Otherwise, anything to speed up the CI builds would be much appreciated. Can we put this in as a separate issue/pr?

michaelsembwever commented 5 years ago

circleci local execute --config .circleci/config.yml worked for me.

The rat plugin ensures all files have appropriate license headers. If you have added new files make sure they have the license header. Otherwise, check out the contents of /home/circleci/project/target/rat.txt to see what the error is.

aahutsal commented 5 years ago

Docker layer caching means that all dependencies will be downloaded only once (that in theory could greatly speed up build process for next builds). Asking coz running circleci local execute --config .circleci/config.yml without it still takes lot of time due to dependency download.

Haven't added anything. Just merged to current master (that has been updated since last time I forked). I'll try circleci local execute --config .circleci/config.yml on original master and see the difference. Will let you know.

I'm off till Monday.

aahutsal commented 5 years ago

@michaelsembwever I've seems discovered the problem. This actually is not reflection problem, this is configuration problem:

image

Any thoughts?

michaelsembwever commented 5 years ago

This actually is not reflection problem, this is configuration problem.

This is a real problem. Reaper supports backward compatibility of the backend storage, but does not support forward (or any) compatibility of the configuration file. So there's no need for the test to be validating this. I will fix this for you.

aahutsal commented 5 years ago

@michaelsembwever Plz, lemme know what commit to merge into my fork.

michaelsembwever commented 5 years ago

@gutsal-arsen The trick here is for the tests to provide the forward-compatibility on the yaml configuration to tell the jackson object mapper not to fail when deserialising unknown properties… This is done with a call to bootstrap.getObjectMapper().disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES). But this has to be done via reflection in ReaperTestJettyRunner.

See https://github.com/thelastpickle/cassandra-reaper/pull/642

aahutsal commented 5 years ago

ClassCastException been caused by different versions of spidiscovery.internal.MetaInfServicesAutoDiscoverable. One version is packaged into cassandra-reaper-x.x.x.jar, another version comes with jersey-metainf-services.jar

Same for validation.internal.ValidationAutoDiscoverable, wadl.internal.WadlAutoDiscoverable, jaxb.internal.JaxbAutoDiscoverable, logging.internal.LoggingFeatureAutoDiscoverable, filter.internal.ServerFiltersAutoDiscoverable

So far, so good. Excluded jersey-metainf-services, jersey-bean-validation and stuck. WadlAutoDiscoverable does not come as separate jar, it is the part of jersey-server, thus can not be excluded. I think at this point you should answer a) what's the reason to include *AutoDiscoverable classes into cassandra-reaper? b) can you re-package cassandra-reaper to exclude them?

Don't think ClassCastException could be solved other way #641

image

michaelsembwever commented 5 years ago

I think at this point you should answer a) what's the reason to include *AutoDiscoverable classes into cassandra-reaper? b) can you re-package cassandra-reaper to exclude them?

We can't change the already previously released versions of cassandra-reaper. This would only be something we could do moving forward. If Reaper still runs ok without it then it should be excluded anyway.

One version is packaged into cassandra-reaper-x.x.x.jar, another version comes with jersey-metainf-services.jar

In the beginning of ReaperJettyTestSupport's constructor can we classload the correct WadlAutoDiscoverable and AutoDiscoverable using the appClass's classloader? Or are some of these classes already loaded in the current classloader by the static/import references? Is the a way to exclude them from the test classpath instead?

aahutsal commented 5 years ago

@michaelsembwever ClassCastException happens somewhere in Dropwizard. We can guess what to load now. But tomorrow there will be another situation and our guess will fail. This is not the solution, man.

michaelsembwever commented 5 years ago

But tomorrow there will be another situation and our guess will fail.

"Tomorrow" depends on us upgrading the version of dropwizard-testing. I'm fine with that limitation. And this is the approach I think I favour. It makes sense too, everything related to the server has to be loaded through the versioned classloader. That implies that those classes can be referenced and loaded in the normal test classloader.

Another approach could be to not load certain packages through the ChildURLClassLoader. For example, always loading AutoDiscoverable from the system classloader. But I don't know if that approach will work or just become a rabbit hole.

aahutsal commented 5 years ago

I've finished my research. Considered 2 ways to accomplish that.

  1. Exclude JARs that contains duplicated classes from surefire classpath. Worked well initially, but stuck at this point. If I keep excluding JARs we get ClassNotFound exception. So, that's not the solution.
  2. Another solution would be to use custom ClassLoader as you started to do, but this is not going to work as well, as Jersey/Dropwizard escapes from the sandbox in some places calling ClassLoader.getSystemClassloader() like it happens here. Once this happens incorrect versions of classes are loaded and we get ClassCastException again.
gitcoinbot commented 5 years ago

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work for 0.998 ETH (163.34 USD @ $163.67/ETH) has been submitted by:

  1. @gutsal-arsen

@michaelsembwever please take a look at the submitted work:


gitcoinbot commented 5 years ago

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


The funding of 0.998 ETH (163.5 USD @ $163.82/ETH) attached to this issue has been approved & issued to @gutsal-arsen.

aahutsal commented 5 years ago

@michaelsembwever has anything changed on that issue, Mick?