juju-solutions / layer-cwr

Layer for building the Juju Jenkins CI env

Support publishing test results to an S3 bucket. #99

Open · seman opened 7 years ago

kwmonroe commented 7 years ago

Looks like the cwr-charm-commit action gained S3 support in cwr-70. As part of this issue, please document these new action params in the bundle:

https://github.com/juju-solutions/bundle-cwr-ci#workflows

kwmonroe commented 7 years ago

Ya know, the more I think about this, the more I don't like per-action S3 options. They add to the already-ridiculous number of action params.

What do people think of adding a new action (e.g. "cwr-push-results") that takes creds and a location and simply pushes /srv/artifacts wherever the user wants?

The immediate objection would be that you'd have to run a new action every time a job finishes. To that, I say follow the current pattern and make a Jenkins job that corresponds to this new action. Then we could use the already-enabled Post-Build Script plugin to run it after any job completes.
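
For illustration, here's a minimal sketch of what such an action's script could look like, assuming a hypothetical cwr-push-results action with the creds and location inputs mentioned above, and the aws CLI available on the unit:

#!/bin/bash
# actions/cwr-push-results (hypothetical): sync everything under
# /srv/artifacts to wherever the user points us.
set -e

location=$(action-get location)   # e.g. s3://mybucket/$HOSTNAME
creds=$(action-get creds)         # path to an AWS credentials file

export AWS_SHARED_CREDENTIALS_FILE="$creds"
aws s3 sync /srv/artifacts "$location" || action-fail "sync to $location failed"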

seman commented 7 years ago

Are you thinking about replacing the S3 test results with the test results we have in Jenkins? That wouldn't allow us to have multiple cwr-ci bundles writing to a single S3 location.

kwmonroe commented 7 years ago

Negative @seman. I was thinking of simply pushing the entirety of /srv/artifacts to s3 (or wherever a user wants). These are not necessarily the same artifacts as you'd have from Jenkins.

> This does not allow us to have multiple cwr-ci bundles writing to a single S3 location.

How about a subdir in the s3 location, named whatever the user wants? I think this would be nice:

$ juju run-action cwr/0 cwr-push-results location=s3://mybucket/$HOSTNAME creds=/path/to/s3/creds.txt

And then s3 would look like this:

$ aws s3 ls --recursive s3://mybucket/ | grep <cwr-host>
2016-05-06 13:15:13      38455 <cwr-host>/cwr_bundle_hadoop_processing/index.html
2016-05-06 13:15:13      19565 <cwr-host>/cwr_bundle_hadoop_processing/index.json
...

johnsca commented 7 years ago

That does seem like it would be cleaner and would ensure we pick up all files, including those created by tools other than CWR, such as Matrix logs or crashdump files.

seman commented 7 years ago

This implies multiple report pages: one per host and one per bundle/job. We'd need a way to generate a single report page with multiple cwr units writing to it.

kwmonroe commented 7 years ago

@seman I don't understand. Just sync /srv/artifacts to a remote location however you (the user) want. If you want it separated by host, add a $hostname to the s3 location. If you want a single report, just use the top-level bucket as the location.

johnsca commented 7 years ago

I have to side with @kwmonroe on this. It seems like it would be really simple to use the AWS CLI's sync, something like:

aws s3 sync /srv/artifacts s3://$bucket/$optional_prefix
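
Worth noting that aws s3 sync only uploads files that are new or changed, so running it as a post-build step after every job stays cheap even as /srv/artifacts grows.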

seman commented 7 years ago

Let's say we have multiple machines (multiple cwr-ci deployments) each generating a /srv/artifacts/index.html file. Machine #1 has the test result for wordpress in /srv/artifacts/index.html and the other machine has the result for wiki-simple. How is this going to sync to s3://$bucket/srv/artifacts/index.html? Won't one machine replace the test result of the other?

kwmonroe commented 7 years ago

@seman you've made the case for a hostname prefix, so you'd have:

$ aws s3 ls --recursive s3://mybucket/
2016-05-06 13:15:13      38455 <machine-1>/cwr_charm_commit_wordpress/index.html
...
2016-05-06 13:15:13      19565 <machine-2>/cwr_bundle_wiki_simple/index.html
...

Then the onus is on you (the person who wants to display these results) to include a top-level index that says:

$ aws s3 cp s3://mybucket/index.html -
<a href="machine-1/index.html">Machine 1 cwr results</a>
<a href="machine-2/index.html">Machine 2 cwr results</a>
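
For what it's worth, a rough sketch of generating that top-level index, assuming just the two machine prefixes above (aws s3 cp accepts - to stream stdin straight to the bucket):

$ for m in machine-1 machine-2; do
>     echo "<a href=\"$m/index.html\">$m cwr results</a>"
> done | aws s3 cp - s3://mybucket/index.html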

johnsca commented 7 years ago

I've thought about this some more and want to at least note down my thoughts.

As the one who originally added the S3 support directly into the cwr tool, I find myself reversing my position at this point. I do think it's important that the tool be fairly self-contained in accomplishing what it's intended to do, so that we can offer it to third parties with the assertion, "This is how we recommend running your tests, and it will generate nice reports for you." To that end, having the tool be able to write its data directly to S3 makes sense. However, that was before the components started generating additional files of their own that we want to capture but that are hard to make cwr aware of, and before we started shifting toward recommending the charm over using the CLI tool directly.

So, I think at this point, for the use-case within the charm, a post-build sync would ensure that we capture all of the artifacts that we care about in a slightly cleaner way while still allowing us to tell people, "Just deploy this for testing just like we do, and if you want your data on S3, just include this config." (Config, action, conjure-up step, whatever.)

We do still have an issue with scaling the Jenkins workers, which I think might be related to what @seman is asking about and isn't really addressed by @kwmonroe's reply. But we have that issue with or without S3 in the mix, and maybe looking at it from a more generic "shared storage" angle would be useful. We're also leaving the CLI-only cwr experience lacking, in that it won't capture all artifacts to S3, so we may need to address that at some point.

arosales commented 7 years ago

fwiw, +1 to @kwmonroe's and @johnsca's thoughts, and to finding an approach that keeps this open-ended toward whatever other data stores come up from the community. It feels like a config option in the charm may also be a good avenue to investigate, with a sensible default value and room to grow.

Reference https://github.com/juju-solutions/layer-cwr/pull/122
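
For illustration, the config route might look something like this on the operator's side (the key name here is hypothetical; the PR referenced above explores the actual direction):

$ juju config cwr push-location=s3://mybucket/$HOSTNAME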

ktsakalozos commented 7 years ago

+1 to the approach @kwmonroe and @johnsca suggest. I'd like to add another angle to it.

CIs often produce build artifacts that you want to preserve for later use, e.g. the v1.5.5 binary of my_app. If you wished, you could put those binaries under /srv/artifacts, and with the action we are discussing here they would be safely stored on S3 as well. Also, today it's S3, tomorrow it might be some other storage solution; this action is an extension point we can leverage later.
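
For example, a Jenkins build step could simply drop its outputs under /srv/artifacts, and the push action would then carry them along with the test reports:

$ cp build/my_app-v1.5.5 /srv/artifacts/my_app/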