stryker-mutator / stryker-handbook

A comprehensive guide to the stryker ecosystem
Apache License 2.0
71 stars 11 forks source link

Full report on the stryker-dashboard #27

Closed nicojs closed 3 years ago

nicojs commented 4 years ago

The stryker dashboard (https://dashboard.stryker-mutator.io) supports now hosts your full html report (although it is not fully advertised yet). We should implement this in Stryker, Stryker.NET and Stryker4s.

We can use this issue to discuss and align design choices across Stryker, Stryker.NET, and Stryker4s.

I'll keep this original post up to date with current design decisions.

Changelog

18-11

13-11

13-11

25-10

21-10

Request

You can send a report with an HTTP PUT request, like so:

curl -X PUT \
  $BASE_URL$/api/reports/$PROJECT$/$VERSION$ \
  -H 'Content-Type: application/json' \
  -H 'Host: dashboard.stryker-mutator.io' \
  -H 'X-Api-Key: $API_KEY$ \
  -d '$REPORT_JSON$'

Multiple results per VERSION are also supported using this url: $BASE_URL$/api/reports/$PROJECT$/$VERSION$?module=$MODULE_NAME$

A report can also contain only a "mutation score" (without the files and mutants). This way, you will have a mutation score badge, but no other information is stored. In that case, the $REPORT_JSON$ should be in the format { "mutationScore": 42 }

The variables here are:

Reponse

{
  "href": "https://dashboard.stryker-mutator.io/reports/github.com/stryker-mutator/stryker-net"
}

The href you can communicate to the end-user.

Implementation suggestion

I think we should try to align on the implementation a bit. That way enabling stryker dashboard reports in your builds is the same across all implementations of Stryker. Current suggestion:

Open questions:

nicojs commented 4 years ago

Current work on this in Stryker is tracked by https://github.com/stryker-mutator/stryker/pull/1783

rouke-broersma commented 4 years ago

The implementation suggestion seems very hard coupled on git(hub) and ci/cd using travis. Things like reading environment variables don't really align with how I would build an azure devops pipeline. I'm not a fan of such implicit configuration and would prefer to implement the options explicitly in stryker-net.

nicojs commented 4 years ago

How about adding the STRYKER_DASHBOARD_REPO_SLUG and STRYKER_DASHBOARD_VERSION proposal? Maybe the repo slug can also be in configuration, since that isn't going to change. If we can align on that part only that would be a good start I think.

I agree that it is some work to support all the CI providers, but it is also nice for users if it 'just works' out of the box. For example, setting the STRYKER_DASHBOARD_VERSION to the current branch name can be a hassle (figuring out whether to use the tag or branch env variable for example).

EDIT: you can also have a mechanism in place for pluggable CI providers and wait for PR's to implement them. That way you know to only support the one's you want. That's basically what we're doing with https://github.com/stryker-mutator/stryker/tree/master/packages/core/src/reporters/ci

rouke-broersma commented 4 years ago

What I mean is that the proposal heavily expects that the dashboard reporter is only used from a buildserver, when I expect that currently most of our users do not use stryker-net on a buildserver.

Another thing is that REPO_SLUG is very heavily tied to the way github repo's work and I for example have never used this term. If I came across this in stryker config I would have no idea what to actually do with it. The name seems to be used because currently we use github as the authentication provider so it makes sense. But if we support other types of customers in the future (eg, private projects from enterprise customers in azure devops) then this coupling does not make sense anymore. I would propose something more generic like project name and company name. I also don't really understand why the provider is neccesary. Something like this makes more sense to me:

$BASE_URL$/api/reports/$COMPANY_NAME$/$PROJECT_NAME$/$VERSION$?module=$MODULE_NAME$

Eq: $BASE_URL$/api/reports/stryker-mutator/stryker-net/1.0.0.25?module=Stryker.Core

By binding on the github project structure and by binding on repositories so heavily the user loses a lot of flexibility in how they can organize their reporting.

nicojs commented 4 years ago

Another thing is that REPO_SLUG is very heavily tied to the way github repo's work

I a 'slug' is actually a thing that is used outside as well:

It wasn't my idea to tight it to anything. I just want a string to identify a repo that is url-safe. That's why I choose 'slug'.

It is also more accurate than the "repository name", since the name can have non-url characters in it, but the slug cannot. For example: "Justice League" vs "justice-league".

I've thought about it a lot. I just want one string that is unique for the repository. "Slug" was the name I came up with. It is used heavily in the dashboard. I don't want to name it differently everywhere. So: I'm open to suggestions, but not too open ;)

I would propose something more generic like project name and company name

I think that is less generic. For example, gitlab already supports structuring your repo's, for example: my-team/a/b/c/d/my-api. This supported when you have one field, but hard to support when you have multiple levels..

But if we support other types of customers in the future (eg, private projects from enterprise customers in azure devops) then this coupling does not make sense anymore. I would propose something more generic like project name and company name.

Why doesn't it make sense? The provider (which the first part of the repository slug) is designed for that. It should be pretty straight forward to add gitlab.com, azure.com, etc.

rouke-broersma commented 4 years ago

The point is that as a customer I don't necessarily want to couple my repo 'slug' and my project name together. As a user I do not care that you need url safe names, take care of it for me. I don't want to have my repository name in my url. I'm working on project x for company y. That's what the report is for, that's how I want my report to be available.

I'm not talking about what it's called in the dashboard source code or in the API, that's fine no user cares about that it's an implementation detail. But from the user perspective it does not make sense to think in terms of what the place my code is hosted in calls parts of the url. A user might not know what a slug is and then won't understand what to put as the STRYKER_DASHBOARD_REPO_SLUG.

Why doesn't it make sense? The provider (which the first part of the repository slug) is designed for that. It should be pretty straight forward to add gitlab.com, azure.com, etc.

Let's say theoretically we have a tfs.rhea.infosupport.net or gitlab.rhea.infosupport.net, how do we support those customers? And, why do we care about the provider in the first place? As a customer, the provider has no relevance to my project, it is simply the place where I host my code.

I understand that it is way simpler to be able to offload all organizational structure to github and this works perfectly fine for open source projects. But most closed-source projects will probably (imo) want to organize under some kind of organization structure similar to their company structure and not their scm tool structure.

hugo-vrijswijk commented 4 years ago

As a user I do not care that you need url safe names, take care of it for me

This is what Stryker and Stryker4s already do for the dashboard report. The repo slug is gathered without user input (as seen in Nico's linked code for Stryker above). That's how it's done for all variables except the api key. The suggested environment variable is purely if the user wants to override it.

I'm working on project x for company y. That's what the report is for, that's how I want my report to be available

I am not sure what your suggestion here is. In most cases the project you are working on will correspond to the repo name. There needs to be some sort of identifier, and the way to get it without user input is by using any identifiers of the project we can automatically get.

Let's say theoretically we have a tfs.rhea.infosupport.net or gitlab.rhea.infosupport.net, how do we support those customers? And, why do we care about the provider in the first place?

The provider is simply there so we can have a unique identifier for a project. There's a stryker-mutator/stryker on github, but there might be a different one on gitlab for example. This way they each would get a unique link.

I feel like in general the dashboard reporter is indeed set up to be used from a build server, but if the user wants they can always add the environment variables themselves before running stryker, and it'll 'think' it is running on a build server (though there could be documentation for this).

rouke-broersma commented 4 years ago

This is what Stryker and Stryker4s already do for the dashboard report. The repo slug is gathered without user input (as seen in Nico's linked code for Stryker above).

I never said without user input, I said if the name needs to be url safe fix that for me, but let me choose my own project name unrelated to what my repo is called

I am not sure what your suggestion here is. In most cases the project you are working on will correspond to the repo name

My suggestion is that the project name is not forced to be the repo name (and thus the config variable not be called REPO_SLUG)

The provider is simply there so we can have a unique identifier for a project. There's a stryker-mutator/stryker on github, but there might be a different one on gitlab for example. This way they each would get a unique link.

But why is it called a provider and why is it linked to the tool that contains my source code? I don't organize my work around github or gitlab or azure devops, I organize my work around the people I work with, the team I work in or the company I work for. That is usually also unique.

I feel like in general the dashboard reporter is indeed set up to be used from a build server, but if the user wants they can always add the environment variables themselves before running stryker, and it'll 'think' it is running on a build server (though there could be documentation for this).

Yes but why though? The api design does not need to make this assumption for the CI provider to be able to pick up the correct values. If the user doesn't choose anything and stryker can figure out that it's running on circleci and can request all the information from circleci sure. But as soon as I don't want that, it no longer makes sense to talk about providers and repo_slugs and the like.

richardwerkman commented 4 years ago

To mingle myself into this as well, I think @Mobrockers 's point is that a lot of these variables seem unnecessary. When we look at the dashboard in an abstract way, it can know from the API key what project the json belongs to. When producing the API key a project name and all other info that is needed for a unique identifier is already present. Since the API key itself is unique as well we would expect that this data is still available.

At stryker-net we would like to push our json using an API key and possibly a version number (build number). This also opens the path to private repos as we don't have to know anything about the project itself. Users can just generate an API key and choose a slug themselves.

Why do we need to provide the other variables as well from the client side? Does it have any benefits over getting them from the API key?

nicojs commented 4 years ago

TL;DR;

I suggest the following changes:

If @richardwerkman, @hugo-vrijswijk and @Mobrockers agree I'll update the original issue with this new design.

Long version

Ah ok. I think I understand better where you're coming from. However, I don't like to only use an API key as an identifier. It's a security feature. It's also the way the dashboard already worked, I didn't want breaking changes. We're also thinking of allowing report uploads without an API key, so we definitely need an identifier that is not your API key.

As for the name of the identifier, I agree that slug sounds strange. Also the fact that it needs to be configured via an env variable, while other settings do not. Maybe project is better? The value would have to include "github.com", so it would be "github.com/stryker-mutator/stryker" for example. I would suggest to configure it in stryker settings (or via build server env variables when it's missing). I was thinking of adding a "dashboad": { "project": "github.com/stryker-mutator/stryker" } to our settings. That way its easier to run locally, as well as more in line with what users would expect. For now, it would correspond to the slug, but when we add other providers that do use different names, we could also allow names to be not-url safe (and stryker would handle that part). Does this respond to your complaints @Mobrockers?

As for the version, I think it does not make sense to configure that via settings, as different branches would need different versions, however it would make sence to be configured via a console argument (or buildserver env variable if the console argument is missing). Does this sound better?

Is for tfs.rhea.infosupport.net or gitlab.rhea.infosupport.net, I was thinking of using the same naming strategy. For example: "gitlab.rhea.infosupport.net/kenniscentrum/kc-cli".

But most closed-source projects will probably (imo) want to organize under some kind of organization structure similar to their company structure and not their scm tool structure.

In organizations I worked at, the SCM structure always corresponded to the organization structure. For example "gitlab.internal-domain-name.wds/department/group/project/folder1/repo1.git"

rouke-broersma commented 4 years ago

Yes, that addresses my concerns. I also agree that it doesn't make sense for the version to be set in config, I more meant that it would be nice to give the user the freedom to decide their versioning scheme instead of always using branch /commit.

nicojs commented 4 years ago

I've updated the design. Let me know if anything else needs fixing.

richardwerkman commented 4 years ago

Seems fine by me!

rouke-broersma commented 4 years ago

Lgtm! Awesome :)

hugo-vrijswijk commented 4 years ago

I lllllike it!

nicojs commented 4 years ago

Ok. I'm implementing this in Stryker 👍

Thanks for the help. I think the design improved a lot from my 'straw man's syntax proposal'

nicojs commented 4 years ago

Added "'mutation score'-only reports.", see design in the first post

nicojs commented 4 years ago

Guys, I'm thinking of making what you report configurable. Right now, the dashboard supports either a mutation score or a full report (see my update from yesterday).

This is there for compatibility reasons, but I also think it is a nice feature to support. For example users that have really big projects (which might give problems when we're publishing the report), or users that are anxious about sending their source code might choose to not send a full report.

I was first thinking of adding a boolean flag: dashboard.fullReport. Which could default to true. However, we might in the future add more report types (for example report mutants without source code?). So now I'm thinking of an enum: dashboard.reportType. It can have two values mutationScoreOnly or full.

What do you think?

nicojs commented 4 years ago

I've documented the user guide here: https://github.com/stryker-mutator/stryker-handbook/blob/master/dashboard.md

It is still rough, needs to be improved.

The idea is that you can refer to that page from your readme, instead of explaining the settings in each implementation.

rouke-broersma commented 4 years ago

Guys, I'm thinking of making what you report configurable. Right now, the dashboard supports either a mutation score or a full report (see my update from yesterday).

This is there for compatibility reasons, but I also think it is a nice feature to support. For example users that have really big projects (which might give problems when we're publishing the report), or users that are anxious about sending their source code might choose to not send a full report.

I was first thinking of adding a boolean flag: dashboard.fullReport. Which could default to true. However, we might in the future add more report types (for example report mutants without source code?). So now I'm thinking of an enum: dashboard.reportType. It can have two values mutationScoreOnly or full.

What do you think?

This is a setting on the implementation side not on the dashboard side right?

nicojs commented 4 years ago

This is a setting on the implementation side not on the dashboard side right?

Yes, It would add a dashboard.reportType to the "implementation suggestion" section.

nicojs commented 4 years ago

I just had a nice talk with @Mobrockers. Right now, a mutation testing report has 2 options:

  1. A mutation-score-only report
    { "mutationScore": 42 }
  2. A "full" report
    { "result": "$REPORT_JSON$"

Would it be better to integrate the mutation-score-only report into mutation testing elements itself? So { "schemaVersion": 1.5, "mutationScore": 42} would be valid as well as { "schemaVersion": 1.5, "files": {}, "thresholds": { "high": 80, "low": 60 } }.

That way the dashboard can handle them both uniformly. It would also clean up the API, not enforcing users to add a wrapper object. It would also mean that the curl request is simpler, just post a JSON file with the result, instead of forcing a wrapper object.

This would mean a new minor version of the schema, as well as some rework in the dashboard. Both are small changes IMO. It would be breaking for the dashboard API, so I would like to do that before we officially released it.

@simondel @hugo-vrijswijk @richardwerkman thoughts?

richardwerkman commented 4 years ago

The json looks good to me! What if someone passes both "mutationscore" and "files"? What would the elements take as the source?

nicojs commented 4 years ago

Files will always have precedence. It would actually also help out with the coloring of the mutation score badge, since the thresholds could still be filled even if only a mutation score is known.

If others also agree I'll update the design.

EDIT: I've updated the design

hugo-vrijswijk commented 4 years ago

If mutation score only is supported in the schema, it would have to be changed everywhere. I'm not sure it really makes sense to have a html report with only a mutation score, for example. Same with the metrics. That means all of a sudden all the metrics are either nullable, or aren't fully compliant with the full json schema report

nicojs commented 4 years ago

Yeah, I see what you mean. It is essentially invalid to call calculateMetrics on a mutation score only report.

I think it's better to implement it as an exception in the dashboard. However, I am happy to remove the top-level result property (as proposed by @Mobrockers ).

I'll update the design again, happy to hear additional remarks.

nicojs commented 4 years ago

I've implemented this in the dashboard https://github.com/stryker-mutator/stryker-dashboard/pull/131

Will merge later today / this evening

EDIT: You can view / test it out on acceptance: https://stryker-dashboard-acceptance.azurewebsites.net/

nicojs commented 4 years ago

I changed one bit of the design. Apparently this URL: reports/github.com/stryker-mutator/stryker/chore%2Frun-stryker-in-ci

Is being routed (by Azure) to: reports/github.com/stryker-mutator/stryker/chore/run-stryker-in-ci

Allowing chore/run-stryker-in-ci is more user friendly as well, so I think allowing this is the way to go.

Instead of assuming that the last part of the slug is the version, we now assume the first 3 parts of a slug is the project name, rest is the version. This might have to be reworked a bit once we allow non-github projects, but I think its worth the work. This is implemented in https://github.com/stryker-mutator/stryker-dashboard/pull/132. This issue is updated.

hugo-vrijswijk commented 4 years ago

I believe this issue can now be closed?