EclipseFdn / open-vsx.org

Source of open-vsx.org
https://open-vsx.org/
Eclipse Public License 2.0
135 stars 40 forks source link

New temporary GH application activation - ECA beta #1808

Open autumnfound opened 1 year ago

autumnfound commented 1 year ago

Hello open-vsx maintainers! I work with the Eclipse Foundation, and as a member of the eclipsefdn organization I wanted to let you know that a roll-out will be occurring in this repository for a new GH app. This application is a beta version of the ECA check application that is installed in all of our organizations and should work alongside the current check.

We will be enabling this later in the week, and I'll let you know again in this issue once it's been enabled! If you see any issues with the secondary check, please let us know! It should be the same as the main check, but we're doing some final testing before the general rollout. If there are any discrepancies, you can just look to the main check and go by that! It's still considered the source of truth in this case.

cc @kineticsquid @amvanbaren

kineticsquid commented 1 year ago

@autumnfound This is my ignorance, what does this app do and what are the use cases that exercise the app?

@chrisguindon

autumnfound commented 1 year ago

@autumnfound This is my ignorance, what does this app do and what are the use cases that exercise the app?

@chrisguindon

The app is used in project repositories to help enforce the signature of the ECA. In pull requests, any EF-managed org will have the app installed. Its function is to check the ECA signature status for people who commit/contribute code and report on it. An example from one of the current pull requests, the app status can be seen in the checks:

image

If there are users who don't have an ECA signed or one can't be found, then the status will fail and indicate there are problems. This repo is only tangentially related to the Open VSX project and isn't required to follow the guidelines set out in the committers handbook and can safely ignore the check in this case.

kineticsquid commented 1 year ago

@autumnfound Thanks, I understand the purpose. Isn't there already a check in place? How does this relate to that?

Re:

This repo is only tangentially related to the Open VSX project and isn't required to follow the guidelines set out in the committers handbook and can safely ignore the check in this case.

Currently, for this repo, is an ECA not required?

autumnfound commented 1 year ago

@autumnfound Thanks, I understand the purpose. Isn't there already a check in place? How does this relate to that?

There is! That check currently runs in our PHP/Drupal stack and is being migrated to our Java API which takes care of the central ECA logic. The new app hits the new migrated endpoint, while the current one still hits the soon-to-be legacy version.

Currently, for this repo, is an ECA not required?

From what I understand, this repo doesn't require an ECA as it isn't directly related to the open-source project code. It's still present as the check exists for all repositories under our managed organizations, as we don't filter which repos get the check at this time for either the legacy or new ECA status apps.

amvanbaren commented 1 year ago

@autumnfound Ok. so for a limited period of time there will be two ECA checks, the current one and the new beta version?

autumnfound commented 1 year ago

@autumnfound Ok. so for a limited period of time there will be two ECA checks, the current one and the new beta version?

That's the plan yes! We wanted to do some final testing internally before we asked community organizations to participate, and this organization was the best candidate for us. Once we roll out the update to the main application, we'll be removing the beta app to set everything back to a clean slate.

autumnfound commented 1 year ago

The application is now added to the organization! You'll see a new status check for eclipsefdn/eca-beta in PRs that is from the new app. If you notice any problems, please drop a message in this thread! We'll be keeping this active for a little while but will be removing it in the next month or two. Just to keep any conversations in one place, I'll keep this issue open until we remove the beta app

amvanbaren commented 1 year ago

@autumnfound The eca-beta check fails when I create a PR from main to production. https://github.com/EclipseFdn/open-vsx.org/pull/1869

autumnfound commented 1 year ago

@autumnfound The eca-beta check fails when I create a PR from main to production. #1869

Technically, the failure is correct, in that you don't have an ECA on file. It looks like the reason the current version passes is that it's less strict around the ECA signature for non-project repositories. @chrisguindon should we take this opportunity (the migration) to make the checks more strict, or should I relax them to the previous state?

amvanbaren commented 1 year ago

@autumnfound It takes the email of the PR creator and uses it for each commit in the PR, as if the PR creator authored each commit.

autumnfound commented 1 year ago

@autumnfound It takes the email of the PR creator and uses it for each commit in the PR, as if the PR creator authored each commit.

The code in question pulls the commits directly from Github and transforms them into the format my service uses, and it doesn't have context of who made the PR. I looked into your PR and it's your account that is registered as the committer, and I don't see any other user

chrisguindon commented 1 year ago

@chrisguindon should we take this opportunity (the migration) to make the checks more strict, or should I relax them to the previous state?

This is the data from Github for the pull request in question: https://api.github.com/repos/EclipseFdn/open-vsx.org/pulls/1869/commits

This is an interesting use case. It's correct that the @hotmail.com account that Aart is using does not have a valid ECA on file.

I took a closer look at why the old ECA is passing and I believe it's because we check the author data from the API and we do a reverse lookup using the GitHub ID. By doing so, the service is able to figure out that it's https://api.eclipse.org/github/profile/amvanbaren which has a valid ECA on file.

In a way, both are correct. The question is now, which one do we prefer:

  1. The old ECA searches for users using the GitHub ID, not the email when it's available to reduce chances of an ECA falling but would allow a commit from an email that we might not have on file.

  2. The new ECA which validates only using the email address used in the commit log. This is more strict but in a way more accurate.

I would then go with option #1 since we also support private emails from Github when validation ECA.

@waynebeaton WDYT on how strict we should be with the ECA?

autumnfound commented 1 year ago

I would then go with option https://github.com/EclipseFdn/open-vsx.org/issues/1 since we also support private emails from Github when validation ECA.

As a clarifying point, if someone uses a noreply email it would detect the GH user properly and use that for a lookup. This code is just way more strict and might produce false negatives in its current state.

waynebeaton commented 1 year ago

WDYT on how strict we should be with the ECA?

What we care about is whether or not the individual is covered by the ECA. If you can do this with confidence independent of the actual email address used, then we're good-to-go.

Do note that this has some downstream impact. The current (and future) metrics gathering system already takes GitHub noreply email addresses into consideration and correctly attributes commits using these email addresses to the corresponding contributor when that contributor in our database. But other email addresses that are not somehow connected to a contributor will not be mapped back to the contributor.

Do we have some way of tracking how frequently the ECA checker approves a commit when the email address of the author is not on file? My suspicion is that this is a relatively small problem/opportunity.

autumnfound commented 1 year ago

What we care about is whether or not the individual is covered by the ECA. If you can do this with confidence independent of the actual email address used, then we're good-to-go.

Chris and I talked about this, and I think the solution here is going to be a slight upgrade so that it checks the GH user if available first, and then do the email lookup logic. Best of both worlds to gain some fallbacks and make it more robust!

Do we have some way of tracking how frequently the ECA checker approves a commit when the email address of the author is not on file? My suspicion is that this is a relatively small problem/opportunity.

We track failures, and we track the emails associated with requests, so we'd be able to piece this together in the future. There would be associated work to make the data useful by looking up users and filtering, but I think there is some ability to dig into users who fail to authenticate. We tend to be more aggressive in the ECA so for anything that is associated with projects we don't pass the commits for users who can't be determined