Overview on strategies for FIs to setup an open source contribution workflow

MOVED TO GITHUB DISCUSSION - https://github.com/finos/open-developer-platform/discussions/163

Feature Request: We'd like to request collaboration on putting a minimum set of infrastructure and processes in place in order to contribute to existing open source projects.

Problem description: It can be hard to take the first step towards contribution when there are many moving parts (e.g. tooling, security controls, data leakage controls, compliance validation on IP, access control, corporate policies and procedures, etc.).

Benefits: This minimum Contribution MVP could be helpful for companies starting their Open Source Program Office. Companies can start contributing in a smaller, more controlled environment while their processes and tooling continue to evolve, adapting best practices to their own reality and learning from their mistakes.

Potential Solutions: Find below a few questions that could help define the MVP:

How to control who can contribute to specific projects on public repos? Should we create proxy rules specific to access whitelisted repo URLs? As of today, no developer has access to public repos;
What steps should we take before adding a repo to the whitelist?
Which controls should we implement in order to allow contribution directly from the developer's machine?
How to implement a simple yet effective compliance control? Should we use a git proxy whenever there's a pull request being sent to the public repo? Should we start with manual validation for IP and Data Leakage?
Should we create an internal repo mirroring the public, just for the purpose of contributing?
What are the best practices to create a corporate github account for contribution before we allow developers to associate their github account to our corporate account?

Dear Junji,

Thanks for following up on the conversations we had in the ODP call.

I'm going to answer your items, and I'd like to ask @bingenito (and other players from financial institutions) to chime in and add more colour.

I'd also propose a new title for the issue, something along the lines of "Overview on strategies for financial institutions to setup an open source contribution workflow". I fear that "MVP" would set the wrong expectations, as we're not yet in a position to think about a "product" that solves this issue. WDYT?

Before addressing the items below, I think it's appropriate to identify the two main strategies that we've seen financial institutions adopt in order to contribute to FINOS.

Proxy - Setting up a proxy that org developers must use in order to reach (HTTPS and SSH) endpoints on github.com; the proxy rejects or approves Git operations based on the checks and validations in place to comply with corporate regulations and bylaws.
Syncing - Setting up an internal Git server (behind the corporate firewall) and defining automation to sync branches between the corporate Git server and github.com/finos/*; once the branches are public, authors must submit a PR against master, and the project maintainers will review (and possibly approve) it.

The choice between the two strategies depends heavily on the organization's setup, org chart, identity management, security and compliance processes, and much more.

The majority of FINOS members use an internal Git server, given that most of their development activity is internal; however, we see many of them facing technical challenges when trying to adapt this strategy to contribute to open source.

Few FINOS members use the proxy strategy to contribute to some OSS projects (while keeping an internal Git server for internal development), which delivers a more ergonomic workflow for developers.

How to control who can contribute to specific projects on public repos? Should we create proxy rules specific to access whitelisted repo URLs? As of today, no developer has access to public repos.

The proxy approach makes it easy to implement repo whitelisting. By integrating the proxy with the firm's internal IdM, it's fairly straightforward to extend whitelisting to both Git repositories and authors.

The sync approach allows access to be controlled on the internal Git server (easily configurable with an internal IdM such as LDAP).
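To make the whitelisting idea a bit more concrete, here is a minimal Python sketch of the kind of check a proxy could run before forwarding a push. It is only an illustration: the names `ALLOWED_REPOS`, `ALLOWED_AUTHORS`, `normalize_repo` and `is_push_allowed` are hypothetical, and a real setup would presumably load both lists from the firm's IdM rather than hard-code them.

```python
from urllib.parse import urlparse

# Hypothetical whitelists; in practice these would come from the firm's IdM.
ALLOWED_REPOS = {"github.com/finos/open-developer-platform"}
ALLOWED_AUTHORS = {"alice@example.com"}


def normalize_repo(url: str) -> str:
    """Reduce an HTTPS or SSH Git URL to the form 'host/owner/name'."""
    if url.startswith("git@"):
        # scp-like SSH syntax: git@host:owner/name.git
        host, _, path = url[len("git@"):].partition(":")
    else:
        parsed = urlparse(url)
        host, path = parsed.hostname or "", parsed.path
    path = path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return f"{host}/{path}".lower()


def is_push_allowed(repo_url: str, author_email: str) -> bool:
    """Allow the push only if both the repo and the author are whitelisted."""
    return (
        normalize_repo(repo_url) in ALLOWED_REPOS
        and author_email.lower() in ALLOWED_AUTHORS
    )
```

Normalizing the URL first means the same whitelist entry covers both the HTTPS and the SSH endpoint of a repository.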

What steps should we take before adding a repo to the whitelist?

It would be good to have a general assessment of the project; below are the top five items that we deliver in our project (documentation) template.

Ease of use - easy, simple steps to get started
Heartbeat - # of contributors/contributions, last contribution, issues response time, roadmap deployments, community resources, ...
Security, quality and legal checks - metrics, reporting and automation around these aspects
How to contribute - How to submit my first issue or Pull Request
Meet the team - individuals and orgs behind the project, team meetings schedule

Other than that, unless there's a question of whether activity on this repository would hurt the firm, I don't see any reason to hold off.

Which controls should we implement in order to allow contribution directly from the developer's machine?

Implementing controls on the developer's machine is a risky approach, as the controls apply only when the developer works from that specific workstation; for this reason, controls tend to be implemented server-side.

How to implement a simple yet effective compliance control? Should we use a git proxy whenever there's a pull request being sent to the public repo? Should we start with manual validation for IP and Data Leakage?

Answering these questions requires an understanding of the two approaches introduced above, proxy and syncing. Proxy definitely seems to offer the more ergonomic workflow for developers.

The Git syncing approach requires a specific syncing configuration for each repository/team in order to connect the internal repo with the public one. Below is a development workflow that can be configured using this strategy:

Developers work against the internal Git repository, using branches to develop their features, which are eventually submitted for review and approval
The review and approval process runs automatic and manual checks; if they pass, the branch gets renamed (e.g. by adding a finos_ prefix)
A script runs every x hours and pushes all finos_ branches to the github.com/finos/<repository name>.git endpoint; it also pulls changes from the finos upstream master branch
When the finos_ feature branch is available, the author can submit a Pull Request against the master branch.
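The periodic script in the workflow above could look roughly like the following Python sketch. It only builds the Git commands for one sync run, so it can be inspected without touching a real repository. The remote name `finos` and the helper names `branches_to_publish` and `build_sync_commands` are assumptions made for illustration; `git fetch` and `git push` are the standard Git subcommands.

```python
from typing import List

PREFIX = "finos_"        # branch prefix taken from the workflow above
PUBLIC_REMOTE = "finos"  # hypothetical remote pointing at the public repo


def branches_to_publish(local_branches: List[str], prefix: str = PREFIX) -> List[str]:
    """Select the approved branches, i.e. those carrying the prefix."""
    return sorted(b for b in local_branches if b.startswith(prefix))


def build_sync_commands(
    local_branches: List[str], remote: str = PUBLIC_REMOTE
) -> List[List[str]]:
    """Build the Git commands for one sync run, without executing them."""
    # First bring in the latest changes from the upstream master branch.
    commands = [["git", "fetch", remote, "master"]]
    # Then publish every approved branch to the public remote.
    for branch in branches_to_publish(local_branches):
        commands.append(["git", "push", remote, branch])
    return commands
```

A scheduler (cron, for instance) could then execute each command with `subprocess.run(cmd, check=True)` inside the internal clone; branches without the prefix never leave the internal server.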

Starting with manual validations seems like a very good way to work on real scenarios and validate requirements.

Should we create an internal repo mirroring the public, just for the purpose of contributing?

I see mirroring as a way to implement the syncing approach; I cannot recall any member or other contributor who has used it. The mirroring feature is normally provided by enterprise Git servers deployed behind corporate firewalls, so it depends on the product whether and how it supports this feature, and how it addresses identity management (mapping internal Git users to GitHub usernames) and controls (IP, compliance, security, quality).

What are the best practices to create a corporate github account for contribution before we allow developers to associate their github account to our corporate account?

I haven't seen many best practices around creating corporate GitHub accounts (/CC @jonesbr), but I'd suggest requesting:

Full name
Affiliation with the GitHub org representing the employee's firm
A picture
A short bio, link to linkedin.com

Possible outcomes from this initiative:

[ ] Best practices
[ ] FAQ
[ ] flowchart decision tree
[ ] present to OSR

@junjikatto - did you have the chance to review my comment above? Would you be able to join the ODP meeting tomorrow and continue this conversation? Or do you think that the content provided above is sufficient for you to move forward?

Thank you!

Hi @maoo! Thank you so much for all the comments and for how well you structured the answers! I was reviewing the suggestions (I changed the name, by the way) and I also had a call with @jomarsilvazup this past week to talk about it. I wanted to expand a little bit on some items, especially the opportunities to mutualize some things:

(i) Proxy configuration - e.g. share all the events that need to be blocked, mention Citi's Git Proxy initiative, etc.;
(ii) Compliance code review - e.g. Citi's Git Proxy initiative could also be used for that, since it has an approval process; Google mentioned in the Open Source Readiness presentation that they are interested in collaborating on compliance verification code;
(iii) Minimum policy and procedure checklist - what should be the bare minimum that would satisfy some of the FIs' concerns;

We don't have to take the whole meeting to talk about these items, but I'd like to expand on some of them.

Best,

Thanks @junjikatto, interesting feedback; comments below.

(i) Proxy configuration - e.g. share all the events that need to be blocked, mention Citi's Git Proxy initiative, etc.;

Agreed; let's add the following to the Proxy strategy section:

an initial list of Git Operations (commit, push, delete, ...)
checks/validations (repo whitelisting, author whitelisting, security scanning, legal scanning, manual approval for compliance code review, etc)
a mention of the Git Proxy initiative
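As a rough illustration of how the list above could fit together, here is a small Python sketch in which each check inspects a Git operation and returns a rejection reason, or `None` if it passes. Everything in it is hypothetical (the `GitOperation` fields, the blocked operation kinds, the whitelists); it is not how any existing proxy is implemented, just one way to picture chaining checks with a manual approval step.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class GitOperation:
    kind: str               # e.g. "push" or "delete"
    repo: str               # e.g. "finos/open-developer-platform"
    author: str             # author identity as known to the proxy
    approved: bool = False  # set once a reviewer signs off (manual step)


# A check returns a rejection reason, or None when the operation passes.
Check = Callable[[GitOperation], Optional[str]]

# Hypothetical policy data, for illustration only.
BLOCKED_KINDS = {"delete"}
ALLOWED_REPOS = {"finos/open-developer-platform"}
ALLOWED_AUTHORS = {"alice"}


def check_kind(op: GitOperation) -> Optional[str]:
    return f"operation '{op.kind}' is blocked" if op.kind in BLOCKED_KINDS else None


def check_repo(op: GitOperation) -> Optional[str]:
    return None if op.repo in ALLOWED_REPOS else f"repo '{op.repo}' is not whitelisted"


def check_author(op: GitOperation) -> Optional[str]:
    return None if op.author in ALLOWED_AUTHORS else f"author '{op.author}' is not whitelisted"


def check_approval(op: GitOperation) -> Optional[str]:
    return None if op.approved else "waiting for manual compliance approval"


CHECKS: List[Check] = [check_kind, check_repo, check_author, check_approval]


def evaluate(op: GitOperation, checks: List[Check] = CHECKS) -> List[str]:
    """Run every check and collect the rejection reasons (empty means forward)."""
    reasons = []
    for check in checks:
        reason = check(op)
        if reason is not None:
            reasons.append(reason)
    return reasons
```

Collecting all reasons instead of stopping at the first failure gives the developer one complete report per rejected operation; security and legal scanning could be added as further entries in `CHECKS`.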

(ii) Compliance code review - e.g. Citi's Git Proxy initiative could also be used for that, since it has an approval process; Google mentioned in the Open Source Readiness presentation that they are interested in collaborating on compliance verification code;

Agreed, I've mentioned it as a check/validation above. Re. OSR, I'll discuss with @copiesofcopies and @bingenito how we can bring OSR into this conversation.

(iii) Minimum policy and procedure checklist - what should be the bare minimum that would satisfy some of the FIs concerns;

That's a good question for FIs; based on your experience so far, what would be the bare minimum that would satisfy ITAU's concerns?

We don't have to take the whole meeting to talk about these items, but I'd like to expand on some of them.

I wouldn't mind taking the whole meeting for this conversation, but let's see how the meeting evolves.

I'd also like to propose, based on some feedback from @bingenito in the last ODP call, introducing a third strategy called "Firewall Rules", which is similar to the proxy but implements checks and validations at the firewall level; to my knowledge, this solution doesn't support (manual) compliance code review workflows, as it cannot store and forward Git operations, but I may be wrong.

Finally, I'd like to propose using https://github.com/finos/open-developer-platform/discussions to host both the content and the comments of this discussion.

Closing this issue, as all content has been migrated to https://github.com/finos/open-developer-platform/discussions/163 - let's continue the conversation there.

Thanks a lot @junjikatto for starting this thread!
