canada-ca / OS-Advisory_Conseil-SO

Open Source Advisory Board - Conseil consultatif du logiciel libre
Apache License 2.0

Comments/Suggestions on gc source repo page #101

Open obrien-j opened 5 years ago

obrien-j commented 5 years ago

Source document: GC Source Code Repo analysis

1: Suggest reframing the Options Analysis section (the primary question of internal vs. external, and the various options) to be in line with the technology architecture in Annex C of ref. 1, specifically Section 2.3.11[1,2,3], Use Cloud First, while also taking into consideration the potential need for multiple instances due to classification or network restrictions.

Order by (as an example):

Given that C.2.3.8.3 of ref. 1, Annex C, also directs that all code written by government must be released in an open format, I would suggest that the majority of code will fall into the 'unclassified/not protected' bucket and would fit quite easily into a public cloud, SaaS-based model (of which there are several good options).

Also note that an exemption path is given in C.2.3.9.5 of ref. 1, Annex C: "Share code publicly when appropriate, and when not, share within the Government of Canada". With 'when appropriate' as yet undefined, there will obviously be a need for a 'protected' option in the options analysis above. I would strongly recommend that departments or teams that feel they have source code, or potentially network-based restrictions on development and access to data, that fit into this bucket make their concerns known. Ensuring that the developer experience is seamless for people working in multiple networks/systems will go a long way towards minimizing friction.

refs: 1: Directive on Management of IT

gcharest commented 5 years ago

Good idea, will use the ref arch standards as well as the business requirements checklist identified in the same folder of the repo.

Thanks @obrien-j !

CalvinRodo commented 5 years ago

I figure I'll add some suggestions here rather than open a new issue.

I'd like to add to the Functional and Non-Functional Requirements the requirement that the solution provide an API for introspection as well as interaction.

For instance, GitHub, GitLab, and Bitbucket all provide this functionality to some extent.

This will allow us to build tools that can ensure compliance such as what CDS is doing with their Symmorfosi project.

It will also allow us more novel and potentially easier ways to administer the sites, for instance using tools like Terraform to administer and configure the solution as code.

Examples of SaaS/Cloud solutions with APIs:

  • https://developer.github.com/v3/
  • https://docs.gitlab.com/ee/api/
  • https://developer.atlassian.com/bitbucket/api/2/reference/
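To make the compliance idea a bit more concrete, here is a minimal sketch in Python. It assumes GitHub's REST endpoint for listing an organization's repositories (`GET /orgs/{org}/repos`) and that each returned repository object carries `name`, `private`, and `license` fields. The function names and the "every public repo needs a licence" rule are invented for this example and are not taken from the Symmorfosi project.

```python
import json
import urllib.request


def fetch_org_repos(org, api_root="https://api.github.com"):
    """Fetch one page of an organization's repositories.

    Unauthenticated and unpaginated: enough for a sketch, not for production.
    """
    url = f"{api_root}/orgs/{org}/repos?per_page=100"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def repos_missing_license(repos):
    """Return the names of public repositories with no detected license."""
    return [
        repo["name"]
        for repo in repos
        if not repo.get("private", False) and repo.get("license") is None
    ]


# Hypothetical sample data shaped like the API response, so the check runs offline.
sample = [
    {"name": "alpha", "private": False, "license": {"spdx_id": "Apache-2.0"}},
    {"name": "beta", "private": False, "license": None},
    {"name": "gamma", "private": True, "license": None},
]
print(repos_missing_license(sample))
```

Against a live platform, the same check would be fed with `fetch_org_repos("some-org")` instead of the sample list. Equivalent endpoints on GitLab or Bitbucket would need their own field names.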

britthurley commented 5 years ago

Just wanted to chime in to agree with @CalvinRodo on the importance of exposed APIs. The current ecosystem for developers across departments is a collection of on-prem, self-managed tools for different needs, and a barrier to adoption will be integration with the existing tools that people currently rely on.

As I am just jumping in, I am wondering if we're scoping this to enabling the use of CD functionality?
There's specific mention of CI, but making CD a focus would be a big enabler for a lot of existing dev teams. That means things like ensuring we can open flows to end-state SSC data centers, whether because the solution is hosted there or because SSC has gotten on board to support that. If it's out of scope, ignore me!

gcharest commented 5 years ago

OK, will set up a quick call around this topic either today or tomorrow.

MikeNwin commented 5 years ago

Suggest reframing the Options Analysis section ... Order by (as an example):

  • Public Cloud
    • SaaS
    • PaaS
    • IaaS
  • Hybrid Cloud
  • Private Cloud
  • non-cloud onprem

A suggested format could be a table--e.g.:

Option SaaS PaaS IaaS Hybrid Cloud Private Cloud On-premises
A ✔️
B ✔️ ✔️
C ✔️ ✔️

Which would enable adding more criteria columns--e.g.:

Option SaaS PaaS IaaS Hybrid Cloud Private Cloud On-premises REST API GraphQL API Webhooks
A ✔️ ✔️ ✔️ ✔️
B ✔️ ✔️ ✔️ ✔️
C ✔️ ✔️

handshape commented 5 years ago

The format proposed by @MikeNwin makes sense to me as a way of classifying at a glance. A way to flag which criteria are mandatory would be nice, too.
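One way the table plus a mandatory flag could be represented, as a rough Python sketch. The option names, the criteria each option supports, and the choice of "REST API" as the mandatory criterion are all hypothetical placeholders, not values taken from the table above.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    name: str
    supports: set = field(default_factory=set)


def meets_mandatory(option, mandatory):
    """True if the option supports every mandatory criterion."""
    return mandatory <= option.supports


def render_table(options, criteria, mandatory):
    """Render a plain-text matrix; mandatory criteria are marked with '*'."""
    header = ["Option"] + [c + ("*" if c in mandatory else "") for c in criteria]
    rows = [header]
    for option in options:
        rows.append(
            [option.name] + ["x" if c in option.supports else "" for c in criteria]
        )
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    return "\n".join(
        " | ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    )


# Hypothetical data for illustration only.
criteria = ["SaaS", "PaaS", "IaaS", "REST API", "Webhooks"]
mandatory = {"REST API"}
options = [
    Option("A", {"SaaS", "REST API", "Webhooks"}),
    Option("B", {"PaaS", "IaaS"}),
    Option("C", {"IaaS", "REST API"}),
]

shortlist = [o.name for o in options if meets_mandatory(o, mandatory)]
print(render_table(options, criteria, mandatory))
print("Meets all mandatory criteria:", shortlist)
```

Keeping the criteria as data rather than as fixed columns means new columns (or a change in what is mandatory) do not require reworking the table by hand.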

The importance of a clean exit path can't be overstated in this decision, IMHO -- the source will almost certainly count as RBVs from the standpoint of the Directive on IM. Another spot that might bear some policy wrangling will be disposition authority; commits are forever, but disposition authorities are not. It would probably make sense to first look at what's involved in getting a disposition authority with a very long lifespan before chasing a technical solution. Looking forward to chatting this one over.

gcharest commented 5 years ago

Hi folks!

I think the breakdown in #102 of the scope can also be used with the proposed format for an initial discussion.

Before even getting to the technical options/reference architectures, we need to understand what the specific requirements are for each scoped use case.

Zulban commented 5 years ago

Glad to join the chat! My thoughts:

As I am just jumping in, I am wondering if we're scoping this to enabling the use of CD functionality?

1) CI is a first step to CD. CD should be the final goal, and enabled by the platform. I strongly recommend we take inspiration from this excellent paper on software development at Google, which was recently number one on Hacker News. Our platform solutions should enable at least half of that workflow (section 2).

2) Search engines like Google should be able to index our repos (and READMEs). Currently, much of the science GitLab is not indexable. Which leads us to:

3) Open repos by default. It is currently a burden getting my science GitLab repos to be truly public. This means that 98% of users will not bother (or even realize it). If a user is managing confidential information, it should be their responsibility to flag it as such. We cannot expect 99% of users to flag their repos as public - even if it's as easy as one button. Default behaviour rules all.

4) If a developer accidentally commits and pushes confidential information to a public repo, they need some way to purge that commit. So the platform needs to allow destructive history rewrites, i.e. force pushes. CI can help with this: for example, we can recommend standard CI templates which scan for accidental commits of telephone numbers. You then get an email: "Hey, do you realize there is a phone number in your commit? Oops! Click this button to purge."
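A minimal sketch of what the scanning step of such a CI template might look like, in Python. The regular expression only targets North American style numbers and is an assumption made for this example. It will produce false positives (any bare 10-digit number matches), and how the check is wired into a CI job or a notification email is deliberately left out.

```python
import re

# Assumed pattern: optional +1 prefix, then 3-3-4 digits with optional
# space, dot, or dash separators. Not a general phone-number validator.
PHONE_PATTERN = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
)


def find_phone_numbers(text):
    """Return (line_number, matched_text) pairs for suspected phone numbers."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in PHONE_PATTERN.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


# Hypothetical content of a commit, for illustration only.
sample_commit = "contact = 'Call 613-555-0199'\nbuild = 20190402\nport = 8080"

for line_number, number in find_phone_numbers(sample_commit):
    print(f"line {line_number}: possible phone number {number}")
```

A CI job could run this over the added lines of each pushed commit and fail, or just warn, when the list of hits is non-empty.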

5) One of our highest priorities should be platform-agnostic solutions. For every feature, we should identify whether it works on Windows, Mac, and Linux. If it doesn't work identically on all three, we shouldn't even consider it a feature. This is complicated, as sometimes things look the same but are implemented differently on the backend for different platforms. For example: "there is a Skype client for Linux... for now". If a major core library needs to be built from scratch to support all platforms, then that feature is fragile and will not support future platforms. This is a very easy-to-understand requirement which has a ton of good implications for software longevity and breaking out of vendor lock-in. Works on all platforms or it is not a feature.

6) The platform must very clearly distinguish at least three classes of user:

This role needs to be made clear for all comments, commits, threads, everything. Otherwise we're opening ourselves up to a ton of stealth influence from full time salespeople and communicators. I checked manually, for example, each user in this thread before writing here because I wasn't sure who was from where. I simply won't trust using a platform that does any less.

7) Permanent links to repos - domain controlled by GoC. When we provide a link to a repo it should remain permanent for as long as the GoC wants. No one but the GoC should have control over changing our URLs, or breaking them, or adding some "/repo/" prefix to the URL when they make a new update to their platform.

schindld commented 5 years ago

For the Functional/Non-Functional Requirements section, the solution(s) would need connectivity to (or built-in) container registries to be useful for many of us. And SSH access from off the GoC network would be nice for protected/non-classified repos.

CalvinRodo commented 5 years ago

So I'm reading some of the comments. @gcharest, is the goal to have a single solution that provides both inner-sourcing and open source hosting?

Is there even anything that would currently allow that without it becoming a nightmare to administer? My view would be two separate solutions: something similar to GCCode for inner sourcing and a separate solution for open source projects. For the latter, I was under the impression we were just going to use existing platforms, places that already have large open source communities. Is that no longer correct, and are we looking at a centralized location for all GoC open source projects?

As for @Zulban's suggestion of a way to purge commits, I do agree with that, and in fact that's something we can do with git, so :+1: for that. But if, as @handshape stated, "the source will almost certainly count as RBVs from the standpoint of the Directive on IM", then allowing users to purge commits from the repository could cause some concerns if there is no backed-up copy of the source code that records those purged commits. I'm definitely not an expert on Records of Business Value, though, so maybe it's not considered IRBV and so not an issue. We may need to get some clarification on that point from someone more knowledgeable in that domain.

gcharest commented 5 years ago

@CalvinRodo the scope is part of the discussion for the reference architecture, in my humble opinion. The initial ask was for a recommendation for a GC-wide VCS. It was followed by a question about whether this should be for both internal and external repos.

I think part of the exercise needs to include this exact discussion. I've thus broken down the high-level requirements into a more distinct series of requirements to support the varying legal and policy constraints.

Really happy with the feedback by the way. Lots of very interesting points.

MikeNwin commented 5 years ago

So I'm reading some of the comments @gcharest is the goal to have a single solution that provides both inner-sourcing and open source hosting?

Is there even anything that would currently allow that without it becoming a nightmare to administer

GitHub Connect helps to simplify this by providing a unified platform to enable inner-source on GitHub Enterprise and open source on www.GitHub.com.

Zulban commented 5 years ago

A unified solution is an absolute requirement in my opinion. Choosing whether a project is private/public should be a choice on a platform, not a choice of a platform. I could go on a spiel explaining why, but I think I'll just say that if we split the platform, the default will drift towards never using the public one (or using it only as a repo file dump and not for active work).

Zulban commented 5 years ago

Something else worth mentioning: at the Canadian Meteorological Centre we have unfortunately deployed our own GitLab instance (in addition to existing ones like SSC). The justification for this is so that it has 24/7 support, as many of our operations require 24/7 support and monitoring. We therefore have a duplicate platform with fewer resources supporting it (evidenced by its version being out of date).

I am still investigating, but I can't imagine how our platform with fewer man-hours supporting it could possibly be more stable and "24/7" than the SSC one. Both have had outages.

So @gcharest it may be a good requirement for this platform to say it has 24/7 support. That would help me eventually convince our groups to migrate to it instead of fracturing. Or define some number of hours/CS folks dedicated to supporting it.

gcharest commented 5 years ago

@Zulban Yes, I've quickly added such a requirement re: 24/7 support.

Regarding the choice of platform, I need to take more time (I don't have enough...) to articulate the document, but there's most likely a Venn diagram of needs here, depending on the level of classification.

I'm trying to present this as a clear(er) picture of the needs so that we can make an informed recommendation in terms of options analysis. I do believe that a single platform is most likely the most logical step forward. However, there's a (lack of) governance challenge: we are all different organizations, and enforcing a single platform is not done lightly.

It may however be our recommendation if all the requirements point to it.

Zulban commented 5 years ago

I understand if it's tricky. I'm just glad to give my input. Great job so far. Do what you can :smile:

harsh commented 5 years ago

Is there even anything that would currently allow that without it becoming a nightmare to administer, my view would be two separate solutions something similar to GCCode for inner sourcing and a separate solution for Open source projects.

@CalvinRodo In addition to what @MikeNwin outlined here, as of January 8th, GitHub provides a "Unified SKU" so large organizations like the GC can have a private instance of GitHub Enterprise on premise or virtual private cloud (open to every employee) AND have multiple GitHub Enterprise SaaS organizations but only pay for unique users. A lot of organizations already have plans on GitHub SaaS so this would allow the GC to quickly connect the SaaS orgs with their On Premise server. Not only is this a huge cost savings but this allows the GC to decide where to InnerSource since there are unlimited public and private repositories in both cases. More sensitive code can be on Enterprise On premise and then use GitHub Connect to get access to an org on github.com and decide what to open source.

Humbly speaking, GitHub is the only vendor that provides a single platform for both InnerSource and seamless access to Open Source as well as the Open Source community. So yes @Zulban we can satisfy your requirement above AND GitHub offers 24/7 support.

@gcharest Happy to clarify any of the information above 😄

Zulban commented 5 years ago

I wish public servants could discuss the government of Canada requirements without hearing advertisements from Microsoft.

britthurley commented 5 years ago

My personal hope is that this doesn't result in a single product, but a service offering that can satisfy each use case.

Focusing on a single platform is a major risk to adoption, especially when a VCS platform must be interoperable with tools that support additional capabilities outside of just inner and open sourcing code.

gcharest commented 5 years ago

@zulban to be fair, we have this discussion in the open and I'd welcome any and all service providers to share their ideas.

I do not know everything and I know some of the very problems we're trying to solve have been addressed already.

gcharest commented 5 years ago

@britthurley I would totally support that. I have been delayed on working on this file but getting back to it now.

I'll put some new ideas for discussion shortly.

Zulban commented 5 years ago

@Zulban to be fair, we have this discussion in the open and I'd welcome any and all service providers to share their ideas.

I do not know everything and I know some of the very problems we're trying to solve have been addressed already.

What other service providers are part of this conversation?

gcharest commented 5 years ago

Not specifically around Git but we have many other private sector service providers.

GitLab is also aware of this work but they don't seem to have commented yet.

On Tue., Apr. 2, 2019 at 00:15, Stuart Spence notifications@github.com wrote:

@Zulban to be fair, we have this discussion in the open and I'd welcome any and all service providers to share their ideas.

I do not know everything and I know some of the very problems we're trying to solve have been addressed already.

What other service providers are part of this conversation?
