sebbacon commented 2 years ago

Every so often, someone comes along who wants to have a go at installing OpenSAFELY.

At the moment, I tell them we don't have resources to support them directly, but they should feel free to have a go on their own. But I do still need to give them an overview to orient them.

I suggest iterating the readme in this repo, although we could alternatively make a new page in our documentation.

The target audience is "highly motivated and sufficiently skilled software engineer who has capacity to attempt getting their own install up and running as a proof of concept".

Here's text I've sent people previously, by email, as a starting point. In response they've asked if we have any regular meetings this group member could join, or first GitHub issues that newcomers could address. I'm not sure what we could say here at this stage; I guess some text encouraging people to write questions in our Discussions board? (Should we set up a Discussions just for system integrators?)

You would want to start by deploying a "job runner" within the secure environment. In the simplest configuration, this would be configured to execute Docker containers and put their output on a locally accessible disk. You would configure it to poll our "job server" for work requested by end users (you can use the one we provision; you'd need to get in touch for us to configure it at our end).
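To make the shape of that loop concrete, here is a minimal sketch in Python. It is not the real job runner: the job dictionary fields (`id`, `image`, `args`), the `/output` mount point, and the `fetch_jobs` callable are all assumptions made for illustration, and the actual job server API is not shown.

```python
import subprocess
import time
from pathlib import Path
from typing import Callable, Iterable, List


def docker_command(image: str, args: List[str], output_dir: Path) -> List[str]:
    """Build a `docker run` command that mounts output_dir at /output (assumed path)."""
    return [
        "docker", "run", "--rm",
        "--volume", f"{output_dir.resolve()}:/output",
        image, *args,
    ]


def poll_once(
    fetch_jobs: Callable[[], Iterable[dict]],
    run: Callable[[List[str]], object],
    output_root: Path,
) -> int:
    """Fetch pending jobs, run each one, and return how many were run.

    fetch_jobs is a hypothetical stand-in for whatever asks the job server
    for work; run executes the command that was built for each job.
    """
    count = 0
    for job in fetch_jobs():
        # Each job writes to its own directory on the local disk.
        out = output_root / job["id"]
        out.mkdir(parents=True, exist_ok=True)
        run(docker_command(job["image"], job["args"], out))
        count += 1
    return count


def poll_forever(
    fetch_jobs: Callable[[], Iterable[dict]],
    output_root: Path,
    interval_seconds: float = 60.0,
) -> None:
    """Poll at a fixed interval, running jobs with the local docker CLI."""
    while True:
        poll_once(fetch_jobs, lambda cmd: subprocess.run(cmd, check=True), output_root)
        time.sleep(interval_seconds)
```

Passing `run` in as a parameter keeps the polling logic testable without Docker or network access; only `poll_forever` touches the real `docker` CLI.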

You would have to set up a secure network with a "proxy service" running in a DMZ, to allow safe access to GitHub. This network should also allow access to the job server (jobs.opensafely.org). To support output releasing, you would need to install our "osrelease" tool and another proxy for that, in the DMZ.
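As an illustration of the rule such a proxy enforces, the sketch below checks outbound URLs against an allowlist. The host list is an assumption (a real deployment would likely need more GitHub hosts than `github.com`), and `is_allowed` is a hypothetical helper, not part of any OpenSAFELY tool.

```python
from urllib.parse import urlsplit

# Assumed allowlist: GitHub plus the job server named above.
ALLOWED_HOSTS = frozenset({"github.com", "jobs.opensafely.org"})


def is_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in allowed_hosts
```

Matching the full hostname, rather than a prefix or substring, means a lookalike such as `github.com.evil.example` is rejected.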

Finally, to mediate access to your database, you would need to implement a backend (example for TPP here) (and perhaps a database connector) for the "cohortextractor" ETL tool that is used in the "job runner" toolchain. However, I'd recommend not doing this part right now, as in about 3 months we will be releasing a refactored and much simpler replacement for cohortextractor, called databuilder (which is already minimally functional).

The environments to which we currently install ourselves have all the above scripted in deploy scripts; we would recommend deploying to Ubuntu, which is what the EMIS deploy scripts do. You'll note there are some constraints in that environment; we are upgrading our TPP environment to be Ubuntu-based soon, so there will be more generic Ubuntu deploy scripts coming out when that happens.

I've then linked to our architecture diagrams.

benbc commented 2 years ago

@sebbacon There is some of this content already here.

StevenMaude commented 2 years ago

Assorted thoughts :stew:

:speaking_head: On communication

> in response they've asked if we have any regular meetings this group member could join

Some other big open source projects have semi-regular (monthly/quarterly) community presentation/Q&A/discussion sessions.

If enough people wanted to integrate with their own organisation's data, multiplexing like that would also be one way to minimise repetition of similar conversations.

> or first GitHub issues that newcomers could address.

What's the context here? Is it because they're offering to contribute and want to help us improve?

1. Have we got defined policies on external contributions (licensing/code audit/security risk)?
   - Could we explain these in the documentation or in a CONTRIBUTING.md in an organisation .github repository that's then displayed by default?
2. If yes to 1, would it be worthwhile for core developers on individual projects to review open issues and tag easy fixes that we just haven't got round to doing?

:scroll: On content

Even a minimal documentation page for "system integrators" [^1] might help, even if it's merely a link to this repository, the Google doc above (which looks pretty good), or just:

"We're still working out a simple development approach for interested third parties. Please contact us for more information."

If I were an external developer, that would be the first place I would look. Otherwise, you have to anticipate what routes someone might take to get to this repository and make them easier to discover.

> we are upgrading our TPP environment to be Ubuntu-based soon

This stock response might need updating!

Maybe this entire situation would be simpler if a complete end-to-end test were developed? A fully working and up-to-date example for inspection and tinkering would be very useful.

[^1]: Or equivalent SEO-friendly term that is used in the healthcare domain.

opensafely-core / backend-server

Iterate our intro documentation #61
