ohsu-comp-bio / euler

Authentication (authN) and high-level Authorization (authZ) for BMEG, Dirac and Search. Includes Swift object store.
MIT License

architecture-on-a-page #9

Open bwalsh opened 7 years ago

bwalsh commented 7 years ago

Overview

The intent of this document is to provide a high-level scope, define a requirements stack, and illustrate an architectural vision for the effort. The core practices we followed were:

'portal-proxy'

Functionality

| Order | Description | Comment |
|---|---|---|
| 1 | Implement project security | Missing feature necessary for workflows and cohort discovery (currently mocked within the UI; target OHSU LDAP for authorization and authentication) |
| 2 | Load all of OICR | Currently we've loaded only AML projects |
| 3 | release ohsu.dcc | Demo: 1) branded ohsu.dcc portal, 2) ohsu.ldap & user roles, 3) refreshed data [baml, icgc]. Pancreatic data is not in scope |

from roadmap
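
As a rough illustration of what "project security" could mean for cohort discovery, the sketch below filters search hits down to the projects a user's directory groups allow. It is a minimal sketch under stated assumptions: the group names, the `PROJECT_ACL` mapping, and both helper functions are hypothetical, and the real group lookup would come from OHSU LDAP rather than a hard-coded list.

```python
# Hypothetical mapping of project id -> directory groups allowed to see it.
# In a real deployment the groups would come from LDAP, not from this dict.
PROJECT_ACL = {
    "BAML": {"baml-researchers"},
    "ICGC-AML": {"baml-researchers", "dcc-users"},
}


def visible_projects(user_groups, project_acl):
    """Return the sorted project ids that share at least one group with the user."""
    groups = set(user_groups)
    return sorted(p for p, allowed in project_acl.items() if groups & allowed)


def filter_hits(hits, user_groups, project_acl):
    """Drop search hits that belong to projects the user may not see."""
    allowed = set(visible_projects(user_groups, project_acl))
    return [hit for hit in hits if hit["project"] in allowed]


# Illustrative search results (made-up ids).
HITS = [
    {"id": "DO1", "project": "BAML"},
    {"id": "DO2", "project": "ICGC-AML"},
]

dcc_only = filter_hits(HITS, ["dcc-users"], PROJECT_ACL)
print(dcc_only)
```

Filtering on the server side, rather than in the UI, is the point of the sketch: a mocked check in the UI does not stop a direct API call.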

Deployment

euler

Dataflows

image

Keystone data flows

Conceptual sequence diagram. All services [swift, compute, image, and euler] work the same way. Keystone is to openstack as the mitre service is to CCC.
image
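
The conceptual sequence can be sketched in a few lines of Python. This is an in-memory simulation, not the real Keystone API: `FakeKeystone`, `ProtectedService`, the user record, and the status codes are all assumptions made for illustration. It only shows the shape of the flow: a client obtains a token once, presents it to any service, and each service asks the identity service to validate it.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class FakeKeystone:
    """In-memory stand-in for an identity service (NOT the real Keystone API)."""
    users: dict  # username -> (password, set of project names)
    _tokens: dict = field(default_factory=dict)

    def issue_token(self, username, password):
        """Check credentials and hand back an opaque token."""
        record = self.users.get(username)
        if record is None or record[0] != password:
            raise PermissionError("invalid credentials")
        token = secrets.token_hex(16)
        self._tokens[token] = username
        return token

    def validate(self, token):
        """Return the token owner's projects, or None for an unknown token."""
        username = self._tokens.get(token)
        if username is None:
            return None
        return self.users[username][1]


@dataclass
class ProtectedService:
    """Any service (swift, compute, image, euler) that defers to the identity service."""
    name: str
    keystone: FakeKeystone

    def handle(self, token, project):
        projects = self.keystone.validate(token)
        if projects is None:
            return 401  # token not recognised
        if project not in projects:
            return 403  # authenticated, but no access to this project
        return 200


keystone = FakeKeystone(users={"alice": ("s3cret", {"baml"})})
services = [ProtectedService(n, keystone) for n in ("swift", "compute", "image", "euler")]

token = keystone.issue_token("alice", "s3cret")
statuses = {s.name: s.handle(token, "baml") for s in services}
print(statuses)
```

Because every service runs the same `handle` logic against the same identity service, adding euler alongside swift, compute, and image does not change the flow.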

Dependencies

'add-a-file'

Functionality

| Order | Description | Comment |
|---|---|---|
| 4 | Implement lightweight add a file | Missing feature necessary for workflows (includes ccc_client; limited to repository functionality; release is out of scope) |
| 5 | Implement OHSU data store | While we have demonstrated loading and searching OHSU data, we have not demonstrated access to the BAML or other OHSU data (e.g. downloading, workflow, BAM Stats, etc.) (aka object store) |

Deployment

euler-sprint2

Dataflows

image

Dependencies

Current work

Deployment

euler-sprint3

Functionality

| Order | Description | Comment |
|---|---|---|
| 6 | Common ETL with BMEG | Integrate event handling. Leverage same infrastructure (OHSU Kafka) |
| 7 | Load additional OHSU projects | Other projects (from CGD?) should be loaded to complement BAML |

Dependencies

Backlog

euler-backlog

Functionality

The stakeholders and sponsors have identified inter-institutional collaboration as a goal. These features have not been prioritized or approved, but are captured here to provide a roadmap.

| Order | Description | Comment |
|---|---|---|
| 8 | Federate dcc api | Implement federation layer above dcc-server |
| 9 | Non OHSU sites | Support for separately administered keystone instances |
| 10 | Cloud datasets | Capture metadata about cloud datasets and workflows |

un-groomed features

bwalsh commented 7 years ago

@mayfielg @k1643 @kellrott Please review this draft document in the context of EUL-36

grmayfie commented 7 years ago

First architecture diagram: looks very good, except we're missing an explicit definition of the ports each piece is using to communicate with the rest of the setup. In EUL-36, we said we wanted to describe those definitively as well. It's helpful to document what ports need to be available externally on which floating IPs, since ITG needs to handle that for us. What are these diagrams drawn in? I'd be happy to make that addition, if I can edit the documents the diagrams came from.

First data-flow diagram: from everything I remember about the dcc-portal-server, it never queries mongo. It only queries elastic, and mongo is only ever used as a resource during the release pipeline. If I'm correct, then this diagram is a bit misleading. Am I missing some detail of the server that does use mongo directly? I'm also a bit unclear on what 'config' in the openstack, keystone box is referring to.

I'd like to clarify the 'future sprints' section a bit. The change from one dotted box in the first diagram to two in the second refers to moving from the proof-of-concept instances of keystone and swift to the ones already installed in exastack and ITG, correct?

This backlog appears to deal solely with federation (of authentication, cloud storage, dcc-server instances). It would be clearer to say that explicitly as the over-arching epic of the ungroomed backlog.

More specifically, are keystone-with-shibboleth and keystone federation different options for addressing authentication of non-OHSU personnel? Or are they supposed to work in conjunction? Or is that unknown at this time and an issue to address when we get to that stage? Also, I believe "s3, google bucket webhooks," is referring to the issue of federation across multiple physical locations, and thus multiple cloud storage sites. Is swift already capable of federation across swift instances? It might be wise to include that as a story first, before we start to deal with inclusion of other software.

bwalsh commented 7 years ago

@mayfielg Thanks for comments. Please check to ensure that I've addressed them.

it never queries mongo

removed mongo from dataflows

I'd like to clarify the 'future sprints' section a bit.

added clarification

This backlog appears to deal solely with federation

added features section

grmayfie commented 7 years ago

I think the only remaining question is the bit regarding what's proposed for shibboleth and keystone federation.

Are keystone-with-shibboleth and keystone federation different options for addressing authentication of non-OHSU personnel or are they supposed to work in conjunction? Or is that unknown at this time and an issue to address when we get to that stage?

bwalsh commented 7 years ago

Agreed. It is an evolving architecture. In the next sprint planning we can discuss w/ Adam M + Kyle + Paul H.

grmayfie commented 7 years ago

+1