Portfolio B - Architecture

vladbucur2000 commented 3 years ago

Identify the requirements and quality attributes derived from interactions with your client that are most likely to have an influence on your system architecture. For example, if your client requires the system to be made available via the web, this will clearly have major implications for your chosen architecture. At a lower level, if your application is expected to push dynamic updates out to your clients, this will also have an impact on your system. A whole range of other requirements-level factors such as system interoperability, data storage/retrieval, security, hardware, interaction devices etc. may also impact your architecture. Once you have explicitly documented these primary architectural drivers, present a high-level architecture diagram of your overall system structure. Include in your textual description a justification for your architectural choices.

vladbucur2000 commented 3 years ago

Portfolio A feedback: This section is good. As you learn more about your implementation, you can add more specificity about how you will implement certain technologies.

To modify/add: new technologies

georgeedward2000 commented 3 years ago

High level architecture diagram

INTRODUCTION ARCHITECTURE

We propose the design of a client application which will:

Generate data in CSV format
Anonymise the data
Use an AMQP server to filter the flow of data
Integrate and centralise the data
Create a safe, fast, and efficient connection with web services

Generate Data

Synthea™

We use the open-source Synthea™ Patient Generator to generate as-live data that simulates the regional healthcare landscape. Specifically, the simulators will generate data for a specific region:

Simulate more than one healthcare provider category (e.g. Acute hospital, primary care, 111)
Patients

Synthetic patients can be simulated with models of disease progression and corresponding standards of care to produce risk-free realistic synthetic health care records at scale.

The synthetic data generation process used by Synthea is based on PADARSER, the Publicly Available Data Approach to the Realistic Synthetic EHR. The PADARSER framework, unlike EMERGE and medGAN, assumes that access to the real EHR is impossible or undesirable, relying instead on publicly available datasets to populate the synthetic EHR. Figure 1 presents the PADARSER framework.

HL7 FHIR

Fast Healthcare Interoperability Resources (FHIR, pronounced "fire") is a standard describing data formats and elements (known as "resources") and an application programming interface (API) for exchanging electronic health records (EHR). The standard was created by the Health Level Seven International (HL7) health-care standards organization.

In our software:

By default, Synthea generates US-based medical data (e.g. names, postcodes, cities, regions). To generate data that follows UK medical data distributions, we modified the open-source project and redeployed the resulting UK build of the Synthea generator.

DATA INTEGRATION AND CENTRALISATION

We propose to use the Mirth NextGen Connect data centralisation and integration engine.

Mirth NextGen Connect is a cross-platform interface engine, used primarily in the healthcare industry, that manages information through the bi-directional sending of many types of messages.

Benefits of using Mirth are:

• It is built for healthcare.
• It has a purpose-built solution for CSV and FHIR (data translators).
• It supports data acquisition (large amounts of data from multiple sources) through the AMQP server.

Mirth is a Java-based desktop application with an intuitive UI (User Interface). It makes it possible to work with multiple translators (from CSV to FHIR), each one represented by a specific channel.

In our software:

In our implementation, every Mirth channel:

Connects to an AMQP server queue, using a JavaScript script, to receive the incoming CSV data.
Splits the CSV file into multiple records.
Translates each record into a FHIR resource.
Builds the HTTP POST request to the API, using the previously obtained token.
Sends the request.
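
The channel steps above can be sketched outside Mirth as follows. This is only an illustrative Python sketch, not the JavaScript that runs inside the Mirth channels. The endpoint URL, the token value, the CSV column names (`Id`, `BIRTHDATE`, `FIRST`, `LAST`, `GENDER`) and the choice of fields in the Patient resource are assumptions made for the example. The request is built but deliberately not sent.

```python
import csv
import io
import json
import urllib.request

# Hypothetical endpoint: the real API Gateway URL is not given in this thread.
API_URL = "https://api.example.org/fhir/Patient"

# Assumed single-letter gender codes in the CSV, mapped to FHIR gender codes.
GENDER_MAP = {"M": "male", "F": "female"}


def split_records(csv_text):
    """Split one CSV payload (header row plus data rows) into per-row dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def to_fhir_patient(row):
    """Translate one CSV row into a minimal FHIR Patient resource (assumed columns)."""
    return {
        "resourceType": "Patient",
        "id": row["Id"],
        "name": [{"family": row["LAST"], "given": [row["FIRST"]]}],
        "gender": GENDER_MAP.get(row["GENDER"], "unknown"),
        "birthDate": row["BIRTHDATE"],
    }


def build_post(resource, token):
    """Build (but do not send) an HTTP POST request carrying the bearer token."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(resource).encode("utf-8"),
        headers={
            "Content-Type": "application/fhir+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Two made-up rows standing in for a message taken from a resource queue.
sample = (
    "Id,BIRTHDATE,FIRST,LAST,GENDER\n"
    "p1,1980-02-01,Ann,Smith,F\n"
    "p2,1975-11-30,Bob,Jones,M\n"
)
requests_to_send = [
    build_post(to_fhir_patient(row), "dummy-token") for row in split_records(sample)
]
```

Sending each request would then be a call such as `urllib.request.urlopen(request)`, which is left out here so the sketch runs without network access.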

Data Transfer Protocols

A message broker is an intermediary program module that translates a message from the formal messaging protocol of the sender to the formal messaging protocol of the receiver. Message brokers are elements in telecommunication or computer networks where software applications communicate by exchanging formally defined messages. AMQP (Advanced Message Queuing Protocol) is the messaging protocol used by our broker.

HTTPS is used for secure communication over a computer network, and is widely used on the Internet. In HTTPS, the communication protocol is encrypted using Transport Layer Security (TLS) or, formerly, Secure Sockets Layer (SSL).

Data Ingestion

The system will authenticate clients and expose a RESTful endpoint for HL7 FHIR messages.

An HL7 FHIR endpoint describes the technical details of a location that can be connected to for the delivery/retrieval of information. Sufficient information is required to ensure that a connection can be made securely, and that appropriate data is transmitted as defined by the endpoint owner.

A RESTful API is an application programming interface (API) that follows the REST architectural style and uses HTTP requests to access and use data. The data is manipulated through the CRUD operations: create, read, update, and delete.
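
As a rough illustration of how the CRUD operations map onto HTTP requests, the sketch below returns the method and URL for each operation, following the usual FHIR REST pattern of `[base]/[type]` for create and `[base]/[type]/[id]` for the others. The base URL and the function name `rest_call` are hypothetical, chosen only for this example.

```python
# Hypothetical base URL, for illustration only.
BASE_URL = "https://api.example.org/fhir"

# Conventional mapping of CRUD operations onto HTTP methods.
CRUD_TO_HTTP = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


def rest_call(operation, resource_type, resource_id=None):
    """Return the (HTTP method, URL) pair for a CRUD operation on a resource."""
    method = CRUD_TO_HTTP[operation]
    if operation == "create":
        # Create targets the resource collection; the server assigns the id.
        return method, f"{BASE_URL}/{resource_type}"
    if resource_id is None:
        raise ValueError(f"{operation} needs a resource id")
    # Read, update and delete target one specific resource instance.
    return method, f"{BASE_URL}/{resource_type}/{resource_id}"
```

For example, `rest_call("read", "Patient", "p1")` yields a `GET` on `https://api.example.org/fhir/Patient/p1`.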

Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. Using API Gateway, developers can create RESTful APIs and WebSocket APIs that enable real-time two-way communication applications. API Gateway supports containerized and serverless workloads, as well as web applications.

RabbitMQ is a messaging system that uses AMQP 0.9.1 as the basis for a set of standards controlling the entire message passing process.

Benefits of RabbitMQ:

Delivery and order guarantee: Messages are delivered to a consumer in the same order in which they were published to the queue.
Redundancy: Queues persist messages until they have been processed completely.
Decoupling: Any third-party system can consume the messages and interact with them, so a message can be processed by a component other than the one that created it. This also means the same messages can be reused by many applications.
Scalability: We can dedicate one application server to message processing and leave the other servers to handle web traffic.

In our software:

In the "Architecture Diagram", the multiple arrows that go into RabbitMQ designate the different queues built on top of the AMQP protocol:

Resource queues. Used to filter the CSV data. E.g. "somerset_patient": a queue which serves as the way of transferring and storing the generated "patients.csv" file. This file contains data based on the Somerset region.
Token queue. Used for transferring the generated Cognito API token to the integration engine (i.e. Mirth NextGen).
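
The queue layout can be illustrated with a small in-memory stand-in for the broker. This sketch does not talk to RabbitMQ. The `InMemoryBroker` class, the token queue name `"token"`, the file-to-resource lookup table, and the `<region>_<resource>` naming rule generalised from "somerset_patient" are all assumptions made for the example.

```python
from collections import defaultdict, deque

# Hypothetical name for the token queue; the real name is not given here.
TOKEN_QUEUE = "token"

# Assumed lookup from generated file to resource name. Only the patients.csv
# entry is taken from the text; the second entry is illustrative.
RESOURCE_BY_FILE = {
    "patients.csv": "patient",
    "encounters.csv": "encounter",
}


def resource_queue_name(region, csv_filename):
    """Derive a queue name such as 'somerset_patient' for a region and file."""
    return f"{region.lower()}_{RESOURCE_BY_FILE[csv_filename]}"


class InMemoryBroker:
    """Stand-in for the AMQP broker: one first-in, first-out queue per name."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def publish(self, queue_name, body):
        # Append to the tail so consumers see messages in publish order.
        self.queues[queue_name].append(body)

    def consume(self, queue_name):
        # Remove and return the oldest message on the queue.
        return self.queues[queue_name].popleft()


broker = InMemoryBroker()
broker.publish(resource_queue_name("Somerset", "patients.csv"), "Id,FIRST\np1,Ann\n")
broker.publish(TOKEN_QUEUE, "dummy-token")
```

Keeping one queue per region and resource type lets each Mirth channel subscribe only to the data it translates, while the token travels on its own queue.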

Normally, the RabbitMQ server runs in the background, with no need for user interaction. Deployment of the server is based on CLI (Command Line Interface) interaction, but there are alternatives. One of these is a plug-in that provides a UI (User Interface) in the form of a web application. We used this UI to analyse the movement of messages and to test its capabilities as well as new features (e.g. an application scheduler).

Authentication

OAuth 2.0 is the industry-standard protocol for authorisation. OAuth 2.0 focuses on client developer simplicity while providing specific authorisation flows for web applications, desktop applications, mobile phones, and living room devices.

In our software:

Authentication will be secured using Amazon Cognito. The system will use a secure token to access the API Gateway, creating a safe and recognised connection with the HealthCare Lake/database infrastructure. The API Gateway will expose a RESTful API that accepts HL7 FHIR messages for ingestion into the data lake. Amazon Cognito will verify the token before the data transfer continues.
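
Because Cognito tokens are JSON Web Tokens (JWTs), a client that reuses a token could check its expiry time before building a request. The sketch below is one possible way to do that and is not taken from our implementation. It only reads the `exp` claim and does not verify the signature, which remains the job of Cognito and the API Gateway. The function names and the 30-second leeway are assumptions, and `_fake_token` exists only to produce demo input.

```python
import base64
import json
import time


def jwt_claims(token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments drop base64 padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def is_expired(token, now=None, leeway_seconds=30):
    """Return True if the token's 'exp' claim is within the leeway of 'now'."""
    now = time.time() if now is None else now
    return jwt_claims(token)["exp"] <= now + leeway_seconds


def _fake_token(claims):
    """Build an unsigned token-shaped string, for demonstration only."""
    def segment(obj):
        raw = json.dumps(obj).encode("utf-8")
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

    return f"{segment({'alg': 'none'})}.{segment(claims)}.signature"


demo_token = _fake_token({"exp": 1000})
```

With `demo_token`, `is_expired(demo_token, now=2000)` is true and `is_expired(demo_token, now=500)` is false, so a sender could request a fresh token in the first case.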

vladbucur2000 commented 3 years ago

reviewed

spe-uob / 2020-Healthcare-Data-Simulators