NBISweden / LocalEGA

Please go to https://github.com/EGA-archive/LocalEGA instead
Apache License 2.0

Testing LocalEGA deployment in Openshift multi tenant environment #334

Closed blankdots closed 6 years ago

blankdots commented 6 years ago

Description

The Openshift multi-tenant environment (used as a testing environment by one of the NeIC partners) presents two challenges: restrictions of the Openshift platform itself (by default, OKD runs containers using an arbitrarily assigned user ID), and a restriction to only http/https for connecting to the shared environment, which affects the sftp connection.
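
To make the arbitrary user ID restriction concrete, here is a minimal sketch that reports which user a container process actually runs as. The helper name `describe_runtime_user` is hypothetical and not part of LocalEGA. It assumes a Unix container, and that an arbitrarily assigned UID often has no matching passwd entry, which is the situation that tends to trip up services expecting a named user.

```python
import os
import pwd


def describe_runtime_user():
    """Return the effective UID/GID and whether the UID has a passwd entry."""
    uid = os.geteuid()
    gid = os.getegid()
    try:
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this UID (assumed typical for an arbitrary UID).
        name = None
    return {'uid': uid, 'gid': gid, 'name': name,
            'has_passwd_entry': name is not None}


if __name__ == '__main__':
    print(describe_runtime_user())
```

Running this inside a pod shows whether the image relies on a fixed user that the platform will not honour.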

DoD (Definition of Done)

A file is ingested into the vault (S3 based) and verified.

Testing

Peer review for PR.

blankdots commented 6 years ago

Due to sftp issues/restrictions when connecting to the testing environment, and also from one pod to another, we chose to use an alternative SSHv2 protocol implementation, Paramiko (http://www.paramiko.org/).

As a basis for the pod we used the image blankdots/docker-browsepy:ftp (docker pull blankdots/docker-browsepy:ftp).

The testing script is below.

import paramiko
import os
import pika
import secrets
from hashlib import md5
import json
import string
import uuid
import logging
from legacryptor.crypt4gh import encrypt
import pgpy
import argparse

FORMAT = '[%(asctime)s][%(name)s][%(process)d %(processName)s][%(levelname)-8s] (L:%(lineno)s) %(funcName)s: %(message)s'
logging.basicConfig(format=FORMAT, datefmt='%Y-%m-%d %H:%M:%S')
LOG = logging.getLogger(__name__)
LOG.setLevel(logging.INFO)

def sftp_upload(hostname, user, file_path, key_path, key_pass='password', port=2222):
    """Upload file_path to the inbox over SFTP, authenticating with an RSA key."""
    transport = None
    try:
        k = paramiko.RSAKey.from_private_key_file(key_path, password=key_pass)
        transport = paramiko.Transport((hostname, port))
        transport.connect(username=user, pkey=k)
        LOG.info(f'sftp connected to {hostname}:{port} with {user}')
        sftp = paramiko.SFTPClient.from_transport(transport)
        filename, _ = os.path.splitext(file_path)
        sftp.put(file_path, f'{filename}.c4ga')
        LOG.info(f'file uploaded {filename}.c4ga')
    except Exception as e:
        LOG.error(f'Something went wrong {e}')
        raise
    finally:
        LOG.debug('sftp done')
        # transport is still None if loading the key or opening the socket failed
        if transport is not None:
            transport.close()

def submit_cega(connection, user, file_path, c4ga_md5, file_md5=None):
    """Publish the message for an uploaded file to the CEGA broker, along with its checksums."""
    stableID = ''.join(secrets.choice(string.digits) for i in range(16))
    message = {'user': user, 'filepath': file_path, 'stable_id': f'EGA_{stableID}'}
    if c4ga_md5:
        message['encrypted_integrity'] = {'checksum': c4ga_md5, 'algorithm': 'md5'}
    if file_md5:
        message['unencrypted_integrity'] = {'checksum': file_md5, 'algorithm': 'md5'}

    try:
        # `connection` is the broker URL; keep it separate from the connection object
        parameters = pika.URLParameters(connection)
        mq_connection = pika.BlockingConnection(parameters)
        channel = mq_connection.channel()
        channel.basic_publish(exchange='localega.v1', routing_key='files',
                              body=json.dumps(message),
                              properties=pika.BasicProperties(correlation_id=str(uuid.uuid4()),
                                                              content_type='application/json',
                                                              delivery_mode=2))

        mq_connection.close()
        LOG.info('Message published to CentralEGA')
    except Exception as e:
        LOG.error(f'Something went wrong {e}')
        raise

def encrypt_file(file_path, pubkey):
    """Encrypt file and return the output path with the md5 of the encrypted file."""
    file_size = os.path.getsize(file_path)
    filename, _ = os.path.splitext(file_path)
    output_base = os.path.basename(filename)
    c4ga_md5 = None
    output_file = os.path.expanduser(f'{output_base}.c4ga')

    try:
        # Close both handles before hashing so the encrypted output is fully flushed
        with open(file_path, 'rb') as infile, open(output_file, 'wb') as outfile:
            encrypt(pubkey, infile, file_size, outfile)
        with open(output_file, 'rb') as read_file:
            c4ga_md5 = md5(read_file.read()).hexdigest()
        LOG.info(f'File {output_base}.c4ga is the encrypted file with md5: {c4ga_md5}.')
    except Exception as e:
        LOG.error(f'Something went wrong {e}')
        raise
    return (output_file, c4ga_md5)

def main():
    """Do the sparkles and fireworks."""
    parser = argparse.ArgumentParser(description="Encrypting, uploading to inbox and sending message to CEGA.")

    parser.add_argument('input', help='Input file to be encrypted.')
    parser.add_argument('--u', help='Username to identify the elixir.', default='ega-box-999')
    parser.add_argument('--uk', help='User secret private RSA key.', default='/files/user.key')
    parser.add_argument('--pk', help='Public key file to encrypt file.', default='/files/key.1.pub')
    parser.add_argument('--inbox', help='Inbox address, or service name', default='inbox.lega.svc')
    parser.add_argument('--cm', help='CEGA MQ broker address')

    args = parser.parse_args()

    used_file = os.path.expanduser(args.input)
    key_pk = os.path.expanduser(args.uk)
    pub_key, _ = pgpy.PGPKey.from_file(os.path.expanduser(args.pk))

    inbox_host = args.inbox
    test_user = args.u
    connection = args.cm if args.cm else os.environ.get('CEGA_MQ', None)
    test_file, c4ga_md5 = encrypt_file(used_file, pub_key)
    if c4ga_md5:
        sftp_upload(inbox_host, test_user, test_file, key_pk)
        submit_cega(connection, test_user, test_file, c4ga_md5)
        LOG.info('Should be all!')

if __name__ == '__main__':
    main()
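
The script computes the md5 of the encrypted file by reading it into memory in one go, which is fine for small test files. For larger files, a chunked variant could be swapped in. The following is only a sketch: the helper name `md5_of_file` and the 1 MiB chunk size are assumptions, not part of the original script.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 of the file at path, reading chunk_size bytes at a time."""
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()
```

It returns the same hex digest as hashing the whole file at once, so it could stand in for the `md5(read_file.read()).hexdigest()` call in `encrypt_file`.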

The script looks like the makefile we use for testing?

It was inspired by that, but it does not duplicate the work, as that script does not work in this specific environment. It can be extended for other scenarios: a web interface for submission, a download option, or a connection to the data-out part. One's imagination can run loose on this one.

Can we see a demo?

Here you go: https://www.youtube.com/watch?v=-RPF9AigP6Y