ONSdigital / dp-python-tools

Simple reusable python resources for digital publishing.
MIT License

initial s3 functions #4

Closed mikeAdamss closed 7 months ago

mikeAdamss commented 8 months ago

What is this

We want some simple s3 related functions for interacting with buckets on AWS.

We'll want to use the AWS boto3 python client and its s3 capabilities.

What to do

Important - the concept of a "directory" doesn't really exist in s3: all objects have a flat name (key), and the path is simply part of that name. The trick here is to make our interactions with s3 work in a more directory-like fashion.
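To make the point concrete, here is a minimal sketch (no AWS calls; the keys are hypothetical) showing that "listing a directory" in s3 is really just filtering flat keys by prefix, which is what boto3's `list_objects_v2(Prefix=...)` does server-side:

```python
# s3 has no real directories: every object has a flat key, and "paths"
# are just prefixes within that key. Hypothetical keys for illustration:
keys = [
    "stuff/things/file.txt",
    "stuff/things/other.json",
    "stuff/readme.md",
]

def keys_under(keys, prefix):
    """Return the keys that sit 'inside' the given pseudo-directory."""
    return [k for k in keys if k.startswith(prefix)]

print(keys_under(keys, "stuff/things/"))
# → ['stuff/things/file.txt', 'stuff/things/other.json']
```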

We don't want a lot to begin with, just some simple functions that can do some universal things (that we can build on).

Something like:

/dpytools/s3/basic.py

The intention is that if we do extend this to a class, the class would just make use of any functions we write (and unit test) here.

Initially, we want the following functions:


def get_s3_object(object_name) -> <whatever this class is>:
    """
    Given a full name, i.e. "/stuff/things/file.txt", return the
    object from boto3.
    """
    ...

def get_s3_object_as_dict(object_name) -> dict:
    """
    The above but:
    - assert it's a json file
    - read it in
    - return it as a dictionary
    """
    s3_object = get_s3_object(object_name)
    # TODO - stuff in notes above

def download_s3_object_to_local(object_name, path):
    """
    Download a given s3 object to a local path
    """
    ...

def upload_local_file_to_s3(file: Path, object_name):
    """
    Upload something we have locally to a place on s3,

    i.e. a csv or json metadata file.
    """
    ...

def upload_local_files_to_s3(files: List[Path], object_name):
    """
    Upload many files we have locally to a place on s3,

    i.e. csv or json metadata files.
    """
    # Loop and use the above function ...but... check all
    # the local files actually exist first please.

Acceptance Criteria