ONSdigital / dp-python-tools

Simple reusable python resources for digital publishing.
MIT License

decompress from s3 #3

Closed mikeAdamss closed 6 months ago

mikeAdamss commented 8 months ago

What is this

We need the ability to decompress a tar file from an s3 bucket to a specific directory path.

If the directory does not exist, the function should create the directory.

If the directory already exists, the function should raise an error.

What to do

We'll want a function along the lines of:


from pathlib import Path
from typing import Optional


def decompress_s3_tar(s3_url: str, directory: Optional[Path] = None):
    """
    Given a url to an s3 object that is a tar file, decompress it
    to the provided directory path.
    """
    # Make sure it actually is a tar file.
    # Decompress all the files to the directory specified.
    # If the directory does not exist, create it
    # If it does exist, raise an error
    # If no path is provided assume the current working directory
    ...
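
As a starting point, here is a minimal sketch of how the stub above could be filled in. It is not a settled implementation. The helpers `parse_s3_url` and `extract_tar` are hypothetical names introduced here so the extraction logic can be tested without s3, and the download assumes `boto3` is installed with credentials available. One assumption worth flagging: the current working directory always exists, so the "raise an error if it exists" rule is only applied when a directory is passed explicitly.

```python
import tarfile
import tempfile
from pathlib import Path
from typing import Optional, Tuple
from urllib.parse import urlparse


def parse_s3_url(s3_url: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(s3_url)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"Not a valid s3 url: {s3_url}")
    return parsed.netloc, key


def extract_tar(tar_path: Path, directory: Optional[Path] = None) -> Path:
    """Extract a local tar file to a directory. Hypothetical helper."""
    # Make sure it actually is a tar file before touching the filesystem
    if not tarfile.is_tarfile(tar_path):
        raise ValueError(f"Not a tar file: {tar_path}")
    if directory is None:
        # Assumption: the "already exists" check is skipped for the default
        directory = Path.cwd()
    else:
        # Creates the directory, raising FileExistsError if it already exists
        directory.mkdir(parents=True, exist_ok=False)
    with tarfile.open(tar_path) as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Pythons support extraction filters that block unsafe members
            tar.extractall(directory, filter="data")
        else:
            tar.extractall(directory)
    return directory


def decompress_s3_tar(s3_url: str, directory: Optional[Path] = None) -> Path:
    """Download the tar at s3_url to a temporary file, then extract it."""
    import boto3  # imported here so the helpers above work without boto3

    bucket, key = parse_s3_url(s3_url)
    with tempfile.TemporaryDirectory() as tmp:
        local_tar = Path(tmp) / "download.tar"
        boto3.client("s3").download_file(bucket, key, str(local_tar))
        return extract_tar(local_tar, directory)
```

Keeping the download and the extraction separate means most of the behaviour in the comments above (tar check, directory creation, error on existing directory) can be unit tested with a locally built tar file.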

Just create a bucket in bleed, stick a tar file in it, and develop against that.

For testing, s3 is a very universal resource, so it's highly likely there is some support, if not outright documentation, on how to test against it.
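
One option that needs no extra libraries is to hand the code a stand-in client in tests. The sketch below is illustrative only: `FakeS3Client` and `download_and_list` are hypothetical names, and the fake only mirrors the argument names of the boto3 s3 client's `download_file`. Mocking libraries such as moto are another route worth looking at.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


class FakeS3Client:
    """Hypothetical stand-in for a boto3 s3 client, backed by local files."""

    def __init__(self, objects: dict):
        # Maps (bucket, key) to a local file path acting as the stored object
        self.objects = objects

    def download_file(self, Bucket: str, Key: str, Filename: str) -> None:
        # Copies the local "object" to the requested download location
        shutil.copyfile(self.objects[(Bucket, Key)], Filename)


def download_and_list(client, bucket: str, key: str) -> list:
    """Hypothetical function under test: download a tar and list its members."""
    with tempfile.TemporaryDirectory() as tmp:
        local_tar = Path(tmp) / "download.tar"
        client.download_file(bucket, key, str(local_tar))
        with tarfile.open(local_tar) as tar:
            return tar.getnames()
```

Because the client is passed in, the same function can take a real `boto3.client("s3")` when developing against the bucket and the fake in unit tests.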

Acceptance Criteria