Optimise pulling data from Minio

Find a more efficient way to pull data from Minio. Currently the data is pulled with the Python urllib library (https://docs.python.org/3/library/urllib.html), which sends a request to a URL and downloads the entire CSV data file. This is not the most efficient approach.

Alternatives to look into:

- Use the S3 protocol with the SparkContext method textFile(). Due to the way Spark handles S3, this may prove troublesome to set up.
- Use the Python library for Minio: https://github.com/minio/minio-py . It is made specifically for interacting with Minio, so it may prove more efficient.
- Use the [AWS SDK](https://github.com/minio/minio/blob/master/AWS-SDK-GO.md)
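For the Spark option above, a rough sketch of what the setup might look like, assuming the Hadoop S3A connector (the `hadoop-aws` jar and its AWS SDK dependency) is on the Spark classpath. The helper names `s3a_conf` and `read_lines`, the app name, and all endpoint and credential values are hypothetical; only the `fs.s3a.*` keys are standard Hadoop settings. Path-style access is enabled because Minio typically serves buckets as URL paths rather than subdomains.

```python
def s3a_conf(endpoint, access_key, secret_key, secure=False):
    """Build Spark settings that point the S3A connector at a Minio server.

    The "spark.hadoop." prefix makes Spark copy each key into the Hadoop
    configuration used by textFile().
    """
    scheme = "https" if secure else "http"
    return {
        "spark.hadoop.fs.s3a.endpoint": f"{scheme}://{endpoint}",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Address buckets as http://host/bucket/key instead of bucket.host.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(secure).lower(),
    }


def read_lines(endpoint, access_key, secret_key, bucket, object_name):
    """Return an RDD of lines for one object (hypothetical helper)."""
    # Imported lazily so the config helper works without pyspark installed.
    from pyspark import SparkConf, SparkContext

    settings = s3a_conf(endpoint, access_key, secret_key)
    conf = SparkConf().setAppName("minio-read").setAll(list(settings.items()))
    sc = SparkContext.getOrCreate(conf)
    return sc.textFile(f"s3a://{bucket}/{object_name}")
```

With this approach Spark reads the object directly on the executors, so the driver no longer has to download the whole file first.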

Update: we decided to use the Minio API https://github.com/benchflow/data-transformers/issues/56
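A minimal sketch of how pulling a CSV through the Minio Python client (minio-py) could look, streaming the object in chunks instead of downloading it in one piece. This is not the code from the linked issue: the helper names `make_client`, `iter_lines` and `iter_csv_rows`, the chunk size, and the UTF-8 assumption are all illustrative. `Minio(...)`, `get_object()`, and the response methods `stream()`, `close()` and `release_conn()` are part of minio-py and urllib3.

```python
import codecs
import csv


def make_client(endpoint, access_key, secret_key, secure=False):
    """Create a minio-py client (hypothetical helper)."""
    # Imported lazily so the streaming helpers can be used with any
    # object that offers a compatible get_object() method.
    from minio import Minio

    return Minio(endpoint=endpoint, access_key=access_key,
                 secret_key=secret_key, secure=secure)


def iter_lines(client, bucket, object_name, chunk_size=64 * 1024):
    """Yield text lines from an object, reading chunk_size bytes at a time."""
    response = client.get_object(bucket_name=bucket, object_name=object_name)
    # Incremental decoding copes with multi-byte characters that are
    # split across two chunks. Assumes the file is UTF-8.
    decoder = codecs.getincrementaldecoder("utf-8")()
    pending = ""
    try:
        for chunk in response.stream(chunk_size):
            pending += decoder.decode(chunk)
            *lines, pending = pending.split("\n")
            for line in lines:
                yield line + "\n"
        pending += decoder.decode(b"", final=True)
        if pending:
            yield pending
    finally:
        # Return the connection to the pool even if the consumer stops early.
        response.close()
        response.release_conn()


def iter_csv_rows(client, bucket, object_name):
    """Parse the object as CSV, one row at a time."""
    return csv.reader(iter_lines(client, bucket, object_name))
```

If the whole file is needed on disk anyway, minio-py's `fget_object(bucket_name, object_name, file_path)` writes the object straight to a local file, which may be simpler.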
