alan-turing-institute / defoe

Code to analyse books and newspapers data using Apache Spark.
MIT License

Reduce number/overhead of Azure authentication calls #20

Open mikej888 opened 5 years ago

mikej888 commented 5 years ago

In defoe/spark_utils.py, open_stream creates and connects to a new instance of azure.storage.blob.BlobService for every blob it opens. This should be refactored so that an instance is created and connected only once for each Azure container in which the blobs reside.
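
One possible shape for this refactoring is to memoise the client on its connection details, so repeated requests for blobs in the same container reuse one instance. The following is a minimal sketch, not the defoe implementation: FakeBlobService, get_service and open_blob are hypothetical names, and the fake class stands in for the real Azure client so that the example runs without Azure credentials.

```python
from functools import lru_cache


class FakeBlobService:
    """Hypothetical stand-in for the Azure client; counts how often it is built."""

    instances = 0

    def __init__(self, account_name, account_key):
        # In real code this is where the costly authentication would happen.
        FakeBlobService.instances += 1
        self.account_name = account_name

    def get_blob_to_bytes(self, container, blob_name):
        # Return placeholder bytes instead of downloading anything.
        return f"{container}/{blob_name}".encode()


@lru_cache(maxsize=None)
def get_service(account_name, account_key, container):
    """Return one cached client per (account, key, container) combination."""
    return FakeBlobService(account_name, account_key)


def open_blob(account_name, account_key, container, blob_name):
    """Fetch a blob, reusing the cached client for its container."""
    service = get_service(account_name, account_key, container)
    return service.get_blob_to_bytes(container, blob_name)


if __name__ == "__main__":
    for name in ["a.xml", "b.xml", "c.xml"]:
        open_blob("account", "key", "books", name)
    # Three blobs in one container should have built only one client.
    print(FakeBlobService.instances)
```

Under Spark, a module-level cache like this would be held separately by each Python worker process, so the saving would be per worker rather than global. Creating the client once per partition inside mapPartitions would be an alternative with a similar effect.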