summary of my experiments with cloud storage using App Service
1) simply put the files directly on the App Service machine
app timing is equivalent to the laptop, about 16s
data can be manually pulled from the HPCC onto the app disk from a bash prompt using scp and one of our credentials (e.g. an ssh key), or pushed to other cloud storage and pulled in from there (see the sketch after this list)
standard App Service machines have a 250 GB disk - is this enough for the future, setting aside user-uploaded networks?
i.e. this option does not use cloud storage at all
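A minimal sketch of the manual pull for option 1, run from a bash prompt; the hostname, key file, and paths below are hypothetical placeholders, not our actual setup:

    # pull the support data from the HPCC onto the local app disk over ssh
    # (hpcc.example.edu, the key file and both paths are made-up placeholders)
    scp -i ~/.ssh/id_app_service -r \
        myuser@hpcc.example.edu:/mnt/research/project/data/ \
        /home/data/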
2) 'mount' Azure Files (the expensive storage) to the App Service and use a standard file path
initial tests show that with 'mounted' cloud-based storage, minimal changes to the code were needed (just a config variable for the data folder)
app is much slower: > 100s for small runs
copy from Azure storage to the HPC using the azcopy command, initiated on the HPCC (sketch below)
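A rough illustration of option 2's plumbing; the app name, resource group, config variable name, storage account, share, SAS token, and paths are all hypothetical placeholders:

    # point the app at the mounted share via a config variable (all names are placeholders)
    az webapp config appsettings set --name our-app --resource-group our-rg \
        --settings DATA_DIR=/mounts/appdata

    # run on the HPCC: copy the data folder from the Azure Files share onto HPCC disk
    azcopy copy \
        'https://ourstorageacct.file.core.windows.net/appdata/networks?<SAS-token>' \
        /mnt/research/project/appdata/ --recursive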
3) use blob storage and blob storage reading code
minimally tested for timing, but it is too slow to read in the data with every run (unless we accept > 5 minute run times)
it will only be viable if the support data is read into memory on app startup and re-used, based on this performance tip
move data from the HPC to blob storage using the azcopy command, initiated on the HPC (sketch below)
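A hedged sketch of the option 3 upload; the storage account, container, SAS token, and source path are hypothetical placeholders:

    # run on the HPC: push the support data up into blob storage
    azcopy copy /mnt/research/project/appdata/ \
        'https://ourstorageacct.blob.core.windows.net/appdata?<SAS-token>' \
        --recursive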
Conclusion: to use cloud storage cheaply means going to a 'slow' batch-job model, unless the "backend" data remains < 250 GB. If we allow users to upload their own networks, those should definitely be put into 'blob' storage. I will follow up with Chris.