kernelci / kernelci-api

KernelCI API - Database - Pub/Sub
GNU Lesser General Public License v2.1

Evaluate chosen storage solution #263

Open gctucker opened 1 year ago

gctucker commented 1 year ago

Once the discussion around the various storage solutions leads to a preferred candidate, or a shortlist, it needs to be evaluated.

gctucker commented 1 year ago

The current status quo is we have:

Basically, the Azure Files solution means better scalability, quotas, backups and less manual maintenance, while the SSH / Nginx solution means lower cost, a self-contained deployment and an easier manual setup. As the only API requirement for storage is that the files should be available via a public URL, we can change storage solution at virtually any time. So for now, the plan seems to be to use Azure Files for the Early Access deployment to evaluate the costs, features etc. (we might add a proxy and optimise if we want to keep using it...) while also keeping the default SSH / Nginx solution for developers to use with docker-compose and as a fallback if we hit issues with Azure Files (storage limits, costs, bandwidth).
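To illustrate the "public URL" requirement, here is a minimal sketch of a backend-agnostic check. The function name `is_http_url` and the example URLs are assumptions made up for illustration, not part of kernelci-api, and the check only looks at the form of the URL, not at whether it is actually reachable.

```python
from urllib.parse import urlparse


def is_http_url(url: str) -> bool:
    """Return True if url is an absolute HTTP or HTTPS URL with a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Hypothetical artifact locations: hosts and paths are invented for this sketch
candidates = [
    "https://example.file.core.windows.net/artifacts/bzImage",
    "http://storage.example.org/artifacts/bzImage",
    "ssh://storage.example.org/artifacts/bzImage",
    "/var/www/artifacts/bzImage",
]

# Only the first two would qualify, whichever backend serves them
accepted = [url for url in candidates if is_http_url(url)]
```

Under this assumption, the API would not need to know which storage backend produced a URL, which is what makes swapping backends cheap.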

nuclearcat commented 1 year ago

I suggest implementing a cloud-based, S3-compatible solution, as S3 is the de facto cloud object storage standard and, even though it was initially implemented by Amazon, it is these days considered vendor-independent. In Microsoft's own words:

has become the de facto standard interface for almost all storage providers. https://devblogs.microsoft.com/cse/2016/05/22/access-azure-blob-storage-from-your-apps-using-s3-api/

This way developers can spin up self-hosted S3 storage (MinIO, Ceph, Zenko, Riak S2, Triton, LeoFS, HyperStore) or use almost any cloud provider, as many provide compatibility layers or native S3 object storage.

Azure Files, by contrast, creates vendor dependence and excludes the option of an easy self-hosted solution.

gctucker commented 1 year ago

Having the option of using Azure Files doesn't exclude using S3-compatible storage solutions. Thanks for the input, I'll add a comment on the main issue about storage solutions https://github.com/kernelci/kernelci-api/issues/9.

gctucker commented 1 year ago

Azure Files, by contrast, creates vendor dependence and excludes the option of an easy self-hosted solution.

Yes, Azure Files is of course specific to Azure. But it doesn't exclude also using other storage solutions. We could have any kind of URL in the database; it doesn't matter if it points to Azure Files, some static VM or any other Cloud-based storage as long as it's HTTP(S). And we can have any upload method in the code as long as we have credentials installed for it. Azure Files is the quickest and easiest way to get started with Cloud storage using the resources we currently have available.
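As a rough sketch of what "any upload method in the code" could look like, the snippet below defines one common interface with interchangeable backends. The `Storage` and `LocalDirStorage` names, the method signature and the base URL are all hypothetical and are not taken from the kernelci-api code; the local-directory backend merely stands in for an SSH / Nginx style setup.

```python
import shutil
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class Storage(ABC):
    """Hypothetical interface: upload a file and return its public URL."""

    @abstractmethod
    def upload(self, local_path: str, dest_path: str) -> str:
        ...


class LocalDirStorage(Storage):
    """Stand-in backend: copies files into a directory that is assumed
    to be served by a web server at base_url."""

    def __init__(self, root_dir, base_url: str):
        self.root = Path(root_dir)
        self.base_url = base_url.rstrip("/")

    def upload(self, local_path: str, dest_path: str) -> str:
        target = self.root / dest_path
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(local_path, target)
        # Only this URL would need to be stored in the database
        return f"{self.base_url}/{dest_path}"


# Usage with a temporary directory and a made-up base URL
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "build.log"
    src.write_text("hello\n")
    storage = LocalDirStorage(Path(tmp) / "www", "http://localhost:8002/")
    url = storage.upload(str(src), "logs/build.log")
```

An Azure Files or S3 backend would be another subclass of the same interface with its own credentials, while the rest of the code would only ever see the returned URL.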

gctucker commented 1 year ago

OK, I've added https://github.com/kernelci/kernelci-api/issues/310 to cover the topic of adding support for S3 storage as well. I don't think it's a blocker, as we already have Azure Files and SSH / Nginx, which can both be used for Early Access. S3 is of course a type of storage that we ultimately need to support if possible, and this issue is not just about Early Access.

gctucker commented 8 months ago

Assigning to @nuclearcat