Open d70-t opened 3 years ago
I added the "big" label because, as I understand it, this use case requires the HALO database to be a primary database. Is this correct? The primary database question has already been raised in #50 and #18.
Yes, this use case requires that primary data and metadata are managed by the database.
I disagree on this point. The system that runs the analysis could well be in a completely different place than the data itself. The only requirement is that the computing system can reference data from the database and can access that data via a sufficiently fast network connection (which might be the internet). It might be beneficial if the data were provided in a form that allows for efficient subsetting.
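To make the subsetting point a bit more concrete, here is a minimal sketch in Python. It assumes a hypothetical layout in which a 1-D dataset is stored as equally sized chunks that can be fetched individually; `ChunkedLayout`, `chunks_for_slice` and `fraction_transferred` are illustrative names and not part of HALO-DB or any existing library. The idea is only that a remote compute system would need to transfer the chunks overlapping the requested range, not the whole file.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChunkedLayout:
    """Hypothetical layout: a 1-D array split into equally sized chunks."""
    n_values: int    # total number of values in the dataset
    chunk_size: int  # number of values stored per chunk


def chunks_for_slice(layout: ChunkedLayout, start: int, stop: int) -> List[int]:
    """Return the indices of the chunks needed to read values [start, stop)."""
    if not 0 <= start <= stop <= layout.n_values:
        raise ValueError("requested slice lies outside of the dataset")
    if start == stop:
        return []  # empty request, nothing to transfer
    first = start // layout.chunk_size
    last = (stop - 1) // layout.chunk_size  # chunk holding the last value
    return list(range(first, last + 1))


def fraction_transferred(layout: ChunkedLayout, start: int, stop: int) -> float:
    """Share of all chunks that must be fetched to serve the slice."""
    n_chunks_total = -(-layout.n_values // layout.chunk_size)  # ceiling division
    return len(chunks_for_slice(layout, start, stop)) / n_chunks_total


# Example: one million values stored in chunks of 10,000 values each.
layout = ChunkedLayout(n_values=1_000_000, chunk_size=10_000)
needed = chunks_for_slice(layout, 25_000, 45_000)
share = fraction_transferred(layout, 25_000, 45_000)
```

In this example only chunks 2, 3 and 4 out of 100 would have to cross the network. Real chunked formats work on multi-dimensional arrays, but the same reasoning applies per dimension.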
Okay, I get it now - thanks. So I'll remove the "big" label, because "reference data from the database" should be covered by #32 or #20 and "able to access data" by #48. Right?
Yes, this is probably at least not-so-big, and the datacenter which provides the online compute service will probably be a different hosting organization than the one where the HALO-DB is located. The big part for HALO-DB is to keep this use case in mind while covering the other cases you have mentioned.
As a user of large datasets with limited internet connectivity, I want to run my data analyses online ("in the cloud") without needing to download files to my computer. An example of this would be the CDS Toolbox.