Open joerg-halo opened 3 years ago
I think this statement could be highly problematic.
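This raises at least the question what exactly is called a "download" in the terms of the WLA. Then there is the question what is meant by "before" i.e. is that once or every time or once per month? And in consequence, how this interacts with the meaning of “publicly available” data ("The HALO-WLA clarified, that this excludes every form of mandatory registration."), with accessing subsets of datasets i.e. #51, and with script based access #48.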
This raises at least the question what exactly is called a "download" in the terms of the WLA.
I am not sure if I get this point. In my simple view, a download is everything when 'actual' data (not metadata) are transferred to a computer which is not part of the DB. Exceptions for visualization tools might be discussed, depending on the tool. A display of a quicklook might not require the acceptance of the data policy, more advanced features might require it.
Then there is the question what is meant by "before" i.e. is that once or every time or once per month?
"Every time" is meant. In the WLA I mentioned Aeronet as an example for this scheme: https://aeronet.gsfc.nasa.gov/
the meaning of “publicly available” data: "The HALO-WLA clarified, that this excludes every form of mandatory registration."
I don't get this obstacle. @d70-t, could you help me?
accessing subsets of datasets i.e. #51
The data policy could be displayed with the retrieval of each subset, couldn't it?
script based access #48
Yepp, I think that this is the main issue. However, I advocate that this shouldn't influence us, when dealing with the script based access. If this feature is about to become adequately implemented, the WLA has to discuss how to deal with this contradiction.
In my simple view, a download is everything when 'actual' data (not metadata) are transferred to a computer which is not part of the DB.
This raises the question of what a "computer which is not part of the DB" is. If we support storing datasets on machines which are not provided by DLR (or whoever runs the "core" of HALO-DB), are all of these computers then "part of the DB", or are they not?
-> probably we can't have any external data store if we want to enforce the rule.
"Every time" is meant.
This raises the question of what "a time" is. One could be pedantic and refer to single IP packets, but that might be too picky. However, one thing which we surely want to do is access subsets of the data (otherwise larger datasets are definitely out of reach). In that case, it is really common that a user, client application, etc. creates many (could easily be 100s or 1000s) requests to different parts of the same dataset in order to create a single figure. Each of those requests looks like the first request to the server / database and thus is per se not distinguishable from the second or any other request. If a message must be displayed "every time", then this would necessarily create a mess on the user's computer.
If we instead forced the user to bundle all requests into a single big one, the whole thing would be really complicated to use, and many (especially interactive) ways to work with the data would become impossible, as often one does not yet know which subsets of the data are needed before retrieving the first parts. (Think of downloading the time axis first to determine which data indices are within your study time, then requesting the rest only for the subsets in that time period.)
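To make the access pattern above a bit more concrete, here is a minimal Python sketch. It is not HALO-DB code: `StatelessStore`, `fetch_window`, the variable names `time` and `temperature`, and the chunk size are all made up for illustration, and the "server" is just an in-memory dictionary. It only shows how fetching the time axis first and then the matching subsets turns one figure into many requests, each of which would trigger the policy display under a strict "every time" reading.

```python
from dataclasses import dataclass


@dataclass
class StatelessStore:
    """Toy stand-in for a data server (hypothetical, not the HALO-DB API)."""
    variables: dict          # variable name -> list of values
    notices_shown: int = 0   # counts how often the policy would be displayed

    def get(self, name, start, stop):
        # The store keeps no per-user session, so every request looks
        # like a first request and would have to show the policy again.
        self.notices_shown += 1
        return self.variables[name][start:stop]


def fetch_window(store, t_min, t_max, chunk=100):
    """Fetch 'temperature' for t_min <= time <= t_max in chunked requests."""
    # Step 1: download the time axis to find out which indices are needed.
    time = store.get("time", 0, None)
    idx = [i for i, t in enumerate(time) if t_min <= t <= t_max]
    if not idx:
        return []
    lo, hi = idx[0], idx[-1] + 1
    # Step 2: request only the matching part of the data, chunk by chunk.
    values = []
    for start in range(lo, hi, chunk):
        values.extend(store.get("temperature", start, min(start + chunk, hi)))
    return values


if __name__ == "__main__":
    n = 10_000
    store = StatelessStore({
        "time": list(range(n)),
        "temperature": [0.01 * i for i in range(n)],
    })
    values = fetch_window(store, 2000, 6999)
    print(len(values), "values,", store.notices_shown, "policy displays")
```

With these made-up numbers, a single time window already needs 51 requests (one for the time axis plus 50 chunks), so a strict "every time" rule would mean 51 policy displays for one plot.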
"The HALO-WLA clarified, that this excludes every form of mandatory registration."
If we do not want to display a message "every time" (see above why this is a bad idea), but instead only sometimes, then we basically have two options:
The data policy could be displayed with the retrieval of each subset, couldn't it?
As mentioned above, it will become very common to access 100s or 1000s of subsets per day per user. I don't want to imagine how much time I'd be spending clicking pop-up windows away.
script based access #48
Yepp, I think that this is the main issue. However, I advocate that this shouldn't influence us, when dealing with the script based access. If this feature is about to become adequately implemented, the WLA has to discuss how to deal with this contradiction.
See above. Also, we have plenty of use cases which will likely require some form of script based access, e.g.: #56, #53 (at least the reverse), #51, #50, #48, #47, #43 ?, #37, #20, #17?. If we end up with a design which decouples the data storage from the user interface (which is highly desirable), then even our own user interface will be "script based access".
In the WLA I mentioned Aeronet as an example for this scheme: https://aeronet.gsfc.nasa.gov/
They are not enforcing the display. A user has several possibilities to obtain the data:
-> The Aeronet site has a user interface which makes it easy for a user to stumble across the data policy, but it does not enforce it for all downloads, because this would make very useful use cases impossible.
I think this all boils down to an analysis of costs versus benefits of the enforcement.
Alright, thank you @d70-t for the comprehensive answer. My conclusion is the following:
When data are downloaded via the web interface, the data policy has to be displayed. This is the actual use case which the WLA had in mind when making the respective decision.
I understand now that a number of use cases would be inhibited by a strong interpretation of the WLA decision. However, this should not influence our ways of thinking about solutions for the HALO-DB. Upcoming contradictions can be discussed within the WLA.
The WLA decided that before the start of a download, users have to be made aware of the HALO Data Policy and the Data Protocol of the respective mission. This should happen via a popup window or popup website.