MKonopkoELIXIR closed 2 years ago
In your mind does retrieval also include access?
Not access in the "data access" sense, @GiselleMarie
This could also be the wrong word. https://www.techopedia.com/definition/30140/data-retrieval Maybe Data Extraction is better? https://www.techopedia.com/definition/25328/data-extraction
Emailed Dylan Spalding
Per @malloryfreeberg
Yes, the main category of interfaces is user <-> resource (e.g. to support data submission, data discovery, and data access and/or retrieval). For FEGA, we also have the notion of interfaces between nodes, or resource <-> resource. This is to exchange non-personal metadata to support queries across the FEGA network. Depending on how the resource is established, there also might be within-resource interfaces between institutions in the same node (e.g. between the institutions that make up the German Human Genome-Phenome Archive, GHGA). I'm not sure how important it is to capture these between/within-resource interfaces in your model.
Need to determine whether these FEGA-related needs align 1:1 with ELIXIR Node needs. Also, if the needs are varied, a single indicator may be inappropriate.
Per Peter Maccallum: Distinction between read-only interfaces and data interaction/upload interfaces:
- Read-only interface: query
- Admin read/write interface: submission
Suggests splitting this into two as above.
Started to make the change and realised there are three options.
Sent a message to Peter and sent the following email to Dylan (CC Tommi):
Hi Dylan,
I’ve made a bit of progress here by speaking to a variety of people, but I have again hit a bit of a wall and everyone says you’re the guy with the knowledge.
Peter Maccallum did explain what “interfaces” would cover as a concept and pointed out that I need to recognise the difference between read-only and read/write interfaces. My questions for you are thus:
- Does “Storage and Interfaces” cover all APIs?
- If not, do the APIs for each of the core functionalities fall under that functionality (i.e. data reception APIs, data discovery APIs, data access APIs)? If they all do fall under their own functionalities, what APIs/interfaces are specifically tied to storage?
- Or is it that this section deals only with interfaces for the data once it is in storage, so data reception (aka read/write APIs) go with data reception, but discovery and access APIs (aka read-only) fit under storage and interfaces?
I’m copying in Tommi since he wrote the scoping paper that this all hinges on. I know he’s away for a bit, but guidance whenever someone can point me in the right direction would be very helpful.
Email from Dylan:
This slide: https://docs.google.com/presentation/d/1rlwu5wRjZqkvkEGAxEDTjT3mvV-0uzRHCiun1qOkeQo/edit may help.
At a high level, storage and interfaces deals with the 'FEGA'-like functionality, i.e. putting the files onto secure storage, tracking their location, versioning them, deleting them if required, and maintaining the metadata about the files (how they were produced, which individual's or individuals' information is included in the files, data access requirements for the files, etc.). In effect this functionality underpins all other functionalities: data discovery uses, for example, Beacon or the data portal to determine which data is stored for what data use; REMS provides data access and management tools; and data reception can be both the movement of data if required (e.g. htsget) and the curation processes (checking that data conforms to the data model, for example), both to (and, in terms of htsget, from) the 'FEGA'-like instance providing the storage and interfaces.
So, in answer to your questions: a) no; b) yes, storage and interfaces have many internal APIs, but also PUT a file or metadata about a file, UPDATE a file or metadata about a file, GET a file or metadata about a file, DELETE a file and associated metadata about a file; c) the other functionalities have APIs that sit on top of these to provide additional functionality. For example, in the case of data access and management, storage and interfaces know the file, access restrictions, etc., but not who has access to the file; this is held by the REMS instance(s). In the case of Beacon, Beacon can give some metadata about the file itself, but (possibly) the phenotypic data may be held within the Beacon instance in a format for querying, while the file and associated metadata (including phenotypic data if necessary) are held via the storage and interfaces functionality.
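To make the layering in Dylan's answer easier to picture, here is a minimal Python sketch. It is only an illustration under stated assumptions: the class names `StorageAndInterfaces` and `AccessManager`, their methods, and the in-memory dictionaries are all hypothetical and are not taken from FEGA, REMS, or any real API. It shows the four basic operations (PUT, UPDATE, GET, DELETE) on a file and its metadata, with a separate REMS-like layer on top that holds who has access.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    """A file plus its metadata, as held by the storage layer."""
    content: bytes
    metadata: dict
    version: int = 1


class StorageAndInterfaces:
    """Hypothetical in-memory stand-in for the storage and interfaces layer."""

    def __init__(self):
        self._files = {}

    def put(self, file_id, content, metadata):
        # PUT: store a new file together with its metadata.
        if file_id in self._files:
            raise KeyError(f"{file_id} already exists; use update()")
        self._files[file_id] = StoredFile(content, dict(metadata))

    def update(self, file_id, content=None, metadata=None):
        # UPDATE: change the file and/or its metadata and bump the version.
        record = self._files[file_id]
        if content is not None:
            record.content = content
        if metadata is not None:
            record.metadata.update(metadata)
        record.version += 1
        return record.version

    def get(self, file_id):
        # GET: return the file and its metadata (KeyError if unknown).
        return self._files[file_id]

    def delete(self, file_id):
        # DELETE: remove the file and its associated metadata.
        del self._files[file_id]


class AccessManager:
    """Hypothetical REMS-like layer: it, not storage, knows who has access."""

    def __init__(self, storage):
        self._storage = storage
        self._grants = set()

    def grant(self, user, file_id):
        self._storage.get(file_id)  # raises KeyError if the file is unknown
        self._grants.add((user, file_id))

    def fetch(self, user, file_id):
        if (user, file_id) not in self._grants:
            raise PermissionError(f"{user} has no access to {file_id}")
        return self._storage.get(file_id).content
```

In this sketch the storage layer records access restrictions only as metadata (for example `{"access": "controlled"}`), while the decision about which user may fetch a file sits in the layer above it, mirroring the split described in the email.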
Need a slightly better understanding of this before I make any changes. Reached out to Peter for assistance.
Had a conversation with Dylan and he outlined the topics for the APIs as follows (see below for required changes):
Made updates per above.
When "interfaces" is referred to in Storage and Interfaces, is that submission and retrieval? @GiselleMarie