ga4gh-discovery / ga4gh-case-discovery

A framework for searching genomic data sharing services
Apache License 2.0

Should we use new components or existing composite formats (or both)? #24

Closed: Relequestual closed this issue 6 years ago

Relequestual commented 6 years ago

Initially, Harindra's vision for the search API was to use existing "record formats" (like MME or Phenopackets).

During the Toronto meeting this year, I explained my Search API proposal, where the component-based system would be used instead of these record formats, as otherwise you create a huge interoperability nightmare.

When I explained this, Harindra, Marc, and others in the room for the Discovery Workstream Search API sessions were in agreement. I was asked to document my proposal to make it clear, which I've done.

It wasn't explicit that the intent was to no longer use existing formats, but it was implicit in their exclusion. The principle was to break apart any existing formats into components, allowing for easy transformation for those that have existing code which compiles a JSON result.

@harindra-a If you're in agreement, and we are both on the same page on this, I'll make a PR to remove records and associated references in the documents, remove the mention of existing composite formats as components, and close a number of associated issues. In the meantime, I'll flag those issues as blocked by this one.

harindra-a commented 6 years ago

hey Ben, yeah, this is an important point, let's discuss at next SearchAPI meeting and decide/vote.

Personally, I think having component-based return types is fine, and not orthogonal to the design now. For example, the components you speak about could simply be their own ontology they return and could exist alongside, say, MME or PhenoPacket ontologies.

The reason I wanted to start with MME was that they are our first/strongest use case and I wanted the transition to involve minimum changes. Long term, I believe the movement in GA4GH is for everyone to adopt PhenoPackets as a type.

That all said, your component-based system should work too, I believe. I will make sure the implementors are in the next meeting, so we can decide this. But sure, at the very least we can use component return types as unique return type ontologies.

Relequestual commented 6 years ago

I understand that minimal changes are desirable, and it may seem that simply using the existing MME format is a viable approach. However, the MME format was built with a different use case in mind, and if followed strictly, will not work within a search setting.

If we use parts of the MME format, and extract them into components, we still fulfil the requirement of minimal changes, and yet have a format that is purpose built for the task, with the future in mind.
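To illustrate what extracting parts of an existing format into components could look like, here is a minimal Python sketch. It assumes a simplified MME-style patient object with `features`, `genomicFeatures`, and `contact` keys. The function name `split_into_components` and the output component names (`phenotypes`, `genes`, `contact`) are hypothetical, chosen only for illustration, and are not defined by the proposal.

```python
def split_into_components(mme_patient):
    """Split a simplified MME-style patient dict into separate components.

    Assumption: the component names used as output keys are hypothetical
    and are not taken from the Search API proposal.
    """
    components = {}

    # Phenotype terms become their own component.
    features = mme_patient.get("features", [])
    if features:
        components["phenotypes"] = [{"id": f["id"]} for f in features]

    # Gene identifiers are pulled out of the nested genomicFeatures entries.
    genes = [
        gf["gene"]["id"]
        for gf in mme_patient.get("genomicFeatures", [])
        if "gene" in gf
    ]
    if genes:
        components["genes"] = [{"id": g} for g in genes]

    # Contact details are passed through unchanged.
    if "contact" in mme_patient:
        components["contact"] = mme_patient["contact"]

    return components


# Example input: a made-up patient record in a simplified MME-like shape.
patient = {
    "id": "example-1",
    "contact": {"name": "Example Lab", "href": "mailto:lab@example.org"},
    "features": [{"id": "HP:0000252"}],
    "genomicFeatures": [{"gene": {"id": "BRCA2"}}],
}

components = split_into_components(patient)
print(components)
```

Existing code that already compiles a composite JSON result would only need a thin mapping step like this, rather than a rewrite.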

I'm really concerned that if we include these existing formats as is, without breaking them into their constituent components, none of this will be able to move forward in a meaningful way. This was the whole point of my proposal, and of the long debate during the Toronto meeting. It was pretty clear, to me, that others agreed with this point of view.

Did you hear something different?

Relequestual commented 6 years ago

I tried to outline this clearly in my proposal, that it is exclusive of using those formats as is. Have you had time to read it in full?

harindra-a commented 6 years ago

tbh, I am not really opposed to your component based return types. I think most of DWS stakeholders and all the first implementors who have deep experience in this space have read your proposal at this point, so please bring it up at next week's SearchAPI meeting for discussion; all/most stakeholders should be there and it can be decided.

Relequestual commented 6 years ago

That's fine, but do you feel you understand the points I'm making? Can you follow the logic? If not, you may not be the only one, and I may be missing something from my explanation. I want to avoid any ambiguities.

I'm not asking these questions rhetorically or to be difficult; I'm asking them so we avoid misunderstandings. Communication is hard, and I am not the best at it.

harindra-a commented 6 years ago

personally, I think what you suggest is reasonable. At the meeting, you should be able to get direct feedback/thoughts from everyone.

Relequestual commented 6 years ago

OK, great. For the record, I don't expect your endorsement to be a rubber stamp. As you said, the group need to be informed by discussion and make a call.

@rishidev Can you make a note to have a clear action on this in the minutes for the next meeting please? =]

Relequestual commented 6 years ago

I modified the title to better reflect the question.

I had missed something in my proposal that requires the use of a records key at the root level of the response.

See the response detail section

Each record is an object, which will contain a number of components, and possibly also metadata about the record. As such, it makes sense for the objects of the records array to have keys of components and meta.
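As a rough illustration of that shape, the Python sketch below builds a response with a root-level `records` array, where each record has a `components` key and, optionally, a `meta` key. Only `records`, `components`, and `meta` come from the discussion above. The helper names, the component names (`phenotype`, `gene`), and the `source` field inside `meta` are hypothetical placeholders, not part of any agreed schema.

```python
import json


def make_record(components, meta=None):
    """Build one record: an object with `components` and optional `meta`."""
    record = {"components": components}
    if meta is not None:
        record["meta"] = meta
    return record


def make_response(records):
    """Wrap records in a response with a root-level `records` array."""
    return {"records": list(records)}


# Hypothetical component names and meta fields, for illustration only.
response = make_response([
    make_record(
        components={
            "phenotype": {"id": "HP:0000252"},
            "gene": {"id": "BRCA2"},
        },
        meta={"source": "example-node"},
    ),
    # A record without metadata simply omits the `meta` key.
    make_record(components={"gene": {"id": "TP53"}}),
])

print(json.dumps(response, indent=2))
```

Whether `meta` should be optional or always present is a choice for the schemas to settle; the sketch treats it as optional because the text says records "possibly" carry metadata.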

I need to hurry up and make some schemas... it should make all of this a LOT clearer.

Relequestual commented 6 years ago

Consensus on call today: Colin and I agree that specifying components only makes the API structure more flexible.