plantbreeding / BrAPI

Repository for version control of the BrAPI specifications
https://brapi.org
MIT License

distinct marker types #103

Open GuilhemSempere opened 7 years ago

GuilhemSempere commented 7 years ago

I have a student currently working on an interface that will allow browsing BrAPI genotyping data, and we would like to allow filtering by marker type. With the current specs, we see no other option than fetching the entire (possibly huge) list of markers and building the distinct list on the client side. It would therefore be far more convenient to be able to retrieve the list of marker types in a given dataset with a single request.
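To illustrate the client-side workaround described above, here is a minimal sketch that pages through a marker listing and collects the distinct types. The `fetch_page` callable is hypothetical (it stands in for whatever HTTP call returns one page of markers), and the response layout (`metadata.pagination.totalPages`, `result.data`, and a `type` field on each marker) is an assumption about the server's BrAPI-style responses, not something confirmed in this thread.

```python
from typing import Callable, Dict, Iterator, Set

# Hypothetical page fetcher: takes a zero-based page index and returns one
# BrAPI-style response as a dict. The field names used below are assumptions.
PageFetcher = Callable[[int], Dict]


def iter_markers(fetch_page: PageFetcher) -> Iterator[Dict]:
    """Yield every marker by requesting pages until totalPages is reached."""
    page = 0
    while True:
        response = fetch_page(page)
        yield from response["result"]["data"]
        page += 1
        if page >= response["metadata"]["pagination"]["totalPages"]:
            break


def distinct_marker_types(fetch_page: PageFetcher) -> Set[str]:
    """Build the distinct set of marker types on the client side."""
    return {m["type"] for m in iter_markers(fetch_page) if m.get("type")}
```

Every marker record has to cross the network just to produce a handful of type names, which is the cost a dedicated call would avoid.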

lukasmueller commented 7 years ago

Wouldn't all markers in a markerprofile mostly be the same type (GBS, etc.)? Maybe I'm not sure what you mean by marker type?

GuilhemSempere commented 7 years ago

As far as I know, when you run variant calling on sequencing data, you get different kinds of variants: at least SNPs and INDELs, and sometimes CNVs or structural variants (often marked as MIXED).

GuilhemSempere commented 6 years ago

May I suggest something along these lines? GET /brapi/v1/marker-types-search{?markerDbIds}{?name}{?matchMethod}{?pageSize}{?page}
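As a rough sketch of how a client might build a request for the call proposed above: the endpoint is only a suggestion from this thread, not part of the BrAPI specification, and the base URL, the comma-separated encoding of `markerDbIds`, and the helper name are all assumptions made for illustration.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def marker_types_search_url(
    base_url: str,
    marker_db_ids: Optional[Sequence[str]] = None,
    name: Optional[str] = None,
    match_method: Optional[str] = None,
    page_size: Optional[int] = None,
    page: Optional[int] = None,
) -> str:
    """Build a URL for the *proposed* marker-types-search call (hypothetical)."""
    params = []
    if marker_db_ids:
        # Assumption: several IDs are sent as one comma-separated value.
        params.append(("markerDbIds", ",".join(marker_db_ids)))
    if name is not None:
        params.append(("name", name))
    if match_method is not None:
        params.append(("matchMethod", match_method))
    if page_size is not None:
        params.append(("pageSize", page_size))
    if page is not None:
        params.append(("page", page))
    endpoint = base_url.rstrip("/") + "/brapi/v1/marker-types-search"
    query = urlencode(params, safe=",")  # keep commas readable in the ID list
    return f"{endpoint}?{query}" if query else endpoint
```

For example, `marker_types_search_url("https://example.org", marker_db_ids=["m1", "m2"], page_size=100, page=0)` yields a URL whose query string carries `markerDbIds`, `pageSize`, and `page`.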

BrapiCoordinatorSelby commented 6 years ago

Would it be enough to define a fixed set of marker types? Or is it important for each system to be able to define its own marker types?

Just from this thread: GBS, SNP, INDEL, CNV, MIXED

Any more?

GuilhemSempere commented 6 years ago

From my point of view, defining a fixed set of marker types is feasible but is a separate matter. The need I was trying to express was for a way of knowing which marker types are actually present in a given dataset.

BrapiCoordinatorSelby commented 6 years ago

OK, I think I see what you are trying to do. But how do you want to reference the dataset to get marker types for? Your previous comment suggests that you want to provide a list of markerDbIds and get back the distinct list of marker types. But for a large list of markers, this seems only slightly better than doing the filtering by type on the client side, right?

Someone from the GOBii team has suggested that we need the concept of a Marker Group so that we can reference a group of markers with a single ID. Would this make more sense for your use case? Something like: GET /markertypes{?markerGroupDbId}

GuilhemSempere commented 6 years ago

My ultimate goal is to make Beegmac able to query the server in a single request, saying for example "I want all markers of type SNP or INDEL in chromosomes 1 and 2". But for that, it needs to know which marker types are actually represented.

Sorry, my suggestion on 5 Feb wasn't very accurate. In my own case, the only filtering parameter I would need to pass would be a study ID, assuming the database is able to tell which marker types are present in a given genotyping study (which I think is a pretty common situation). But I would appreciate any comments from other people working with genotyping data so we can think of a generic enough solution.
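To make the study-based lookup concrete, here is a minimal sketch of what a server could do if its database links studies to markers. The schema (`marker` and `study_marker` tables and their columns) is entirely hypothetical and chosen only for illustration; it is not defined by BrAPI.

```python
import sqlite3
from typing import List


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical study/marker schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE marker (marker_db_id TEXT PRIMARY KEY, type TEXT NOT NULL);
        CREATE TABLE study_marker (
            study_db_id  TEXT NOT NULL,
            marker_db_id TEXT NOT NULL REFERENCES marker(marker_db_id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO marker VALUES (?, ?)",
        [("m1", "SNP"), ("m2", "INDEL"), ("m3", "SNP"), ("m4", "CNV")],
    )
    conn.executemany(
        "INSERT INTO study_marker VALUES (?, ?)",
        [("s1", "m1"), ("s1", "m2"), ("s1", "m3"), ("s2", "m4")],
    )
    return conn


def marker_types_in_study(conn: sqlite3.Connection, study_db_id: str) -> List[str]:
    """Return the distinct marker types used in one study, sorted by name."""
    rows = conn.execute(
        """
        SELECT DISTINCT m.type
        FROM marker AS m
        JOIN study_marker AS sm ON sm.marker_db_id = m.marker_db_id
        WHERE sm.study_db_id = ?
        ORDER BY m.type
        """,
        (study_db_id,),
    )
    return [row[0] for row in rows]
```

With such a link in place, the server answers with a few strings instead of the client downloading every marker in the study.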

Regarding marker groups, it all depends on how they can be defined. If we find a way of getting a markerGroupDbId that refers to all markers used in a study, then the call you suggest could indeed solve the problem, although in my view this creates an indirection that needs to be justified.

BrapiCoordinatorSelby commented 6 years ago

OK, thanks for the clarification. I will talk to some other teams (GOBii, Germinate, etc.) and see what I can come up with. Study ID makes sense in your case, but it makes me a little nervous because there is no direct link between study and marker defined in BrAPI right now. I think there should be, but it hasn't been explicitly defined yet.