plantbreeding / BrAPI

Repository for version control of the BrAPI specifications
https://brapi.org
MIT License

Potential New API Call For Querying Direct Descendants (Progeny) #151

Closed dauglyon closed 6 years ago

dauglyon commented 6 years ago

Here is a short idea for a new call which returns the direct descendants of a germplasm object.

Request: GET https://{server}/brapi/{version}/germplasm/{germplasmDbId}/progeny

Response: Content-Type:application/json

{ 
    "metadata" : {
        "pagination": {
            "pageSize":0, 
            "currentPage":0, 
            "totalCount":0, 
            "totalPages":0 
        },
        "status": [],
        "datafiles": []
    },
    "result" : {
        "germplasmDbId": "382",
        "defaultDisplayName": "Pahang",
        "femaleParentOf" : ["403", "406", "407", "450"],
        "maleParentOf" : ["402", "404", "405", "408"]
    }
}

Note that female and male parentage is returned separately. In the case of selfing one could add a third list, but it seems to make more sense to simply add that ID to both lists independently.
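
For illustration, here is a minimal sketch of how a client might read this proposed response. It assumes the field names shown above and the "add the ID to both lists" convention for selfing. The helper name `classify_progeny` and the sample data are hypothetical and not part of any spec.

```python
def classify_progeny(result):
    """Split progeny IDs by the role the queried germplasm played as parent.

    Assumes the proposed response shape: two lists of germplasmDbIds,
    with selfed progeny appearing in both lists.
    """
    female = set(result.get("femaleParentOf", []))
    male = set(result.get("maleParentOf", []))
    selfed = female & male  # an ID present in both lists is treated as a selfing
    return {
        "female": sorted(female - selfed),
        "male": sorted(male - selfed),
        "self": sorted(selfed),
    }


# Sample "result" object; "405" is placed in both lists to show a selfing
# (illustrative data, not taken from a real server).
example_result = {
    "germplasmDbId": "382",
    "defaultDisplayName": "Pahang",
    "femaleParentOf": ["403", "405"],
    "maleParentOf": ["402", "405"],
}

roles = classify_progeny(example_result)
```

With this convention a client has to intersect the two lists itself to detect selfed progeny, which is the main cost of not having a third list.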

BrapiCoordinatorSelby commented 6 years ago

Can I suggest a slightly different structure:

...
    "result" : {
        "germplasmDbId": "382",
        "defaultDisplayName": "Pahang",
        "progeny" : [{
            "progenyGermplasmDbId": "403",
            "parentType": "FEMALE"
        }, { 
            "progenyGermplasmDbId": "402",
            "parentType": "MALE"
        }, { 
            "progenyGermplasmDbId": "405",
            "parentType": "SELF"
        }]
    }

This makes the parent/child relationship a little more explicit, though a little more verbose. You could take it a step further and have progeny contain a list of pedigree objects, but I'm not sure that level of detail is necessary. Also, as a single list, this could be made pageable if many progeny are expected (swap out progeny for data).
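
A rough sketch of what paging the single list could look like, assuming the list is returned under "data" and the pagination fields from the metadata block in the original proposal. The function name `paginate_progeny`, the zero-based page index, and the default page size are assumptions for illustration only.

```python
import math


def paginate_progeny(germplasm_db_id, progeny, page=0, page_size=2):
    """Return one page of progeny plus pagination metadata.

    `progeny` is the full list of entries shaped like
    {"progenyGermplasmDbId": ..., "parentType": ...}.
    Pages are assumed to be zero-indexed.
    """
    total_count = len(progeny)
    total_pages = math.ceil(total_count / page_size) if page_size > 0 else 0
    start = page * page_size
    return {
        "metadata": {
            "pagination": {
                "pageSize": page_size,
                "currentPage": page,
                "totalCount": total_count,
                "totalPages": total_pages,
            },
            "status": [],
            "datafiles": [],
        },
        "result": {
            "germplasmDbId": germplasm_db_id,
            "data": progeny[start:start + page_size],
        },
    }


# The three entries from the example structure above.
all_progeny = [
    {"progenyGermplasmDbId": "403", "parentType": "FEMALE"},
    {"progenyGermplasmDbId": "402", "parentType": "MALE"},
    {"progenyGermplasmDbId": "405", "parentType": "SELF"},
]

second_page = paginate_progeny("382", all_progeny, page=1, page_size=2)
```

Paging like this would not work cleanly with two separate parent lists, since each list would need its own page counters.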

dauglyon commented 6 years ago

@BrapiCoordinatorSelby I think that makes sense! It also makes the list easier to iterate through.

dauglyon commented 6 years ago

Here is a spec for the call (I'm going to implement it this way for now in sgn). I wasn't sure what format to use, so I just copied the germplasm pedigree format. Brapi-progeny.md.txt

cpommier commented 6 years ago

We must decide whether progeny and pedigree are integrated in germplasm/gDbId/mcpd or provided as a separate call. I don't like multiple calls for a simple card, but I also don't like having too many ways of getting the same data.

BrapiCoordinatorSelby commented 6 years ago

@cpommier For progeny in particular, I would like to keep it a separate call. The list of progeny could grow large, which would slow every germplasm call if it were included. The pedigree object is a fixed size (ignoring siblings) and could be included in the germplasm object, but the question becomes this: for the majority of use cases involving germplasm, does the user need complete pedigree information or just a subset of it? Unfortunately, with a diverse community like BrAPI, you'll probably get different answers depending on who you ask lol. But please open a new issue to add more pedigree data to the main germplasm object if you feel it would significantly improve performance of your use cases.

More generally, I think of the pedigree/progeny calls as tools for navigating the pedigree tree. As navigational tools, they should be optimized for minimal information and fast responses. So I think keeping a separate, lightweight pedigree call is important to achieve that.
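
To make the navigation idea concrete, here is a hedged sketch of a client walking down the tree one lightweight call per node. `fetch_progeny` is a hypothetical caller-supplied function standing in for a request to the progeny call. The in-memory tree and any IDs not mentioned earlier in this thread are made up for the example.

```python
from collections import deque


def walk_descendants(root_id, fetch_progeny, max_depth=3):
    """Breadth-first walk down the pedigree tree from `root_id`.

    `fetch_progeny(germplasm_db_id)` must return a list of progeny
    germplasmDbIds for that node (for example, one progeny call per node).
    Returns a dict mapping each descendant ID to its depth below the root.
    """
    depths = {}
    queue = deque([(root_id, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand nodes at the depth limit
        for child in fetch_progeny(current):
            # Skip IDs already seen so repeated ancestry does not loop.
            if child not in depths and child != root_id:
                depths[child] = depth + 1
                queue.append((child, depth + 1))
    return depths


# In-memory stand-in for a server (illustrative data only).
fake_tree = {"382": ["402", "403"], "403": ["500"], "500": ["600"]}
descendants = walk_descendants(
    "382", lambda gid: fake_tree.get(gid, []), max_depth=2
)
```

Each step needs only the child IDs, which is why a small, fast response matters more here than a complete germplasm record.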

Multiple ways to get the same data isn't such a bad thing. With the diversity of use cases we have in the BrAPI community, it is inevitable that we will need to provide options which have different optimizations for call speed, number of calls, and data density (amount of data returned).