Closed ReneRanzinger closed 1 year ago
Proposed strategy
The details pages do not include data from server-side paginated sections. Instead, the details JSON has an "additional_data" property, an array similar to the "tool_support" array on glycan details, that lists every server-side paginated section with its number of records. The frontend loads the details JSON and, based on this array, triggers webservice calls to load the first "page" of each server-side paginated data table. If the number in the "additional_data" array for a data type is 0, no webservice call is triggered.
There could be (a) one webservice for each server-side paginated data type or (b) one general webservice that takes additional parameters. Although it means more work and more webservices, my preference is (a). It's easier to document (for external users), and the JSON schema for the response can be specific to the data type, which allows better (automated) error checking on the response. It also makes sorting less complex, since you can only sort by properties present in the current table.
Assuming (a): All webservices need the ID for the details page (GlyTouCan ID, UniprotAcc), the offset, limit, sort by and sort direction. The response should include the position metric (total number, offset, limit) and the data records.
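The loading logic described above could look roughly like the sketch below. It is only an illustration: the shape of an "additional_data" entry (hypothetical keys "id" and "count") and the `fetch_page` callback are assumptions, not part of any agreed API.

```python
from typing import Callable, Dict


def load_first_pages(
    details: dict,
    fetch_page: Callable[[str, int, int], dict],
    limit: int = 20,
) -> Dict[str, dict]:
    """Fetch the first page of every non-empty server-side paginated section."""
    pages = {}
    for entry in details.get("additional_data", []):
        # Assumed entry shape: {"id": <section name>, "count": <number of records>}
        if entry["count"] == 0:
            continue  # empty section: do not trigger a webservice call
        # Offset 1 is assumed to address the first record of the section.
        pages[entry["id"]] = fetch_page(entry["id"], 1, limit)
    return pages


if __name__ == "__main__":
    def fake_fetch(section: str, offset: int, limit: int) -> dict:
        # Stand-in for the real webservice call.
        return {"section": section, "offset": offset, "limit": limit}

    details = {"additional_data": [
        {"id": "phosphorylation", "count": 1500},
        {"id": "mutagenesis", "count": 0},
    ]}
    print(load_first_pages(details, fake_fetch))
```

Passing the fetch function in as a parameter keeps the sketch independent of whether option (a) or (b) is chosen for the webservices.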
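To make the request and response contract concrete, here is a minimal in-memory sketch of one such paginated service. The response key names ("pagination", "results") and the 1-based offset are assumptions for illustration; the thread only fixes which pieces of information must be present.

```python
from typing import List, Optional


def paginate(
    records: List[dict],
    offset: int = 1,
    limit: int = 20,
    sort_by: Optional[str] = None,
    descending: bool = False,
) -> dict:
    """Return one page of records plus the position metric."""
    if offset < 1 or limit < 1:
        raise ValueError("offset and limit must be >= 1")
    if sort_by is not None:
        # Sort a copy so the caller's list is left untouched.
        records = sorted(records, key=lambda r: r[sort_by], reverse=descending)
    start = offset - 1  # assumed 1-based offset
    return {
        "pagination": {"total": len(records), "offset": offset, "limit": limit},
        "results": records[start:start + limit],
    }


if __name__ == "__main__":
    rows = [{"site": 3}, {"site": 1}, {"site": 2}]
    print(paginate(rows, offset=2, limit=1, sort_by="site"))
```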
Open questions
@rykahsay and @sujeetvkulkarni please review and let's finalize this tomorrow.
Glycosylation card pagination - pagination needs to be managed by the server for each tab.
Card level summary - Glycosylation Summary: 5 site(s) total, 55 N-linked annotation(s) at 4 site(s), 3 O-linked annotation(s) at 1 site(s)
Tab level summary - Summary: 5 site(s) total, 14 N-linked glycan(s) at 4 site(s), 2 O-linked glycan(s) at 1 site(s)
Both card and tab level summary information need to come from the server.
In my opinion, initial data (e.g. max 20 entries per table) for all the paginated tables should come from the details API, and further webservice calls should only be triggered when the user clicks on a different page. This would reduce API calls and page load time.
Conclusion of the developer meeting on 4/12/2023
Details web service changes
When requesting the JSON from a details web service, the frontend will send a list of data properties that should be server-side paginated (@sujeetvkulkarni will provide an example of how this looks). The details JSON response has a "section_stats" property that is an array similar to the "tool_support" array on glycan details. It lists all the server-side paginated sections with the number of records. Some sections, such as the glycosylation section, will require a nested object to account for the total summary
Glycosylation Summary: 5 site(s) total, 55 N-linked annotation(s) at 4 site(s), 3 O-linked annotation(s) at 1 site(s)
and the summary shown on each of the tabs
Summary: 5 site(s) total, 14 N-linked glycan(s) at 4 site(s), 2 O-linked glycan(s) at 1 site(s)
@sujeetvkulkarni please provide an example of how the glycosylation section of the "section_stats" should look.
By default, all tabular sections are server-side paginated, and no data from these sections is in the JSON except for the "section_stats". If the API call contains a list of sections that should be paginated (wishlist), two things change:
Pagination webservice
We will try to implement a general pagination webservice for all types of data. The downside is that the JSON schema will have to be very general to cover all the different types of tabular data, which will make automated error checking of the response based on the schema very hard. Another issue is that sorting will be more complex, since the sorting options depend on the data type.
Input to the webservice:
The response should include the position metric (total number, offset, limit) and the data records (similar to list pages)
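One way to contain the sorting complexity of a general webservice is to validate each request against a per-table list of sortable fields. The sketch below assumes this approach; the table names come from this thread, but the sort keys and the parameter names are hypothetical placeholders.

```python
# Hypothetical sort keys per table; the real field names are not defined in this thread.
ALLOWED_SORT_KEYS = {
    "phosphorylation": {"start_pos", "residue"},
    "publication": {"date", "title"},
}


def validate_page_request(table_id: str, offset: int, limit: int,
                          sort: str, order: str) -> dict:
    """Check a request for the general pagination webservice and normalize it."""
    if table_id not in ALLOWED_SORT_KEYS:
        raise ValueError(f"unknown table_id: {table_id}")
    if sort not in ALLOWED_SORT_KEYS[table_id]:
        # Sorting options depend on the data type, so check them per table.
        raise ValueError(f"cannot sort {table_id} by {sort}")
    if order not in ("asc", "desc"):
        raise ValueError(f"order must be 'asc' or 'desc', got {order}")
    if offset < 1 or limit < 1:
        raise ValueError("offset and limit must be >= 1")
    return {"table_id": table_id, "offset": offset, "limit": limit,
            "sort": sort, "order": order}


if __name__ == "__main__":
    print(validate_page_request("phosphorylation", 1, 20, "start_pos", "asc"))
```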
@sujeetvkulkarni
@rykahsay
@ReneRanzinger @rykahsay Please review the object below for the glycosylation summary.
Provide an example of how the "section_stats" should look for the glycosylation data.
"section_stats": {
  "phosphorylation": 1500,
  "mutagenesis": 200,
  "glycosylation_reported_with_glycans": 100,
  "glycosylation_reported": 55,
  ...
  "glycosylation_summary": {
    "total_sites": 5,
    "n_linked_annotaions": 14,
    "n_linked_annotaion_sites": 14,
    "o_linked_annotaions": 14,
    "o_linked_annotaion_sites": 14,
    "reported_with_glycans": {
      "total_sites": 5,
      "n_linked_glycans": 14,
      "n_linked_glycan_sites": 14,
      "o_linked_glycans": 14,
      "o_linked_glycan_sites": 14
    },
    "reported": {
      "total_sites": 5,
      "n_linked_annotaions": 14,
      "n_linked_annotaion_sites": 14,
      "o_linked_annotaions": 14,
      "o_linked_annotaion_sites": 14
    },
    "predicted": {
      "total_sites": 5,
      "n_linked_annotaions": 14,
      "n_linked_annotaion_sites": 14,
      "o_linked_annotaions": 14,
      "o_linked_annotaion_sites": 14
    },
    "text_mining": {
      "total_sites": 5,
      "n_linked_annotaions": 14,
      "n_linked_annotaion_sites": 14,
      "o_linked_annotaions": 14,
      "o_linked_annotaion_sites": 14
    }
  }
}
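As a rough check that this object carries everything the card-level summary line needs, the sketch below renders the summary text from it. The key names are copied exactly as spelled in the proposal above; the function name is hypothetical.

```python
def format_glycosylation_summary(stats: dict) -> str:
    """Render the card-level summary line from a "glycosylation_summary" object."""
    # Key spelling ("annotaions") follows the proposed object above.
    return (
        f"{stats['total_sites']} site(s) total, "
        f"{stats['n_linked_annotaions']} N-linked annotation(s) "
        f"at {stats['n_linked_annotaion_sites']} site(s), "
        f"{stats['o_linked_annotaions']} O-linked annotation(s) "
        f"at {stats['o_linked_annotaion_sites']} site(s)"
    )


if __name__ == "__main__":
    example = {
        "total_sites": 5,
        "n_linked_annotaions": 55,
        "n_linked_annotaion_sites": 4,
        "o_linked_annotaions": 3,
        "o_linked_annotaion_sites": 1,
    }
    print("Glycosylation Summary: " + format_glycosylation_summary(example))
```

The tab-level summaries would need an equivalent function over the "n_linked_glycans" style keys of the nested objects.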
@ReneRanzinger @rykahsay Please review the object below for the details API with the paginated tables wishlist.
Provide an example of how a details webservice call will look when it includes the "wishlist" of paginated sections.
Current API call: https://api.glygen.org/protein/detail/P14210-1
Proposed example with paginated tables list:
/protein/detail?query={
  "uniprot_canonical_ac": "P14210-1",
  "offset": 1,
  "limit": 20,
  "order": "asc",
  "paginated_tables": [
    "glycosylation_with_glycans",
    "glycosylation_reported",
    "glycosylation_predicted",
    "glycosylation_text_mining",
    "phosphorylation",
    "publication"
  ]
}
@sujeetvkulkarni we need to confirm with @rykahsay, but I would prefer to keep the API call semantic (including the protein ID). The other problem is that the limit and sort criteria can be different for each of the tables. I think it has to be a list of objects rather than strings.
Making paginated_tables a list of objects so that each table can have a different offset, limit and order. We don't strictly need an offset field, since every table in the details API starts from 1 and the next results are retrieved through a separate table-specific API, but I am keeping it for consistency. Different sort keys need to be defined for each table.
/protein/detail/P14210-1?query={
  "uniprot_canonical_ac": "P14210-1",
  "paginated_tables": [
    {
      "table_id": "glycosylation_with_glycans",
      "offset": 1,
      "limit": 20,
      "sort": "key",
      "order": "asc"
    },
    {
      "table_id": "glycosylation_reported",
      "offset": 1,
      "limit": 20,
      "sort": "key",
      "order": "asc"
    },
    {
      "table_id": "glycosylation_predicted",
      "offset": 1,
      "limit": 20,
      "sort": "key",
      "order": "asc"
    },
    {
      "table_id": "glycosylation_text_mining",
      "offset": 1,
      "limit": 20,
      "sort": "key",
      "order": "asc"
    },
    {
      "table_id": "phosphorylation",
      "offset": 1,
      "limit": 20,
      "sort": "key",
      "order": "asc"
    },
    {
      "table_id": "publication",
      "offset": 1,
      "limit": 20,
      "sort": "key",
      "order": "asc"
    }
  ]
}
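Because the query is a JSON object carried in a URL parameter, it has to be percent-encoded on the client. The sketch below shows one way to build and decode such a URL with the Python standard library; the helper names are hypothetical, the per-table sort keys are placeholders, and the endpoint shape follows the proposal above rather than a confirmed API.

```python
import json
from urllib.parse import parse_qs, quote, urlsplit

API_BASE = "https://api.glygen.org"


def build_detail_url(accession: str, tables: dict, limit: int = 20) -> str:
    """Build a details URL; `tables` maps table_id to its (placeholder) sort key."""
    query = {
        "uniprot_canonical_ac": accession,
        "paginated_tables": [
            {"table_id": table_id, "offset": 1, "limit": limit,
             "sort": sort_key, "order": "asc"}
            for table_id, sort_key in tables.items()
        ],
    }
    # Compact JSON, then percent-encode everything so it is safe in a query string.
    encoded = quote(json.dumps(query, separators=(",", ":")), safe="")
    return f"{API_BASE}/protein/detail/{quote(accession, safe='')}?query={encoded}"


def decode_query(url: str) -> dict:
    """Recover the JSON query object from a URL built by build_detail_url."""
    return json.loads(parse_qs(urlsplit(url).query)["query"][0])


if __name__ == "__main__":
    url = build_detail_url("P14210-1", {"phosphorylation": "key", "publication": "key"})
    print(url)
    print(decode_query(url))
```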
@sujeetvkulkarni @ReneRanzinger ...
Can we change the "section_stats" as follows to give it a consistent structure (and fewer fields in the schema)?
{
  "section_stats": [
    {
      "table_id": "glycosylation_reported_with_glycans",
      "table_stats": [
        {"field": "total", "count": 100},
        {"field": "total_sites", "count": 5},
        {"field": "n_linked_glycans", "count": 14},
        {"field": "n_linked_glycan_sites", "count": 14},
        {"field": "o_linked_glycans", "count": 14},
        {"field": "o_linked_glycan_sites", "count": 14}
      ]
    },
    {
      "table_id": "glycosylation_reported",
      "table_stats": [
        {"field": "total", "count": 100},
        {"field": "total_sites", "count": 5},
        {"field": "n_linked_glycans", "count": 14},
        {"field": "n_linked_glycan_sites", "count": 14},
        {"field": "o_linked_glycans", "count": 14},
        {"field": "o_linked_glycan_sites", "count": 14}
      ]
    },
    {
      "table_id": "phosphorylation",
      "table_stats": [
        {"field": "total", "count": 100},
        {"field": "xx", "count": 100},
        {"field": "yyy", "count": 100}
      ]
    }
  ]
}
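With this list-based structure, a client would probably want to index the stats by table and field before use. A minimal sketch, assuming the response shape shown above (the function name is hypothetical):

```python
from typing import Dict, List


def index_section_stats(section_stats: List[dict]) -> Dict[str, Dict[str, int]]:
    """Turn the list-of-objects form into {table_id: {field: count}} for lookups."""
    return {
        section["table_id"]: {
            stat["field"]: stat["count"] for stat in section["table_stats"]
        }
        for section in section_stats
    }


if __name__ == "__main__":
    response = {"section_stats": [
        {"table_id": "phosphorylation",
         "table_stats": [{"field": "total", "count": 100}]},
    ]}
    stats = index_section_stats(response["section_stats"])
    # A total of 0 (or a missing table) means no pagination call is needed.
    print(stats.get("phosphorylation", {}).get("total", 0))
```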
@rykahsay that is fine with me, but we also need a glycosylation summary (total for all glycosylation). Do you want to make a table_id "glycosylation" or "glycosylation_summary" for this?
Please do not comment on closed tickets. Reopen the ticket, otherwise it will be "hidden" by default and we may miss that there is something to do.
@rykahsay it's fine, but like Rene said, please include the glycosylation summary (total for all glycosylation) details.
How many web services do we need? How is this linked to the details service in the beginning? If omitted, how do we notice that there is no data at all?
Blocker for:
246
247
521