Open qjhart opened 2 months ago
@UcDust, I think the easiest thing to do is refactor the sanitize step into something like `subselect`, and accept:

```javascript
expert.subselect(doc, {
  sanitize: true,
  expert: true, // or false
  grants: { page: 1, size: 25 },
  works: null,
});
```
This doesn't really match the API calls, but it does allow us to get the default page with:

```javascript
expert.subselect(doc, {
  sanitize: true,
  expert: true, // or false
  grants: { page: 1, size: 25 },
  works: { page: 1, size: 25 },
});
```
I do see a problem: the counts will be affected by the sanitization step, so your cache would have to include both. I suppose that is one reason not to have the server guess.
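
A minimal sketch of what such a `subselect` could look like, assuming a simplified document in which `doc.works` and `doc.grants` are plain arrays and hidden items carry a hypothetical `visible: false` flag. The real expert document is shaped differently, and the `expert` option is left out because its semantics aren't spelled out in this thread. Note that the returned arrays no longer reveal how many items existed before sanitizing or paging, which is the counts problem mentioned above.

```javascript
// Hypothetical sketch: the flat doc shape and the `visible` flag are assumptions.
function pageOf(items, pageOpts, sanitize) {
  if (!pageOpts) return []; // null or missing option: leave the section empty
  // Sanitizing drops hidden items before paging, so page boundaries shift.
  const kept = sanitize ? items.filter((item) => item.visible !== false) : items;
  const page = pageOpts.page || 1;
  const size = pageOpts.size || 25;
  return kept.slice((page - 1) * size, page * size);
}

function subselect(doc, opts = {}) {
  const sanitize = opts.sanitize !== false; // sanitize unless explicitly disabled
  return {
    ...doc,
    works: pageOf(doc.works || [], opts.works, sanitize),
    grants: pageOf(doc.grants || [], opts.grants, sanitize),
  };
}

// Usage: first page of works (two per page), no grants.
const exampleDoc = {
  works: [{ id: 'w1' }, { id: 'w2', visible: false }, { id: 'w3' }],
  grants: [{ id: 'g1' }],
};
const firstPage = subselect(exampleDoc, {
  sanitize: true,
  works: { page: 1, size: 2 },
  grants: null,
});
console.log(firstPage.works.map((w) => w.id), firstPage.grants);
```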
@UcDust this looks good.
- [ ] For the works/grants endpoints, do we have a method to retrieve all the results (or just `size=100000`)?
- [ ] How do we want to see the total counts for grants and citations on the expert page?
@qjhart To retrieve all results, what if we added another param, `?full` or `?all`, that would return all grants/works for that expert? Just using a huge size could work too.
For total counts, could we have a structure like:

```
hits: {
  works: {
    total: 27,
    visible: 24
  },
  grants: {
    total: 7,
    visible: 4
  }
}
```

(Not sure about the `hits` verbiage, but maybe something along those lines?)
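
As a rough illustration, counts in that shape could be derived before any paging is applied. This sketch assumes `doc.works` and `doc.grants` are plain arrays and that hidden items are marked with a hypothetical `visible: false` flag; neither is confirmed by the actual document model.

```javascript
// Hypothetical sketch: `visible: false` marks an item the expert has hidden.
function countHits(items = []) {
  return {
    total: items.length,
    visible: items.filter((item) => item.visible !== false).length,
  };
}

// Build the proposed `hits` summary from the full, unsanitized document.
function buildHits(doc) {
  return {
    works: countHits(doc.works),
    grants: countHits(doc.grants),
  };
}

// Usage
const hits = buildHits({
  works: [{ id: 'w1' }, { id: 'w2', visible: false }],
  grants: [{ id: 'g1' }],
});
console.log(JSON.stringify(hits));
```

Computing the summary from the unsanitized document keeps both numbers available even when the response itself only carries one page of visible items.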
@qjhart I created the https://github.com/ucd-library/aggie-experts/compare/dc-api-subselect branch with a start to the sanitize logic changes.
We'll need to optimize more once we analyze the type of sorting we can do on grants/works, and the client needs to be wired in still.
Also, admin mode (and a user viewing their own profile) still sends the `?no-sanitize` flag, which bypasses this logic. So we'll need to think of an approach there, perhaps removing that flag.
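
One possible approach, sketched here under assumptions, is to have the server decide whether to sanitize instead of trusting a client-sent flag. The `user` fields (`isAdmin`, `expertId`, `impersonating`) are hypothetical names, not the application's actual session model.

```javascript
// Hypothetical sketch: decide server-side whether a response must be sanitized.
function shouldSanitize(user, expertId) {
  if (!user) return true; // anonymous visitors always get sanitized data
  if (user.isAdmin) return false; // admins see everything
  // A user sees unsanitized data only for the profile they are acting as.
  const actingAs = user.impersonating || user.expertId;
  return actingAs !== expertId;
}

// Usage
console.log(shouldSanitize(null, '48xkGvFK'));
console.log(shouldSanitize({ expertId: '48xkGvFK' }, '48xkGvFK'));
```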
Example: https://experts.ucdavis.edu/expert/48xkGvFK
The API file for this expert is 73M. On my speedy machine it takes ~10s to load, which is the bulk of the total load time.
There are a number of issues at play here. First, a considerable amount of this data is the 1000s of additional authors that exist for each citation. Originally we had an additional modification to `experts cdl` where we would stop authors at 40 (but add the last author). I'm a little conflicted on the use of this: another user (say, for example, the author) might be interested in seeing all the authors for some specific reason.

Another issue is that in most circumstances we are looking at very little of the expert.
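
For reference, the author cap described above could be sketched as follows. It is not stated whether the original `experts cdl` modification counted the last author toward the 40; this version keeps the first 40 and appends the last one. The function name and the `limit` parameter are illustrative.

```javascript
// Keep the first `limit` authors and always append the final author.
function trimAuthors(authors, limit = 40) {
  // Nothing to trim if the result would include every author anyway.
  if (authors.length <= limit + 1) return authors.slice();
  return [...authors.slice(0, limit), authors[authors.length - 1]];
}

// Usage: a citation with 1000 authors is cut down to 41 entries.
const manyAuthors = Array.from({ length: 1000 }, (_, i) => `author-${i}`);
const trimmed = trimAuthors(manyAuthors);
console.log(trimmed.length, trimmed[trimmed.length - 1]);
```

Making the limit a parameter would let a caller who wants the complete list (the author, for instance) ask for it explicitly.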
If we followed the idea from Fedora, we could add some additional representations on our `Prefer` header and place some additional limits on these components. We could limit the page and count, and we could even trim authors from our display.

## Proposed API Updates
- On the `/api/expert/<id>` `GET` route, we can add the following:
  - A `?full` param to the endpoint to return the entire document with all grants/works for the expert
  - `/api/expert/<id>/works?page=2&size=25`
  - `/api/expert/<id>/grants?page=2&size=25`
  - `page` would default to 1 and `size` (number of results) to 25
- How the `no-sanitize` flag works is undecided, but perhaps we should let the API code handle returning sanitized data if the user isn't looking at their own profile (or impersonating that profile) and isn't an admin. Should that logic be removed from the client?
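
The parameter defaults proposed above could be handled by a small helper along these lines. This is a sketch only: `parsePaging` is a hypothetical name, it folds `?full` and `page`/`size` into one function even though the proposal puts them on different routes, and it takes an already-parsed query object so it does not depend on any particular web framework.

```javascript
// Hypothetical sketch: turn a parsed query object into paging options.
function parsePaging(query = {}) {
  // `?full` counts as set even when it carries no value.
  if ('full' in query) return { full: true };
  const toPositiveInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isInteger(n) && n > 0 ? n : fallback;
  };
  return {
    full: false,
    page: toPositiveInt(query.page, 1), // default page is 1
    size: toPositiveInt(query.size, 25), // default number of results is 25
  };
}

// Usage with the query strings from the proposal.
const toQuery = (qs) => Object.fromEntries(new URLSearchParams(qs));
console.log(parsePaging(toQuery('page=2&size=25')));
console.log(parsePaging(toQuery('')));
console.log(parsePaging(toQuery('full')));
```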