Open jacobdgm opened 2 years ago
@jacobdgm
The little note I found in the current manuscript import script in CU suggests that Cantus DB implements an API -- do you know of any docs about this? I realize that your docs also might be about new cantus, and that old cantus might be different.
I'm not aware of any documentation for OldCantus's APIs. There is some documentation on their implementation in NewCantus on the CantusDB Wiki - please let me know if this can be improved in any way. And also let me know if there's additional information you would like to have provided by an API - if it's a JSON API, I should be able to add keys without breaking things, and I can also put together something new if it's needed.
Well, it just seems to me that an API would be the ideal way for CU to get the manuscript data from CantusDB...it makes much more sense to me to request a json document from CantusDB than script the html as we are currently doing.
In which case, I think there are two stages to this:
Thoughts?
Well, it just seems to me that an API would be the ideal way for CU to get the manuscript data from CantusDB...it makes much more sense to me to request a json document from CantusDB than script the html as we are currently doing.
Yes, I agree.
@jacobdgm
I've looked a little more in the CantusDB endpoints, and it looks like the json-node/<source_id>/ endpoint will give us what we need.
I have two questions//comments//confirmations:
1. It doesn't seem like there is an endpoint that would pass back all the id's of a certain type (eg. all the sources in cantusdb), and it looks like a query to the /sources/ url returns an html document. In other words, it looks like currently I would still need to parse html to get all the source id's in the first place. Does that seem true to your understanding?
2. There is a note in the documentation of the json-node API about potential future changes. It doesn't seem to me like that would cause major issue (eg. maybe down the road we need to change the url path or something, but nothing that would render using the api for this purpose unusable). Does that seem correct based on your understanding?
It doesn't seem like there is an endpoint that would pass back all the id's of a certain type (eg. all the sources in cantusdb), and it looks like a query to the /sources/ url returns an html document. In other words, it looks like currently I would still need to parse html to get all the source id's in the first place. Does that seem true to your understanding?
Yes, this sounds right. This is actually the main chokepoint to us syncing data with OldCantus - in our documentation on how to do this, it involves connecting to the OldCantus server and running SQL commands, e.g. "To obtain a list of all sources' IDs, run SELECT nid FROM node WHERE 'type'='source'; in mysql on the old Cantus server."
A much better approach would be to have a /sources-list/ (or something similar) API that lists the IDs associated with sources. Is there anything we would want in this API other than a list of IDs? (I should ask Jan to set this up for OldCantus, come to think of it - it would simplify things quite a bit)
There is a note in the documentation of the json-node API about potential future changes. It doesn't seem to me like that would cause major issue (eg. maybe down the road we need to change the url path or something, but nothing that would render using the api for this purpose unusable). Does that seem correct based on your understanding?
I don't think it would cause a major issue, no. We'd just have to set up a URL for exporting sources that's different from the current, export-anything URL.
A much better approach would be to have a /sources-list/ (or something similar) API that lists the IDs associated with sources. Is there anything we would want in this API other than a list of IDs? (I should ask Jan to set this up for OldCantus, come to think of it - it would simplify things quite a bit)
I don't think so, because once I have the ID, I feel like I can get anything else I need from the /json-node/ endpoint. I guess if you returned source id's and other source information at once it would mean fewer API calls from CU, but I'm not sure it's worth the effort -- from the CU perspective we'd only call the API initially and when sources change.
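For what it's worth, the flow I have in mind (get the IDs once, then one json-node call per source) could be sketched roughly like this. Only the /json-node/<source_id>/ path comes from this thread; the base URL, the function names, and the injectable `fetch_json` callable are placeholders of mine, not anything that exists in CantusDB or CU.

```python
import json
from typing import Callable, Dict, Iterable
from urllib.request import urlopen

# Hypothetical base URL (an assumption); substitute the real CantusDB host.
BASE_URL = "https://cantusdb.example.org"


def json_node_url(source_id: int, base_url: str = BASE_URL) -> str:
    """Build the json-node URL for one source ID."""
    return f"{base_url.rstrip('/')}/json-node/{source_id}/"


def http_fetch_json(url: str) -> dict:
    """Default fetcher: GET the URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def fetch_sources(
    source_ids: Iterable[int],
    fetch_json: Callable[[str], dict] = http_fetch_json,
) -> Dict[int, dict]:
    """Make one json-node request per source ID and collect the results."""
    return {sid: fetch_json(json_node_url(sid)) for sid in source_ids}
```

Passing the fetcher in as an argument keeps the loop testable without hitting the server, and it would make it easy to swap in whatever HTTP client CU already uses.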
I don't think it would cause a major issue, no. We'd just have to set up a URL for exporting sources that's different from the current, export-anything URL.
Perfect!
I mentioned this at the lab meeting - we definitely want to renumber sources in Cantus Ultimus to match those in Cantus Database. Once I finish the main project I have on the go - testing NewCantus staging and putting it up on production - I'll set up an API for this.
Since my plan is to implement this change in the coming days, I'm going to reiterate my approach here, as it is slightly different than in my summary above.
Once a source list api is available...
I was about to start writing one fresh, but saw we already have a json-sources API. You can find it on OldCantus, Production and Staging, and there's a bit of documentation on the CantusDB Wiki. NewCantus's implementation returns only published sources; I believe OldCantus does the same. Is there anything that you'd want an API to do that this one doesn't already do?
Not sure why we didn't see that one before.... but yeah, that works.
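Pulling the ID list out of a json-sources response might look something like the sketch below. I'm assuming the response body is a JSON object keyed by source ID; I haven't confirmed the exact shape, so it's worth checking the wiki before relying on this. The function name and the sample payload are made up, apart from the ID 123606 mentioned in this issue.

```python
import json
from typing import List


def source_ids_from_json_sources(payload: str) -> List[int]:
    """Return the sorted source IDs from a json-sources response body.

    ASSUMPTION: the body is a JSON object whose keys are source IDs.
    """
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object keyed by source ID")
    return sorted(int(key) for key in data)


# Made-up sample body, not real API output.
sample = '{"123606": {}, "123610": {}}'
print(source_ids_from_json_sources(sample))
```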
So now:
This is related to Issue 429 on the CantusDB GitHub page; I figured it would be a good idea to open the issue here, where the impactful change would actually occur.
Since Cantus Ultimus is more-or-less a nice user interface over top of the data on CantusDB, it might make sense to change the manuscript identifier numbers in Cantus Ultimus to match those in CantusDB - for example, CH-E 611 is currently 74 in CU, but 123606 in CD. This would make it easy to link to a manuscript from CantusDB, and it also feels like a thing that would generally make integration between the two sites simpler in other situations.
(I have not looked through all of the issues that are open on this repository, so if this is a duplicate, feel free to close it)
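To make the renumbering idea a bit more concrete, here is a small sketch of building an old-ID to new-ID map by matching sources on siglum. The field names (`id`, `siglum`) and the function are hypothetical, and matching on siglum is just one possible approach; the only real data is the CH-E 611 example (74 in CU, 123606 in CantusDB).

```python
from typing import Any, Dict, Iterable, Mapping


def build_id_map(
    cu_sources: Iterable[Mapping[str, Any]],
    cantusdb_sources: Iterable[Mapping[str, Any]],
) -> Dict[int, int]:
    """Map each CU source ID to the CantusDB ID with the same siglum.

    CU sources with no siglum match in CantusDB are left out of the map,
    so they can be reviewed by hand.
    """
    cantusdb_by_siglum = {
        src["siglum"]: int(src["id"]) for src in cantusdb_sources
    }
    return {
        int(src["id"]): cantusdb_by_siglum[src["siglum"]]
        for src in cu_sources
        if src["siglum"] in cantusdb_by_siglum
    }


# Hypothetical records using the one ID pair mentioned in this issue.
cu = [{"id": 74, "siglum": "CH-E 611"}]
cantusdb = [{"id": 123606, "siglum": "CH-E 611"}]
print(build_id_map(cu, cantusdb))  # {74: 123606}
```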