provide access to eXist fundocs app

ingoboerner commented 1 year ago

In the eXist dashboard there is a preinstalled app, "XQuery Function Documentation". It generates documentation for all XQuery functions based on their function annotations. IMHO it would be good to let a user of DraCor understand what is going on under the hood of the API, e.g. how data is extracted from the TEI with the dutil: functions. A first step would be to generate the documentation upon deployment and make it (or selected modules) accessible at a URL like http://localhost:8080/exist/apps/fundocs/view.html?uri=http://dracor.org/ns/exist/util&location=/db/apps/dracor/modules/util.xqm
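
To make the URL pattern concrete, here is a minimal sketch that builds such a fundocs link from a module namespace URI and its database location. The base URL and the two parameter names (uri, location) are taken from the example link; the helper name fundocs_url is made up for illustration, and a deployed instance would need a different base.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base URL as it appears in the example link (a local eXist instance).
FUNDOCS_VIEW = "http://localhost:8080/exist/apps/fundocs/view.html"


def fundocs_url(module_uri: str, location: str, base: str = FUNDOCS_VIEW) -> str:
    """Build a fundocs view link for one stored module (hypothetical helper)."""
    # urlencode percent-encodes ':' and '/' so the nested URI survives as a parameter.
    query = urlencode({"uri": module_uri, "location": location})
    return f"{base}?{query}"


url = fundocs_url(
    "http://dracor.org/ns/exist/util",
    "/db/apps/dracor/modules/util.xqm",
)
print(url)

# Round trip: the parameters decode back to the original values.
params = parse_qs(urlsplit(url).query)
print(params["uri"][0], params["location"][0])
```

The same helper could be looped over a list of selected modules if only some of them should be linked.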

A second step would involve adding more documentation to the function annotations, e.g. explaining which elements are evaluated and where data is extracted, and maybe referencing the ODD from there.

mathias-goebel commented 1 year ago

I suggest not providing the whole fundocs app, but instead using its rendering functions to create HTML documentation that can be embedded in the DraCor site.

why this? the fundocs app incorporates (off-scale outdated) libraries known for vulnerabilities and will be very hard to update:

"bootstrap": "^3.4.1",
"jquery": "^1.12.4",
"marked": "^2.1.1",
"prismjs": "^1.23.0"

(from their package.json)

to publish this endpoint is like moving server infra to Windows 95.

with great respect to the work, we might take just what we need (citing the source, of course) and use the rendering feature as described above. in addition, providing the whole app would expose all other xqdocs and thus might lower DraCor's ranking in search engines because of redundant copies of content.

ingoboerner commented 1 year ago

OK, thanks for that input. I was just thinking of the easiest way to get the documentation, given the work that would already be involved in updating and enhancing the function annotations of the relevant functions. But maybe your suggestion would be a better approach. So where to start looking? https://github.com/eXist-db/function-documentation/blob/master/src/main/xar-resources/modules/scan.xql

This seems to provide the browse functionality: https://github.com/eXist-db/function-documentation/blob/master/src/main/xar-resources/modules/app.xql#L77-L81

This seems to render a single module: https://github.com/eXist-db/function-documentation/blob/master/src/main/xar-resources/modules/app.xql#L114-L187

This renders a single function: https://github.com/eXist-db/function-documentation/blob/master/src/main/xar-resources/modules/app.xql#L189-L276

It seems even to support markdown in the annotations, which would help in adding links to the ODD...

ingoboerner commented 1 year ago

and this is the specification of the function annotations, right?

http://xqdoc.org/index.html

mathias-goebel commented 1 year ago

maybe we can omit the browse function. in scan.xql a node tree according to xqdoc is created that can be passed for rendering either a module or a single function. this will result in HTML to be included in the website.

in the description section of xqdoc you can enter HTML as well. i guess this is even easier to include than adding a markdown parser.
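
a rough sketch of that idea, in Python rather than XQuery just to show the shape of it: take an xqdoc node tree and turn it into an HTML fragment for a whole module or a single function. the sample XML is hand-written, the function dutil:example and its description are hypothetical, and the element names assume the xqdoc 1.0 namespace; this is not the fundocs rendering code.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumption: xqdoc 1.0 namespace and element names.
NS = {"xqdoc": "http://www.xqdoc.org/1.0"}

# Hand-written sample; dutil:example and its description are hypothetical.
SAMPLE = """\
<xqdoc:xqdoc xmlns:xqdoc="http://www.xqdoc.org/1.0">
  <xqdoc:module type="library">
    <xqdoc:uri>http://dracor.org/ns/exist/util</xqdoc:uri>
  </xqdoc:module>
  <xqdoc:functions>
    <xqdoc:function>
      <xqdoc:name>dutil:example</xqdoc:name>
      <xqdoc:signature>dutil:example($tei as element()) as xs:string</xqdoc:signature>
      <xqdoc:comment>
        <xqdoc:description>Hypothetical description of what is read from the TEI.</xqdoc:description>
      </xqdoc:comment>
    </xqdoc:function>
  </xqdoc:functions>
</xqdoc:xqdoc>
"""


def text(node, path):
    """Return the stripped text of the first match, or an empty string."""
    found = node.find(path, NS)
    return (found.text or "").strip() if found is not None else ""


def render_function(fn):
    """Render one xqdoc:function element as an HTML section."""
    name = escape(text(fn, "xqdoc:name"))
    signature = escape(text(fn, "xqdoc:signature"))
    description = escape(text(fn, "xqdoc:comment/xqdoc:description"))
    return (
        f'<section class="function"><h3>{name}</h3>'
        f"<pre>{signature}</pre><p>{description}</p></section>"
    )


def render_module(xqdoc_xml):
    """Render the module URI and all of its functions as one HTML article."""
    root = ET.fromstring(xqdoc_xml)
    uri = escape(text(root, "xqdoc:module/xqdoc:uri"))
    functions = root.findall("xqdoc:functions/xqdoc:function", NS)
    body = "".join(render_function(fn) for fn in functions)
    return f"<article><h2>{uri}</h2>{body}</article>"


print(render_module(SAMPLE))
```

note that this sketch escapes the description as plain text; passing HTML through from the description would need a different (and sanitized) handling.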

cmil commented 1 year ago

Before we start documenting XQuery functions that are not part of our public facing API I would suggest augmenting the documentation of our main contract, that is the DraCor REST API. The documentation of the functions that are not in the api.xqm module as it is right now is not very elucidating without also looking at the code, especially the XPath constructs therein. We would have to put significant effort into extending the XQDocs to turn them into a useful resource for understanding DraCor. I would prefer to put that effort into comprehensively describing the JSON (and CSV) output of the REST API. From there we could even link into the code on GitHub and achieve a very rich documentation that would help both consumers and developers of the API.

Also, even if the XQDocs were already useful enough to publish them, we would have to come up with some way to extract them, a workflow to integrate them into dracor-frontend, and some styling to fit them into the DraCor design. That's quite some work. Why don't we instead pick up things where we left them with openapi4restxq? (@mathias-goebel has there been any development?) We have an open issue #59, and there is already the /openapi endpoint. If we extend openapi4restxq so that it can extract all the information we need from the RESTXQ annotations in api.xqm, we could continue to improve those annotations and have an always up-to-date, automatically generated API documentation. We could stop maintaining api.yaml manually and switch to the /openapi endpoint for https://dracor.org/doc/api.
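
To illustrate what "extracting from the RESTXQ annotations" could amount to, here is a minimal sketch in Python that reads %rest:GET, %rest:path and %rest:produces from XQuery source text and builds an OpenAPI-style paths object. It is not how openapi4restxq is implemented; the XQuery excerpt is a hypothetical example in the style of a RESTXQ module, not copied from api.xqm, and the regex approach only covers this simple layout.

```python
import json
import re

# Hypothetical excerpt in the style of a RESTXQ module; not copied from api.xqm.
XQUERY_SOURCE = '''
(:~
 : List available corpora.
 :)
declare
  %rest:GET
  %rest:path("/corpora")
  %rest:produces("application/json")
function api:corpora() { () };

(:~
 : Get metadata for one corpus.
 :)
declare
  %rest:GET
  %rest:path("/corpora/{$corpusname}")
  %rest:produces("application/json")
function api:corpus($corpusname) { () };
'''

# One xqdoc comment, the annotations after "declare", and the function name.
BLOCK = re.compile(
    r"\(:~(?P<doc>.*?):\)\s*declare(?P<annotations>.*?)function\s+(?P<name>[\w:-]+)\s*\(",
    re.DOTALL,
)


def clean_doc(doc):
    """Strip the leading ' : ' decoration from each xqdoc comment line."""
    lines = (line.strip().lstrip(":").strip() for line in doc.splitlines())
    return " ".join(line for line in lines if line)


def openapi_paths(source):
    """Build an OpenAPI 'paths' object from annotated function declarations."""
    paths = {}
    for match in BLOCK.finditer(source):
        annotations = match.group("annotations")
        method = re.search(r"%rest:(GET|POST|PUT|DELETE)\b", annotations)
        path = re.search(r'%rest:path\("([^"]+)"\)', annotations)
        if not (method and path):
            continue  # not a RESTXQ resource function
        produces = re.search(r'%rest:produces\("([^"]+)"\)', annotations)
        media_type = produces.group(1) if produces else "*/*"
        # RESTXQ writes path variables as {$name}; OpenAPI expects {name}.
        names = re.findall(r"\{\$(\w+)\}", path.group(1))
        openapi_path = re.sub(r"\{\$(\w+)\}", r"{\1}", path.group(1))
        operation = {
            "operationId": match.group("name"),
            "summary": clean_doc(match.group("doc")),
            "parameters": [
                {"name": n, "in": "path", "required": True, "schema": {"type": "string"}}
                for n in names
            ],
            "responses": {"200": {"description": "OK", "content": {media_type: {}}}},
        }
        paths.setdefault(openapi_path, {})[method.group(1).lower()] = operation
    return paths


print(json.dumps(openapi_paths(XQUERY_SOURCE), indent=2))
```

The point of the sketch is only that the path, method, media type and summary are already present next to the code, so improving the annotations would improve the generated documentation.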

I'm not against improving the XQuery function documentation. But for now I think it is accessible enough for those who can benefit from it, i.e. developers of dracor-api who can always use the fundocs app in a local eXist instance. If we want to improve the overall public facing DraCor API documentation, I would argue there are better, more efficient ways to do that.

ingoboerner commented 1 year ago

"The documentation of the functions that are not in the api.xqm module as it is right now is not very elucidating without also looking at the code, especially the XPath constructs therein."

Agree. For the CLS INFRA report I am looking at the code and, true, the XPaths are key.

I think we should communicate in the documentation that 1) something is encoded in the TEI, 2) it gets extracted by some means (XQuery), and 3) it ends up in the API output. At the moment, I think, this is not transparent, and just pointing to the code on GitHub will not be enough because, to be honest, XQuery isn't the most common of programming languages.

The best thing I could come up with for now is a spreadsheet where I can at least collect the respective XQuery functions, the XPath that extracts the information, and the corresponding field in the (JSON) response:

https://docs.google.com/spreadsheets/d/1fgvexLfoJF-elYunQ_R0rvYNXZRKhgSZPrI6dOu9ekA/edit?usp=sharing

The idea is to come up with a list of all features of the central entities, i.e. corpus and play, probably also character, and to relate these features to the API endpoints.
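
As a small sketch of what one row of such a list expresses (TEI encoding, the XPath that extracts it, the field in the response), here is a Python version with a made-up minimal TEI document. The field names, the XPaths and the dutil: function names in the table are placeholders for illustration, not the actual DraCor implementation.

```python
import json
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Made-up minimal TEI document, for illustration only.
TEI = """\
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title type="main">Emilia Galotti</title>
        <author>Lessing, Gotthold Ephraim</author>
      </titleStmt>
    </fileDesc>
  </teiHeader>
</TEI>
"""

# One row per feature, like a spreadsheet line. The XQuery function names
# are hypothetical placeholders, and the XPaths are assumptions.
FEATURES = [
    {
        "field": "title",
        "xpath": ".//tei:titleStmt/tei:title[@type='main']",
        "xquery_function": "dutil:example-title (hypothetical)",
    },
    {
        "field": "author",
        "xpath": ".//tei:titleStmt/tei:author",
        "xquery_function": "dutil:example-author (hypothetical)",
    },
]


def extract(tei_xml, features):
    """Apply each feature's XPath to the TEI and collect the response fields."""
    root = ET.fromstring(tei_xml)
    record = {}
    for feature in features:
        node = root.find(feature["xpath"], NS)
        has_text = node is not None and node.text
        record[feature["field"]] = node.text.strip() if has_text else None
    return record


print(json.dumps(extract(TEI, FEATURES), indent=2))
```

A table like FEATURES could then serve as the single source from which both a human-readable feature list and per-field descriptions are generated.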

@cmil How would you augment the information in the spreadsheet with the OpenAPI Specification? Put it into the description field? And, yes, the schemata of the response objects. When using flask and apispec I could use marshmallow schemas to describe the response objects that are automatically included in the Open API Spec, but with XQuery... So, you mean, they need to be maintained manually, right? Because where would you add something like schema to the XQuery function annotation?

In the very end I would also like to see a way to integrate OpenAPI and the ODD, but don't have a clue how to achieve that.

I think the aim should be to have one source of documentation and generate the rest out of it. From what I see at the moment, the closest thing to what we get out in the end on the side of the API is the XQuery code, so I thought, maybe start documenting there? And I thought, implementation-wise it would be relatively easy to achieve (because the thing to publish it, fundocs, is already there). I understand that the real work is adding the annotations to the XQuery functions, but that would not involve any more development.

dracor-org / dracor-api

provide access to eXist fundocs app #184