render TEI list from projectconfig

ctot-nondef commented 7 months ago

In path projectConfig.static_data.table there is an array of TEI headers, which needs to be rendered as a list of links.

caption: teiHeader.fileDesc.titleStmt.title$
link: @id
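
A minimal TypeScript sketch of this mapping might look as follows. The entry shape is an assumption, not taken from the actual projectConfig: it reads "title$" as a `$` key holding the text of the title element, and `TeiHeaderEntry`, `ListLink` and `toListLinks` are hypothetical names used only for illustration.

```typescript
// Assumed shape of one entry in projectConfig.static_data.table.
// Reading "title$" as a "$" key with the element text is an assumption.
interface TeiHeaderEntry {
  "@id": string;
  teiHeader?: {
    fileDesc?: { titleStmt?: { title?: { $?: string } } };
  };
}

// Hypothetical caption/link pair consumed by the list view.
interface ListLink {
  caption: string;
  link: string;
}

// Map each TEI header to a caption/link pair.
// Falls back to the @id as caption when no title text is present.
function toListLinks(table: TeiHeaderEntry[]): ListLink[] {
  return table.map((entry) => ({
    caption: entry.teiHeader?.fileDesc?.titleStmt?.title?.$ ?? entry["@id"],
    link: entry["@id"],
  }));
}

// Usage with an invented entry.
const sampleTable: TeiHeaderEntry[] = [
  { "@id": "sample_01", teiHeader: { fileDesc: { titleStmt: { title: { $: "First sample" } } } } },
];
console.log(toListLinks(sampleTable));
```

The fallback to @id is a design choice of this sketch, so that an entry without a title still produces a usable link.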

The list view should open in a window of type DataList; the menu item should look like this:

            {
              "id":"liSamplesList",
              "target":"SamplesList",
              "title":"List all entries",
              "type":"item",
              "targetType":"DataList",
              "label":"List all entries",
              "params":{
                "listref":"SamplesList",
                "endpoint":"https:\/\/github.com\/acdh-oeaw\/vicav-content"
              }
            }

simar0at commented 7 months ago

I suggest reusing most of the parameters we already have:

{
    "id": "liProfilesList",
    "target": "vicav_profiles",
    "title": "List all entries",
    "type": "item",
    "targetType": "DataList",
    "label": "List all entries",
    "params": {
        "textId": "vicav_profiles",
        "teiSource": "https:\/\/github.com\/acdh-oeaw\/vicav-content",
        "targetType": "Profile"
    }
}

I am not particularly happy with textId, but it is a lookup on an @id. The endpoint for a list of Profiles is hard-coded in the Profile component. Extending params is easy now, I think, so we can add there what we need. The textId needed for each profile is in @id in the static_data list.
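
As a rough TypeScript sketch of how these params could be used: each list entry would open a window of the inner targetType, with its textId taken from the entry's @id. The interface and function names (`DataListParams`, `EntryWindowParams`, `entryWindowParams`) are hypothetical and not part of the existing code; only the field names come from the JSON above.

```typescript
// Params of the DataList menu item, with the field names proposed above.
interface DataListParams {
  textId: string;
  teiSource: string;
  targetType: string;
}

// Hypothetical params for the window opened by a single list entry.
interface EntryWindowParams {
  targetType: string;
  textId: string;
}

// Build one set of window params per entry:
// the window type comes from the list params, the textId from the entry's @id.
function entryWindowParams(
  listParams: DataListParams,
  entryIds: string[],
): EntryWindowParams[] {
  return entryIds.map((id) => ({
    targetType: listParams.targetType,
    textId: id,
  }));
}

// Usage with the params from the menu item above and invented entry ids.
const profileListParams: DataListParams = {
  textId: "vicav_profiles",
  teiSource: "https://github.com/acdh-oeaw/vicav-content",
  targetType: "Profile",
};
console.log(entryWindowParams(profileListParams, ["profile_a", "profile_b"]));
```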

MauPalantir commented 7 months ago

titleStmt/title is usually not very informative in this case, as it is auto-generated, like "A list of linguistic features for XXX Arabic". For the current Tunocent data (features, sample texts), speaker IDs are the most informative, along with several layers of grouping by region and place. For others, like corpus texts, it is probably some kind of text ID and maybe topic or location, I don't know. We'll have to come up with a flexible solution, or maybe agree on generating better titles.

To be honest, I don't like the idea of dumping the unprocessed teiHeader into a huge JSON: for Tunocent, we get a 22M projectConfig file this way, with a lot of unused bulk. We would need a cleaner output, with the relevant metadata and labels converted to a simple flat JSON.

simar0at commented 6 months ago

I agree, this is a huge amount of data for what we will actually use. But then it is a generic approach, so too much data is to be expected.
One question is: can we make it hurt less to have this data around? That would mean we are flexible in what we use; we just need to find the data we need in the teiHeader, or store it there.
The other thing is: can we auto-generate more useful titles? I think we can now see that at some point we generated bad titles. I think this is also worth fixing.

MauPalantir commented 6 months ago

@simar0at the point of having an API is to have something that processes this data structure and yields something that is easier (and less resource-consuming) to handle on the client side. So, for example, Christoph won't have to think about what a TEI header looks like and whether the VICAV data scheme has changed or not.

Ultimately any flexibility is constrained by what you actually support on the frontend.

The location of most meaningful information in the TEI headers is pretty deterministic (it should be to get meaningful results on the frontend) and project-specific flexibility can be added by a project-specific processing step on the backend.

I agree that titles should be meaningful, but the type of title that looks good as an actual window title is not necessarily the same as what is meaningful as a label for data lists (and these might differ across projects: for some, the most important piece of information might be location, for others, speaker). I think "Sample text of X Arabic" is great as a window title, but redundant as a label.

I would try this approach:

1. Introduce a pre-processing XSLT for the TEI header endpoint (as a setting or as an alternative endpoint) which creates a flat or close to flat structure of the relevant metadata for all projects.
2. If we think there will be lots of project-specific fields, we can introduce a step afterwards, where we also run a project-specific XSLT to add or modify fields based on project requirements.
3. The end result is a simple JSON, one to three levels deep, which can be consumed at once, either as part of projectConfig or downloaded separately in the background after projectConfig is loaded.

For example, the current TUNOCENT search and list organization requirements can be satisfied with this simple structure: { place: {region, name, country}, person: {age, sex, identifier}, label, dataType }
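
To illustrate how the frontend could consume such a structure, here is a minimal TypeScript sketch that types the flat record and groups entries by region and then by place. The field types (all strings) and the names `FlatEntry` and `groupByRegionAndPlace` are assumptions made for this example; only the field names come from the structure above.

```typescript
// Flat metadata record with the fields listed above.
// Typing every field as string is an assumption.
interface FlatEntry {
  place: { region: string; name: string; country: string };
  person: { age: string; sex: string; identifier: string };
  label: string;
  dataType: string;
}

// Group entries by region, then by place name.
function groupByRegionAndPlace(
  entries: FlatEntry[],
): Map<string, Map<string, FlatEntry[]>> {
  const regions = new Map<string, Map<string, FlatEntry[]>>();
  for (const entry of entries) {
    const places =
      regions.get(entry.place.region) ?? new Map<string, FlatEntry[]>();
    const bucket = places.get(entry.place.name) ?? [];
    bucket.push(entry);
    places.set(entry.place.name, bucket);
    regions.set(entry.place.region, places);
  }
  return regions;
}

// Usage with invented placeholder values.
const flatEntries: FlatEntry[] = [
  {
    place: { region: "Region A", name: "Place 1", country: "Country X" },
    person: { age: "30", sex: "f", identifier: "Speaker1" },
    label: "Speaker1",
    dataType: "SampleText",
  },
];
console.log(groupByRegionAndPlace(flatEntries));
```

With a record this shallow, the list component would only need to know these few keys rather than the layout of the TEI header.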

acdh-oeaw / vicav-vue3

render TEI list from projectconfig #132