DARIAH-ERIC / lexicalresources

Data space of the DARIAH Lexical Resources Working Group
https://dariah-eric.github.io/lexicalresources/
BSD 2-Clause "Simplified" License

dcr attributes #5

Closed ttasovac closed 1 year ago

ttasovac commented 6 years ago

One comment in the Google Doc said that we should mention and explain the use of DCR attributes in the narrative section.

I agree — but I don't remember that we discussed this, so this would require quite a bit of work... It would be great if somebody could take it upon themselves to come up with a proposal.

peterstadler commented 1 year ago

Just a heads-up that the dcr attributes were changed in TEI release 4.5.0. Quoting from the Readme:

The outdated references to ISOCat have been updated in Chapter 9 Dictionaries, Chapter 18 Feature Structures, gram, and att.datcat. In the course of this the attributes datcat and valueDatcat have been removed from the dcr: namespace and are now native TEI attributes. Additionally, the targetDatcat attribute has been added to the attribute class att.datcat.

For details see https://github.com/TEIC/TEI/issues/2227, https://github.com/TEIC/TEI/issues/1866, and https://github.com/TEIC/TEI/pull/2359
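
For older files, the change quoted above amounts to renaming two attributes. Here is a minimal Python sketch of that migration, not an official TEI tool. It assumes the old attributes were bound to the namespace URI `http://www.isocat.org/ns/dcr`; the function name `migrate_dcr_attributes` and the `example.org` values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI that the dcr: prefix was bound to in older files.
DCR_NS = "http://www.isocat.org/ns/dcr"


def migrate_dcr_attributes(root):
    """Rename dcr:datcat / dcr:valueDatcat to plain datcat / valueDatcat.

    Modifies the tree in place and returns the number of attributes renamed.
    If a native attribute is already present, its value is kept.
    """
    renamed = 0
    for elem in root.iter():
        for name in ("datcat", "valueDatcat"):
            old_key = f"{{{DCR_NS}}}{name}"
            if old_key in elem.attrib:
                value = elem.attrib.pop(old_key)
                elem.attrib.setdefault(name, value)
                renamed += 1
    return renamed


# Hypothetical pre-4.5.0 fragment; the URIs are placeholders.
OLD = (
    '<gram xmlns:dcr="http://www.isocat.org/ns/dcr" type="pos" '
    'dcr:datcat="http://example.org/pos" '
    'dcr:valueDatcat="http://example.org/noun">n.</gram>'
)

if __name__ == "__main__":
    gram = ET.fromstring(OLD)
    print(migrate_dcr_attributes(gram), "attribute(s) renamed")
    print(ET.tostring(gram, encoding="unicode"))
```

The sketch only touches the two attributes named in the release notes; `targetDatcat` is new in 4.5.0, so there is nothing to migrate for it.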

bansp commented 1 year ago

We've had some discussion on this now (Berlin, DARIAH, Lexical Resources Summit 2023), there is a bit of narration in the meeting minutes, and I can extend that for the final docs.

bansp commented 1 year ago

See also https://github.com/TEIC/TEI/issues/2419

bansp commented 1 year ago

@anacastrosalgado @ttasovac Why don't we use this space to put together a case for dcr in taxonomies (the IDs may be sketchy!), and I can then write a PR adding an example to https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.datcat.html (and handling at least one typo there at the same time).

One thing that has just occurred to me is whether we understand the issue in the same way: as profiting from being able to fix the items of a <taxonomy> element in the appropriate external data categories. For that, Ana's <taxonomy> would have to be the source of references, rather than the target -- do we have the same picture of that, please?

Or does the picture have Ana's <taxonomy> as the target of references that start in <usg> elements? If that were the case, I would still appreciate an example for the Guidelines, simply to show the usefulness of the DCR attributes for things other than grammatical concepts, BUT, crucially, nothing would need to be added to the current Guidelines, because <usg> can already use these attributes, out of the box (provided that the Lex-0 ODD sources Guidelines versions 4.5 or later).

And in the latter case, it would be the use case by @JessedeDoes (<taxonomy> in ParlaMint) that would constitute the foundation for a request to modify the Guidelines to extend the coverage of DCR attributes (see the TEI-C issue linked immediately above).

(Incidentally, Toma: I was not able to reference Jesse by content-completion, but in the preview it seems like his ID does get linked. Naturally, the practical question is now whether Jesse is going to receive a notification about having been mentioned in this issue.)

bansp commented 1 year ago

I would also suggest removing the "help wanted" label, because it has a special status: it asks the entire DARIAH ERIC for help :-) You can verify that by looking at the front page of the organisation, listing repositories. (Been there myself as well, when it turned out that I was in fact asking the entire CLARIN for help, rather than just the users of the relevant repository...)

ttasovac commented 1 year ago

> One thing that has just occurred to me is whether we understand the issue in the same way: as profiting from being able to fix the items of a <taxonomy> element in the appropriate external data categories. For that, Ana's <taxonomy> would have to be the source of references, rather than the target -- do we have the same picture of that, please?
>
> Or does the picture have Ana's <taxonomy> as the target of references that start in <usg> elements? If that were the case, I would still appreciate an example for the Guidelines, simply to show the usefulness of the DCR attributes for things other than grammatical concepts, BUT, crucially, nothing would need to be added to the current Guidelines, because <usg> can already use these attributes, out of the box (provided that the Lex-0 ODD sources Guidelines versions 4.5 or later).

Our use case is usg → taxonomy/category → external ontology. So we plan to use @valueDatcat on usg, present a TEI view of the domain labels in the teiHeader, and point to the external ontology so that people can study it there, run SPARQL queries on it, and so on.

anacastrosalgado commented 1 year ago

Here is the example:

<usg type="domain" valueDatcat="#domain.medical_and_health_sciences.medicine">Med.</usg>
<encodingDesc>
    <classDecl>
        <taxonomy xml:id="domains">
            <!--...-->
            <category xml:id="domain.medical_and_health_sciences">
                <catDesc xml:lang="en">Medical and Health Sciences</catDesc>
                <catDesc xml:lang="pt">Ciências Médicas e da Saúde</catDesc>
                <category xml:id="domain.medical_and_health_sciences.medicine"
                    valueDatcat="https://vocabs.rossio.fcsh.unl.pt/pub/morais_domains/pt/page/0025 http://www.semanticweb.org/OntoDomLab-Med#Medicine">
                    <catDesc xml:lang="en">
                        <term>Medicine</term>
                        <gloss><!--...--></gloss>
                    </catDesc>
                    <catDesc xml:lang="pt">
                        <term>Medicina</term>
                        <gloss><!--...--></gloss>
                    </catDesc>
                </category>
            </category>
            <!--...-->
        </taxonomy>
    </classDecl>
</encodingDesc>
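
The chain in this example (usg → category → external ontology) can be followed mechanically. The Python sketch below shows one way to do it and is not part of Lex-0 or the TEI Guidelines. It assumes un-namespaced elements for brevity (real TEI files use the TEI namespace), reads the attribute under its TEI spelling `valueDatcat`, and treats its value as a whitespace-separated list of pointers. The function name `external_concepts` and the wrapping `<TEI>` element are illustrative only.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Condensed version of the example above, wrapped in one root element.
SAMPLE = """<TEI>
  <taxonomy xml:id="domains">
    <category xml:id="domain.medical_and_health_sciences.medicine"
      valueDatcat="https://vocabs.rossio.fcsh.unl.pt/pub/morais_domains/pt/page/0025 http://www.semanticweb.org/OntoDomLab-Med#Medicine"/>
  </taxonomy>
  <usg type="domain" valueDatcat="#domain.medical_and_health_sciences.medicine">Med.</usg>
</TEI>"""


def external_concepts(root, usg):
    """Return the external URIs reachable from a <usg> element.

    Local pointers (#id) are followed one step to the element carrying
    that xml:id; anything else is taken to be an external URI already.
    """
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    uris = []
    for pointer in usg.get("valueDatcat", "").split():
        if not pointer.startswith("#"):
            uris.append(pointer)
            continue
        target = by_id.get(pointer[1:])
        if target is not None:
            uris.extend(target.get("valueDatcat", "").split())
    return uris


if __name__ == "__main__":
    doc = ET.fromstring(SAMPLE)
    for uri in external_concepts(doc, doc.find("usg")):
        print(uri)
```

Following only one step keeps the dictionary entry lean: the label points at the header, and the header holds the links to the external vocabularies.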

daliboris commented 1 year ago

You should use a pointer as the @valueDatcat value, i.e. a hash sign (#) plus the @xml:id value of the referenced element, if that element is in your document. Alternatively, you can use a URI to reference an external resource.

<usg type="domain" valueDatcat="#domain.medical_and_health_sciences.medicine">Med.</usg>
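
The missing-hash slip is easy to catch automatically. The following is a rough Python sketch of such a check, assuming un-namespaced elements and whitespace-separated pointer values. The function name `pointer_problems` and the sample ids are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def pointer_problems(root, attr="valueDatcat"):
    """List suspicious pointer values as (tag, value, reason) tuples.

    Flags local pointers (#id) that match no xml:id in the document, and
    bare values that equal an existing xml:id, which most likely lack '#'.
    Other values are assumed to be external URIs and are not checked.
    """
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    problems = []
    for el in root.iter():
        for value in el.get(attr, "").split():
            if value.startswith("#"):
                if value[1:] not in ids:
                    problems.append((el.tag, value, "no such xml:id"))
            elif value in ids:
                problems.append((el.tag, value, "missing '#'"))
    return problems


# Hypothetical document with one bare id and one dangling pointer.
BROKEN = """<TEI>
  <category xml:id="domain.medicine"/>
  <usg type="domain" valueDatcat="domain.medicine">Med.</usg>
  <usg type="domain" valueDatcat="#domain.law">Jur.</usg>
</TEI>"""

if __name__ == "__main__":
    for tag, value, reason in pointer_problems(ET.fromstring(BROKEN)):
        print(f"<{tag}> {value}: {reason}")
```

A check like this does not replace schema validation; it only covers the two mistakes that a schema typically cannot see.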

ttasovac commented 1 year ago

Yes, that's what we actually do; we just accidentally left out the hash sign when copying the example. I corrected it above.

bansp commented 1 year ago

Hi @anacastrosalgado , may I ask for more around the <usg> above, please? I would whip up a quick PR, to get things rolling.

bansp commented 1 year ago

Oh but let me stress that there is no rush! :-) I have prepared some text with some examples, and we have time now to elaborate on that.

bansp commented 1 year ago

PR linked; there is space in it for some additions to appropriately reference the Morais project. I used a fragment from the GB section of ParlaMint to round up the linguistic story, with thanks to Jesse for the inspiration, and then sketchily described the case of referencing concepts from the <usg> element.

bansp commented 1 year ago

Just a note that the PR got merged today (or yesterday, depending on where you're sitting) into the TEI/dev branch.