Signbank / Global-signbank

An online sign dictionary and sign database management system for research purposes. Originally developed by Steve Cassidy. This repo is a fork for the Dutch version, previously called 'NGT-Signbank'.
http://signbank.cls.ru.nl
BSD 3-Clause "New" or "Revised" License

Add labels and description to JSON #1225

Open rem0g opened 3 months ago

rem0g commented 3 months ago

We need extra fields in the JSON for every gloss: labels, description, and "Voorbeeldzinnen" (example sentences).

susanodd commented 3 months ago

By labels, you mean the Tags, right? I can add the Tags to all the APIs.

Are the other two the ones being added in #1178?

susanodd commented 3 months ago

Fixed it!

susanodd commented 3 months ago

The package functionality has Tags now. It's on signbank.

Jetske commented 3 months ago

@rem0g with "voorbeeldzinnen" you mean example sentences for senses, right? Not the new annotated sentences.

uklomp commented 2 months ago

Just talked to Susan & Wessel, and I'm pretty sure @rem0g does indeed mean tags, plus a field called 'aantekeningen' (Dutch) or 'notes' (English), and a connection to the newly developed annotated sentences. Side note: when we talk about example sentences, we will actually never mean the originally developed voorbeeldzinnen with the senses, but almost always the newly developed panel for annotated sentences.

Jetske commented 2 months ago

@uklomp Okay! Then from now on I'll assume when you talk about voorbeeldzinnen that it's about the annotated sentences.

Voorbeeldzinnen would look something like this in JSON:

{
       "voorbeeldzinnen" : {
              "zin_id_123" : {
                    "sentence_nl" : "Dit is een voorbeeldzin",
                    "sentence_en" : "This is an example sentence",
                    "context_nl" : "Dit is een voorbeeldcontext",
                    "context_en" : "This is an example context",
                    "eaf_file" : [some link?],
                    "video_file" : [some link?],
                    "glosses" : {
                           "gloss_id_123" : {
                                  "begin" : 2000,
                                  "end" : 3000,
                                  "representative" : true
                           },
                           "gloss_id_234" : {
                                  "begin" : 3000,
                                  "end" : 4000,
                                  "representative" : false
                           },
                           ...
                    }
              },
              ...
       }
}

@rem0g This means voorbeeldzinnen are almost always linked to multiple glosses, and glosses in turn may have multiple voorbeeldzinnen.
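To make the many-to-many link concrete, here is a minimal Python sketch. It assumes the proposal is written as valid JSON with keyed objects and lowercase booleans, leaves out the file links, and uses a hypothetical helper name (`sentences_per_gloss`); none of this is Signbank's actual code.

```python
import json
from collections import defaultdict

# Hypothetical data shaped like the proposal in this thread.
# IDs and field names are taken from the example; file links are omitted.
sentences_json = """
{
  "voorbeeldzinnen": {
    "zin_id_123": {
      "sentence_nl": "Dit is een voorbeeldzin",
      "sentence_en": "This is an example sentence",
      "glosses": {
        "gloss_id_123": {"begin": 2000, "end": 3000, "representative": true},
        "gloss_id_234": {"begin": 3000, "end": 4000, "representative": false}
      }
    }
  }
}
"""


def sentences_per_gloss(data):
    """Invert the sentence -> glosses mapping into gloss -> sentence ids."""
    index = defaultdict(list)
    for sentence_id, sentence in data["voorbeeldzinnen"].items():
        for gloss_id in sentence["glosses"]:
            index[gloss_id].append(sentence_id)
    return dict(index)


data = json.loads(sentences_json)
index = sentences_per_gloss(data)
print(index)
```

With an index like this, a per-gloss export could list the IDs of the sentences that gloss appears in, rather than repeating each full sentence under every gloss.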

I don't understand what to do here. What do you need the json field to look like?

rem0g commented 2 months ago

I am going to close this one and continue in https://github.com/Signbank/Global-signbank/issues/1203

susanodd commented 1 month ago

@rem0g I'm adding the Notes (aka description) to the package functionality. The notes have the syntax:

Type: (Published, Index, Text) where Type is a choice during entry.

Do you want the entire tuple?

Normally, they look like this:

    "46057": {
        "Lemma ID Gloss: Dutch": "97",
        "Lemma ID Gloss: English": "97",
        "Annotation ID Gloss: Dutch": "97",
        "Annotation ID Gloss: English": "97",
        "Senses: Dutch": {
            "1": "97"
        },
        "Strong Hand Number": "True",
        "In The Web Dictionary": "True",
        "Is This A Proposed New Sign?": "False",
        "Exclude From Ecv": "False",
        "Repeated Movement": "False",
        "Alternating Movement": "False",
        "Link": "http://localhost:8000//dictionary/gloss/46057",
        "Notes": "Source: (True,1,Number produced by Henk Betten in 2011 as one of three elderly Groningen signers for BA thesis of Maike van Raam at Radboud University; supervision Onno Crasborn)",
        "Affiliation": [
            "Radboud"
        ]
    },

If a note does not have a Type entered, a dash is shown in its place, as seen here:

    "47523": {
        "Lemma ID Gloss: Dutch": "ALDI-B",
        "Lemma ID Gloss: English": "ALDI-B",
        "Annotation ID Gloss: Dutch": "ALDI-B",
        "Annotation ID Gloss: English": "ALDI-B",
        "Senses: Dutch": {
            "1": "aldi"
        },
        "Senses: English": {
            "1": "aldi"
        },
        "Handedness": "2s",
        "Strong Hand": "5",
        "Weak Hand": "5",
        "Location": "Head",
        "In The Web Dictionary": "False",
        "Is This A Proposed New Sign?": "False",
        "Exclude From Ecv": "False",
        "Relation Between Articulators": "Next-to",
        "Relative Orientation: Movement": "Ulnar",
        "Relative Orientation: Location": "AO: palm-forwards",
        "Orientation Change": "Supination",
        "Handshape Change": "Closing",
        "Repeated Movement": "False",
        "Alternating Movement": "False",
        "Movement Shape": "Spiral",
        "Movement Direction": "Ipsilateral",
        "Link": "http://localhost:8000//dictionary/gloss/47523",
        "Tags": [
            "video: refilm"
        ],
        "Notes": "-: (False,3,Dit gebaar kan als discriminerend worden ervaren. Het gebaar verwijst naar de hoofddoek die de medewerkers bij ALDI dragen.)",
        "Affiliation": [
            "UvA"
        ]
    },

susanodd commented 1 month ago

@rem0g The Notes values in the JSON above are strings, not tuples. Generating actual tuples doesn't seem to work: they are converted to lists in the zipped file. Otherwise, they need to be converted to string representations of tuples.

Do you want all of these fields of the tuple? Or more syntax to make them dictionaries with a key for each part of the tuple? (See the Gloss Edit of Notes to see what is going on with these. It is structured information being displayed here.)

It probably needs to be a dictionary for each Note, otherwise it may not be possible to parse them elsewhere if the text contains commas.


        "Notes": [
            [
                "Source",
                "True",
                "1",
                "Number produced by Henk Betten in 2011 as one of three elderly Groningen signers for BA thesis of Maike van Raam at Radboud University; supervision Onno Crasborn"
            ]
        ],
        "Notes": [
            [
                "-",
                "False",
                "3",
                "Dit gebaar kan als discriminerend worden ervaren. Het gebaar verwijst naar de hoofddoek die de medewerkers bij ALDI dragen."
            ]
        ],
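Both points can be reproduced with Python's standard `json` module. The sketch below is only an illustration: the note text is a made-up value containing a comma, and the string layout copies the `Type: (Published,Index,Text)` form shown earlier in this thread.

```python
import json

# A note as a Python tuple: (type, published, index, text).
# The text is a hypothetical value that contains a comma.
note = ("Source", True, 1, "Produced in 2011, for a BA thesis")

# JSON has no tuple type: json.dumps writes the tuple as an array,
# and json.loads reads that array back as a list.
round_tripped = json.loads(json.dumps(note))

# The string form: "Type: (Published,Index,Text)".
as_string = f"{note[0]}: ({note[1]},{note[2]},{note[3]})"

# Naively splitting the part inside the parentheses on commas
# cuts the text into extra pieces.
inner = as_string.split(": (", 1)[1].rstrip(")")
naive_fields = inner.split(",")

print(round_tripped)
print(naive_fields)
```

The naive split yields four fields instead of three because the text itself contains a comma, which is the parsing risk mentioned above.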

susanodd commented 1 month ago

Here it's modified to dictionaries for each note:

        "Notes": [
            {
                "Published": "True",
                "Index": "1",
                "Type": "Source",
                "Text": "Number produced by Henk Betten in 2011 as one of three elderly Groningen signers for BA thesis of Maike van Raam at Radboud University; supervision Onno Crasborn"
            }
        ],
        "Notes": [
            {
                "Published": "False",
                "Index": "3",
                "Type": "-",
                "Text": "Dit gebaar kan als discriminerend worden ervaren. Het gebaar verwijst naar de hoofddoek die de medewerkers bij ALDI dragen."
            }
        ],

I'm going to use this JSON; although cumbersome, it is less likely to cause parsing glitches for users.
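For anyone consuming the export, a minimal Python sketch of reading this representation follows. The keys ("Published", "Index", "Type", "Text") and the string values are taken from the example above; the function names and note texts are hypothetical and not part of Signbank.

```python
import json


def note_to_dict(note_type, published, index, text):
    """Build one note entry with the keys shown above; all values are strings."""
    return {
        "Published": str(published),
        "Index": str(index),
        "Type": note_type if note_type else "-",  # dash when no type was entered
        "Text": text,
    }


def published_texts(gloss_entry):
    """Return the text of every note whose "Published" value is "True"."""
    return [
        note["Text"]
        for note in gloss_entry.get("Notes", [])
        if note["Published"] == "True"
    ]


# Hypothetical gloss entry with one published and one unpublished note.
entry = {
    "Notes": [
        note_to_dict("Source", True, 1, "Hypothetical text, with a comma"),
        note_to_dict("", False, 3, "Hypothetical unpublished note"),
    ]
}

# Round-trip through JSON to mimic reading the exported file.
decoded = json.loads(json.dumps(entry))
print(published_texts(decoded))
```

Because each part of a note has its own key, commas in the text need no special handling on the consumer side.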

susanodd commented 1 month ago

This is deployed now on signbank. The third representation has been used.