lapps-clarin / converter-issues

NO CODE HERE! Issue tracker for converters

Unrecognized field "$schema" in gold-lif (all layers in one view example) #3

Open keighrim opened 6 years ago

keighrim commented 6 years ago

originally from @elahi123 via an e-mail


The gold-lif example (all layers in one view) contains a "$schema" field:

"payload": {
    "@context": "http://vocab.lappsgrid.org/context-1.0.0.jsonld",
    "$schema": "http://vocab.lappsgrid.org/schema/1.1.0/lif-schema-1.1.0.json",
    "metadata": {},
    "text": {
      "@value": "Karen flew to New York. She went to see her cousin. \n",
      "@language": "en"
    },
    "views":

This causes the exception: Unrecognized field "$schema" (class org.lappsgrid.serialization.lif.Container).

This is because @JsonPropertyOrder in the class is as follows: @JsonPropertyOrder(value = {"context", "metadata", "text", "views"})

Please add "$schema" to the property order in the library, or to its ignored properties.
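
Until the library accepts the field, one possible workaround is to drop "$schema" before handing the document to the older parser. The following is a minimal Python sketch, assuming the file is either a bare container or an envelope with a "payload" object as in the excerpt above. The function name strip_schema is hypothetical and not part of any LAPPS library.

```python
import json


def strip_schema(lif_json: str) -> str:
    """Return the LIF JSON text with any "$schema" key removed.

    Handles both a bare container and an envelope whose container
    sits under "payload" (assumption based on the excerpt above).
    """
    doc = json.loads(lif_json)
    # Bare container: "$schema" sits at the top level.
    doc.pop("$schema", None)
    # Envelope: "$schema" sits inside the payload object.
    payload = doc.get("payload")
    if isinstance(payload, dict):
        payload.pop("$schema", None)
    return json.dumps(doc)
```

The cleaned string can then be passed to the existing deserializer unchanged; all other fields are left as they were.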

keighrim commented 6 years ago

also, originally from @elahi123 via an e-mail


I was testing gold lif example of github https://github.com/lappsgrid-incubator/gold-lif/blob/master/src/main/resources/lif-all-1.1.0.json

  1. For the named entity annotation, "@type" is "http://vocab.lappsgrid.org/Person"; I changed it to http://vocab.lappsgrid.org/NamedEntity in my version.

  2. It has all layers except lemma, so I added 'lemma' to the features dictionary in my version in order to test all layers.

{
    "id": "tk_0_0",
    "start": 0,
    "end": 5,
    "@type": "http://vocab.lappsgrid.org/Token",
    "features": {
        "word": "Karen",
        "lemma": "Karen",
        "pos": "NNP"
    }
},
  3. Has the library been updated for $schema (discussed in the previous message, below)?

keighrim commented 6 years ago

also, originally from @elahi123 via an e-mail


In the gold LIF file (lif-all-1.1.0.json), the constituent annotation seems to differ from other examples. The label field is inside the features dictionary:

 {
            "id": "c_0_0",
            "@type": "http://vocab.lappsgrid.org/Constituent",
            "features": {
              "label": "ROOT",
              "children": [
                "c_0_1"
              ],
              "parent": null
            }
          },

But in another example (karen-all.lif), the label field is outside the features dictionary:

{
                        "id": "c_0_0",
                        "@type": "http://vocab.lappsgrid.org/Constituent",
                        "label": "ROOT",
                        "features": {
                            "children": [
                                "c_0_1"
                            ],
                            "parent": null
                        }
                    },

For this reason, the constituent conversion from LIF to TCF fails for lif-all-1.1.0.json.

Can you please tell me which one is correct?

keighrim commented 6 years ago

You can use the newer LIF library, currently available as a SNAPSHOT version: https://github.com/lapps/org.lappsgrid.all/blob/develop/pom.xml

keighrim commented 6 years ago

The label will be moved inside the features map in the new version of the LIF schema (1.1.0), so the structure of karen-all.lif will become obsolete.
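
For files still in the older layout, the move described here can be done mechanically before conversion. This is a small Python sketch, assuming each annotation is a plain dictionary shaped like the two Constituent excerpts above. The helper name move_label_into_features is hypothetical.

```python
def move_label_into_features(annotation: dict) -> dict:
    """Return a copy of the annotation with a top-level "label"
    relocated into its "features" map (the 1.1.0 layout)."""
    result = dict(annotation)
    if "label" in result:
        label = result.pop("label")
        # Copy the features map so the input annotation is not mutated.
        features = dict(result.get("features") or {})
        # Keep an existing features["label"] if one is already present.
        features.setdefault("label", label)
        result["features"] = features
    return result
```

Annotations that already follow the new layout pass through unchanged, so the helper can be applied to every annotation in a view.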

keighrim commented 6 years ago

copying response from @ksuderman via e-mail


For the named entity annotation, "@type" is "http://vocab.lappsgrid.org/Person"; I changed it to http://vocab.lappsgrid.org/NamedEntity in my version.

I have created a PR to fix it in GitHub as well. Note that the next annotation (Location) is also incorrect; it should be NamedEntity as well. If using your own modified version be sure to add the "category" feature with Person/Location respectively.
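
As an illustration of that fix, here is a short Python sketch that rewrites the two affected annotation types and records the original type as the "category" feature. It assumes annotations are plain dictionaries and that the category values are the literal strings "Person" and "Location"; the helper name to_named_entity is hypothetical.

```python
VOCAB = "http://vocab.lappsgrid.org/"

# Types that should become NamedEntity, mapped to their category value
# (the exact category spelling is an assumption).
LEGACY_TYPES = {
    VOCAB + "Person": "Person",
    VOCAB + "Location": "Location",
}


def to_named_entity(annotation: dict) -> dict:
    """Return a copy typed as NamedEntity with a "category" feature,
    or an unchanged copy if the type is not Person/Location."""
    result = dict(annotation)
    category = LEGACY_TYPES.get(result.get("@type"))
    if category is not None:
        result["@type"] = VOCAB + "NamedEntity"
        features = dict(result.get("features") or {})
        features.setdefault("category", category)
        result["features"] = features
    return result
```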

It has all layers except lemma, so I added 'lemma' to the features dictionary in my version in order to test all layers.

{
    "id": "tk_0_0",
    "start": 0,
    "end": 5,
    "@type": "http://vocab.lappsgrid.org/Token",
    "features": {
        "word": "Karen",
        "lemma": "Karen",
        "pos": "NNP"
    }
},

Be sure you add 'lemma' to the 'contains' metadata at the top of the file:

"metadata": {
    "contains": {
        "http://vocab.lappsgrid.org/Token#lemma": {
            "producer": "hand-written-sample",
            "type": "token-lemma"
        },
        ...
    }
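
If the file is being edited programmatically, the same entry can be added with a few lines of Python. This sketch assumes the metadata object has already been loaded as a dictionary. The helper name declare_lemma is hypothetical, and the producer and type values are copied from the excerpt above.

```python
LEMMA_KEY = "http://vocab.lappsgrid.org/Token#lemma"


def declare_lemma(metadata: dict, producer: str = "hand-written-sample") -> dict:
    """Add a Token#lemma entry to metadata["contains"] in place,
    leaving any existing entry untouched, and return the metadata."""
    contains = metadata.setdefault("contains", {})
    contains.setdefault(LEMMA_KEY, {"producer": producer, "type": "token-lemma"})
    return metadata
```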

Is the library updated for $schema (discussed in the mail below)?

The latest SNAPSHOT version(s) should be able to handle the $schema and other recent changes. The SNAPSHOT versions are deployed to the Sonatype OSS repository.

See https://oss.sonatype.org/#nexus-search;gav~org.lappsgrid~all~~~