ckan / ckanext-dcat

CKAN ♥ DCAT

Expose dataset metadata as JSON-LD for Google Dataset Search #190

Closed: maxclac closed this issue 5 months ago

maxclac commented 3 years ago

Hi everyone!

I installed the DCAT extension on CKAN 2.9 and I would like to expose my datasets to Google Dataset Search by using the JSON-LD endpoint. I am using the ckanext-scheming extension in order to create customized fields.

I would like to get output similar to this example.

I am particularly interested in the temporalCoverage and spatialCoverage fields. I implemented them in my schema file but I don't see them in the JSON-LD file created by the DCAT extension. I probably missed something.

Here is how I implemented my temporalCoverage field:

    {
      "field_name": "temporals",
      "label": "Temporal coverage",
      "repeating_subfields": [
        {
          "field_name": "startDate",
          "label": "Start Date",
          "display_property": "schema:startDate"
        },
        {
          "field_name": "endDate",
          "label": "End Date",
          "display_property": "schema:endDate"
        }
      ]
    }

Can anyone help?

amercader commented 3 years ago

@maxclac It looks like the SchemaOrg profile, which generates the JSON-LD snippet that Google Dataset Search parses, expects the field names temporal_start and temporal_end. Can you try changing the field names to see if that works? If you can't change the names, you will need to create a custom profile that parses your fields in a similar way.
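If renaming is not an option, the mapping such a custom profile would need can be sketched in plain Python. This is only an illustration: `flatten_temporals` is a hypothetical helper, not part of ckanext-dcat, and it assumes that ckanext-scheming stores the repeating subfields as a list of dicts under `temporals`.

```python
def flatten_temporals(dataset_dict, source_key="temporals"):
    """Copy the first startDate/endDate pair to temporal_start/temporal_end.

    Assumption: `dataset_dict[source_key]` is a list of dicts such as
    [{"startDate": "...", "endDate": "..."}]. Only the first entry is used.
    """
    result = dict(dataset_dict)  # shallow copy, the input is left untouched
    entries = result.get(source_key) or []
    if entries:
        first = entries[0]
        # setdefault keeps any flat value that is already present
        if first.get("startDate"):
            result.setdefault("temporal_start", first["startDate"])
        if first.get("endDate"):
            result.setdefault("temporal_end", first["endDate"])
    return result


dataset = {
    "name": "example-dataset",
    "temporals": [{"startDate": "2019-01-01", "endDate": "2020-12-31"}],
}
flat = flatten_temporals(dataset)
```

A real profile would do the equivalent while building the RDF graph; the sketch only shows the field-name translation.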

Note that the last time I worked on this, Google Dataset Search didn't support parsing schema.org JSON-LD from a linked file; it needed to be embedded in the source of the page. That's what the structured_data plugin does, using the profile I linked above.
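To illustrate what "embedded in the source of the page" means, here is a minimal sketch that wraps a JSON-LD document in a `<script type="application/ld+json">` tag. `render_jsonld_script` is a hypothetical name and this is not the plugin's actual code; the sample dataset values are made up.

```python
import json


def render_jsonld_script(data):
    """Return an HTML <script> element that embeds `data` as JSON-LD."""
    payload = json.dumps(data, indent=2, sort_keys=True)
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    # "\/" is a valid JSON escape for "/", so the payload still parses.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


snippet = render_jsonld_script({
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example dataset",
})
```

The resulting string would go into the HTML of the dataset page, which is where a crawler looks for it.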

maxclac commented 3 years ago

Thank you @amercader. I changed the field names but it did not help. I do not need to use schemas from schema.org. Any vocabulary will do, as long as my fields appear in the JSON-LD file. It would be great if I could manage this without writing my own custom profile, because I don't yet understand CKAN well enough to do that.

maxclac commented 3 years ago

Update: it works when I define the start and end dates as two separate top-level fields, named temporal_start and temporal_end, instead of as the startDate and endDate subfields of temporals:

    {
      "field_name": "temporal_start",
      "label": "Start Date",
      "display_property": "schema:startDate"
    },
    {
      "field_name": "temporal_end",
      "label": "End Date",
      "display_property": "schema:endDate"
    }

I think I simply need to go through the SchemaOrgProfile, see which field names it expects, and maybe adapt it to my needs.
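A quick way to check a dataset against a list of expected names could look like the sketch below. `find_missing_fields` is a hypothetical helper, and the only names used are the two confirmed in this thread; it assumes values may sit either at the top level of the dataset dict or in the `extras` list of key/value pairs.

```python
def find_missing_fields(dataset_dict, expected_names):
    """Return the expected names that have no non-empty value in the dataset."""
    # Assumption: extras is a list of {"key": ..., "value": ...} dicts.
    extras = {
        extra.get("key"): extra.get("value")
        for extra in dataset_dict.get("extras", [])
    }
    missing = []
    for name in expected_names:
        value = dataset_dict.get(name) or extras.get(name)
        if not value:
            missing.append(name)
    return missing


dataset = {
    "name": "example-dataset",
    "temporals": [{"startDate": "2019-01-01", "endDate": "2020-12-31"}],
    "extras": [{"key": "temporal_end", "value": "2020-12-31"}],
}
missing = find_missing_fields(dataset, ["temporal_start", "temporal_end"])
```

Here the nested `temporals` entry does not count, so `temporal_start` is reported as missing while `temporal_end` is found in the extras.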