geosolutions-it / oaw-georectify-process-airflow

Airflow-based project to manage the georectify process

Metadata import in GeoNode #10

Closed chpicone closed 3 years ago

chpicone commented 3 years ago

During the import stage (from the Airflow workflow) we need to read the metadata information and then publish it together with the raster image.

chpicone commented 3 years ago

We are ready to update the Airflow GeoNode Upload Operator to add metadata information. The new version of our geotiflib (from master) can now read XMP information and store it as items in the GDAL_metadata section. A method of the geotif class provides a dictionary with the required information.

| **Geonode** | **Tifftags_DC** |
| --- | --- |
| TITLE | Title |
| Date | Date (YYYY) |
| Edition | ??? |
| Abstract | Description |
| Purpose | Source (URL) |
| Free-text Keywords | Subject |
| Supplemental information | Relation |
| Data quality statement | Format |
| Typename | Identifier |

The method oaw_metadata_dict returns a dictionary containing the following keys:

chpicone commented 3 years ago

Here is an example of the usage: https://github.com/geosolutions-it/oaw-georectify-process-alg/blob/c9b64011a415008de31791cfa90c061d85559fa4/tests/test_geotiff.py#L236

chpicone commented 3 years ago

I am asking the customer whether, for the "Free-text Keywords" field, we have to split the string contained in the OAW_SUBJECT field on the ";" character.

chpicone commented 3 years ago

The customer is using hierarchical keywords, so the ";" character separates words at different hierarchical levels. [image]
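
A minimal sketch of how a ";"-separated subject string could be split into hierarchy levels, assuming each segment is one level and that surrounding whitespace is not significant. The function names and the sample value are hypothetical; they are not part of geotiflib or GeoNode.

```python
def split_hierarchical_keyword(subject):
    """Split a ';'-separated subject string into ordered hierarchy levels."""
    if not subject:
        return []
    # Trim each segment and drop empty ones produced by stray separators.
    return [level.strip() for level in subject.split(";") if level.strip()]


def hierarchy_paths(levels):
    """Return the cumulative path for each level: A, A/B, A/B/C."""
    return ["/".join(levels[: i + 1]) for i in range(len(levels))]


# Hypothetical subject value, only for illustration.
levels = split_hierarchical_keyword("Maps; Europe ; Italy;")
print(levels)
print(hierarchy_paths(levels))
```

Keeping the levels as an ordered list, rather than flattening them into independent keywords, preserves the parent/child relation until we know how GeoNode should receive it.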

@mattiagiupponi we need to ask someone from the GeoNode group whether it is possible to call the REST API in some way for this purpose.

mattiagiupponi commented 3 years ago

Regarding the metadata upload, it looks like an XML file is the best approach from a GeoNode perspective. In short:

  1. convert the metadata extracted by the library from JSON to XML
  2. in the OAW GeoNode project (https://github.com/geosolutions-it/oaw-geonode), write a custom metadata parser to handle the metadata sent by the API. Custom parsers are already available on the master branch ( GNIP: https://github.com/GeoNode/geonode/issues/7263 DOC: https://docs.geonode.org/en/master/basic/settings/index.html?#metadata-parsers )

Needs to be evaluated
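
Step 1 could be sketched roughly as follows, using only the Python standard library. This is an assumption-laden illustration: the function name `metadata_to_xml`, the root tag and the flat element layout are hypothetical, and this is not the ISO metadata schema. The custom parser on the GeoNode side would have to agree on whatever layout is chosen.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(metadata, root_tag="metadata"):
    """Serialize a flat metadata dictionary to an XML string."""
    root = ET.Element(root_tag)
    for key, value in metadata.items():
        if value is None:
            continue  # skip fields that were not found in the image
        if isinstance(value, (list, tuple)):
            # Repeat the element once per item, e.g. one <keywords> per keyword.
            for item in value:
                ET.SubElement(root, key).text = str(item)
        else:
            ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values, only for illustration.
xml_doc = metadata_to_xml(
    {"title": "Sample map", "date": "1850", "edition": None, "keywords": ["Maps", "Italy"]}
)
print(xml_doc)
```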

mattiagiupponi commented 3 years ago

At the moment the metadata are mapped in this way:

{
    "title": metadata.get('title', None),
    "date": metadata.get('date', None),
    "edition": metadata.get('edition', None),
    "abstract": metadata.get('description', None),
    "purpose": metadata.get('source', None),
    # Guard against a missing subject before splitting on ';'
    "keywords": [k.replace(' ', '') for k in (metadata.get('subject') or '').split(';') if k.strip()],
    "supplemental_information": metadata.get('relation', None),
    "data_quality_statement": metadata.get('format', None),
    "typename": metadata.get('identifier', None),
}
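
For reference, the same mapping can be written as a small table-driven function, which makes it easier to change a field name in one place. This is only a sketch: `FIELD_MAP`, `to_geonode_fields` and the sample dictionary are hypothetical names and values, and the input is assumed to be a plain dictionary with the lowercase keys used above.

```python
# GeoNode field -> key in the dictionary read from the image (same pairs as above).
FIELD_MAP = {
    "title": "title",
    "date": "date",
    "edition": "edition",
    "abstract": "description",
    "purpose": "source",
    "supplemental_information": "relation",
    "data_quality_statement": "format",
    "typename": "identifier",
}


def to_geonode_fields(metadata):
    """Build the GeoNode field dictionary from the extracted metadata."""
    fields = {target: metadata.get(source) for target, source in FIELD_MAP.items()}
    # A missing subject yields an empty keyword list instead of raising.
    subject = metadata.get("subject") or ""
    # Same keyword rule as above: split on ';' and remove spaces.
    fields["keywords"] = [k.replace(" ", "") for k in subject.split(";") if k.strip()]
    return fields


# Hypothetical sample input, only for illustration.
sample = {"title": "Sample map", "description": "A test sheet", "subject": "Maps; Italy"}
result = to_geonode_fields(sample)
print(result["keywords"])
```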