consbio / gis-metadata-parser

Parser for GIS metadata standards including ArcGIS, FGDC and ISO-19115
BSD 3-Clause "New" or "Revised" License

Writing secondary properties #7

Closed jlaura closed 3 years ago

jlaura commented 3 years ago

The docs do a great job of describing secondary lookup locations, e.g.:

FGDC_DEFINITIONS = dict({k: dict(v) for k, v in iteritems(COMPLEX_DEFINITIONS)})
FGDC_DEFINITIONS[CONTACTS].update({
    '_name': '{_name}',
    '_organization': '{_organization}'
})

This then supports reading from a secondary path such as cntorgp/cntorg in an FGDC metadata file. The docs also indicate that secondary properties are parsed, but not validated or written.
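To make the read-side behavior concrete, here is a minimal standalone sketch of a primary-then-secondary lookup. It uses only the standard library, not gis-metadata-parser itself. The helper `read_with_fallback` and the sample XML are hypothetical, and the sketch assumes the primary organization path is cntperp/cntorg, which is consistent with the XML this issue reports on write. It is not the library's actual implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical FGDC fragment: the organization lives only under cntorgp
SAMPLE = """
<metadata>
  <idinfo>
    <ptcontac>
      <cntinfo>
        <cntorgp>
          <cntorg>My Organization</cntorg>
        </cntorgp>
      </cntinfo>
    </ptcontac>
  </idinfo>
</metadata>
""".strip()


def read_with_fallback(root, primary, secondary):
    """Return the text at the primary path, else at the secondary path, else None."""
    for path in (primary, secondary):
        found = root.findtext(path)
        if found and found.strip():
            return found.strip()
    return None


root = ET.fromstring(SAMPLE)
base = 'idinfo/ptcontac/cntinfo/'

# The primary path is empty here, so the value is read from the secondary path
organization = read_with_fallback(root, base + 'cntperp/cntorg', base + 'cntorgp/cntorg')
print(organization)
```

Reading works either way in this sketch. The question below is about the write side, where only the primary path is used.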

Is there a way to make a secondary property the primary one? For example, compliant FGDC has either a primary contact person or a primary contact organization. Right now I am not sure how to serialize the latter, because the primary contact organization path (cntorgp/cntorg) is defined as the secondary property. On write, the result is XML like this:

<cntinfo>
  <cntvoice>555-555-5555</cntvoice>
  <cntperp>
    <cntorg>My Organization</cntorg>
  </cntperp>
  <cntorgp>
  </cntorgp>
  <cntaddr>
    <country>US</country>
    <postal>00000</postal>
    <state>??</state>
    <city>Home</city>
    <address>My Address</address>
    <addrtype>mailing address</addrtype>
  </cntaddr>
</cntinfo>

The desired output is:

<cntinfo>
  <cntvoice>555-555-5555</cntvoice>
  <cntorgp>
    <cntorg>My Organization</cntorg>
  </cntorgp>
  <cntaddr>
    ...
dharvey-consbio commented 3 years ago

I would try something like this:

class CustomParser(FgdcParser):

    def _init_data_map(self):
        if self._data_map is not None:
            return

        super(CustomParser, self)._init_data_map()

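        # Overridden to reverse the primary and secondary contact paths.
        # NOTE: ct_format is not defined in this snippet; it is assumed to be the
        # contact XPath template with a {ct_path} placeholder, as used in
        # FgdcParser._init_data_map.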
        self._data_structures[CONTACTS] = format_xpaths(
            FGDC_DEFINITIONS[CONTACTS],

            name=ct_format.format(ct_path='cntorgp/cntper'),
            _name=ct_format.format(ct_path='cntperp/cntper'),  # If not in cntorgp

            organization=ct_format.format(ct_path='cntorgp/cntorg'),
            _organization=ct_format.format(ct_path='cntperp/cntorg'),  # If not in cntorgp

            position=ct_format.format(ct_path='cntpos'),
            email=ct_format.format(ct_path='cntemail')
        )
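
For anyone reading along, the effect of swapping the primary path can be illustrated with a small standalone sketch. It uses only the standard library and does not call gis-metadata-parser. The helpers `write_to_path` and `write_contact_org` are hypothetical and assume, as this thread describes, that only the primary path is written.

```python
import xml.etree.ElementTree as ET


def write_to_path(parent, path, value):
    """Create any missing elements along a slash-separated path and set the leaf text."""
    node = parent
    for tag in path.split('/'):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    node.text = value
    return node


def write_contact_org(primary_path, value):
    """Serialize a cntinfo element with the organization written to the primary path only."""
    cntinfo = ET.Element('cntinfo')
    write_to_path(cntinfo, primary_path, value)
    return ET.tostring(cntinfo, encoding='unicode')


# Primary under cntperp: the organization is nested in the person block
default_xml = write_contact_org('cntperp/cntorg', 'My Organization')

# Primary under cntorgp: the organization is written as the primary contact
swapped_xml = write_contact_org('cntorgp/cntorg', 'My Organization')

print(default_xml)
print(swapped_xml)
```

With the paths swapped, the organization ends up under cntorgp, which matches the desired output described above.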
jlaura commented 3 years ago

:+1:

Thank you for the info. I am going to close this issue, as this solution is working for me right now. I will also spend some time thinking about how to handle FGDC contacts that occur in three or more places, and whether that can be handled as elegantly as this library handles other properties. A related question is how to support a primary contact that is an organization alongside a producer contact that is a specific person.