consbio / gis-metadata-parser

Parser for GIS metadata standards including ArcGIS, FGDC and ISO-19115
BSD 3-Clause "New" or "Revised" License

Serialization and New Line Characters #6

Closed. jlaura closed this issue 3 years ago

jlaura commented 3 years ago

How should one control newline characters when serializing XML metadata with custom parsers? For example, I am using the CustomFGDC parser that handles map projections. When I serialize an instance of that object, I see the following:

<metadata>
    <spref><horizsys><planar><mapproj><equirect><longcm>180.0</longcm><stdparll>0.0</stdparll><fnorth>0.0</fnorth><feast>0.0</feast></equirect><mapprojn>Equirectangular EUROPA</mapprojn></mapproj><planci><plance>coordinate pair</plance><plandu>meters</plandu><coordrep><ordres>7346.0</ordres><absres>7346.0</absres></coordrep></planci></planar></horizsys></spref><idinfo>
        <useconst>none</useconst><spdom><bounding><northbc>84.94594418192924</northbc><westbc>0.0</westbc><southbc>-84.77108883435208</southbc><eastbc>0.0</eastbc></bounding></spdom><ptcontac><cntinfo><cntpos>Research Space Scientist</cntpos><cntemail>mbland@usgs.gov</cntemail><cntperp><cntorg>Southwest Region: ASTROGEOLOGY SCIENCE CENTER</cntorg><cntper>Michael T Bland</cntper></cntperp></cntinfo></ptcontac><datacred>NASA's Galileo Mission Solid State Imaging Team</datacred><citation>
            <citeinfo>
                <pubdate>2021</pubdate><origin>Michael Bland</origin><onlink>mydoinumber</onlink><title>Photogrammetrically Controlled Galileo Images of Europa</title><geoform>raster digital data</geoform>
                <pubinfo>
                    <pubplace>Flagstaff, AZ</pubplace>
                    <publish>U. S. Geological Survey</publish>
                </pubinfo>
                </citeinfo>
        </citation>

I included extra output here to show the formatting of the projection fields (which are custom) and of the contact fields (which are not custom). How does the code decide when to include a newline? Can this be controlled via this package, or is it an issue in parseutils? (I did not see a way to be explicit in that library either, but I only briefly skimmed the code base, looking at the write methods.)

Insight much appreciated!

dharvey-consbio commented 3 years ago

Hi @jlaura,

Thanks for the question. This library uses cElementTree to parse and write XML, specifically the tostring function. It is called from the MetadataParser.serialize method.
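
To illustrate why the output above mixes indented and run-together elements, here is a minimal sketch using the standard library's ElementTree directly, not this library's internals. The sample document and the placeholder text "example" are made up for illustration. tostring writes whatever whitespace is already stored in the tree and adds none of its own, so elements that arrive with indentation (for example from a parsed file or template) keep it, while elements created in code are written back to back. That is probably the pattern in the output above, though I have not traced it through the parser itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one section that already carries indentation.
existing = (
    "<metadata>\n"
    "  <idinfo>\n"
    "    <useconst>none</useconst>\n"
    "  </idinfo>\n"
    "</metadata>"
)

root = ET.fromstring(existing)

# Elements created in code have no text/tail whitespace of their own.
spref = ET.SubElement(root, "spref")
ET.SubElement(spref, "horizsys").text = "example"  # placeholder content

# tostring emits the tree as-is: old whitespace is kept, none is added.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

The idinfo section keeps its line breaks, while spref and horizsys end up on a single line directly before the closing metadata tag.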

There isn't a built-in way to prettify metadata in this library yet, but here's what you can do to get the results you want:

from xml.dom import minidom

from gis_metadata.fgdc_metadata_parser import FgdcParser

class CustomFgdcParser(FgdcParser):
    ...

    def serialize(self, use_template=False):
        # Serialize as usual, then re-indent the result with minidom
        serialized = super(CustomFgdcParser, self).serialize(use_template)
        return minidom.parseString(serialized).toprettyxml(indent='  ')
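
One caveat, sketched below with the standard library only: when the serialized XML already contains some indentation (as in the output posted above), toprettyxml keeps those whitespace-only text nodes and they come out as blank lines. The helper name prettify_xml and the sample string are hypothetical, and dropping blank lines is just one simple workaround, not something this library provides.

```python
from xml.dom import minidom


def prettify_xml(xml_text, indent="  "):
    """Re-indent an XML string and drop the blank lines minidom leaves behind."""
    pretty = minidom.parseString(xml_text).toprettyxml(indent=indent)
    # Whitespace-only text nodes from the input are written out as blank
    # (or whitespace-only) lines; filter those out.
    return "\n".join(line for line in pretty.splitlines() if line.strip())


# Hypothetical, partially indented input similar in shape to the output above.
sample = "<metadata>\n    <idinfo><useconst>none</useconst></idinfo></metadata>"
print(prettify_xml(sample))
```

The same filtering could be applied to the return value of the serialize override above. Note that it also removes blank lines inside multi-line text values, so it may not suit metadata with long free-text fields. On Python 3.9 and later, xml.etree.ElementTree.indent is another standard-library option for re-indenting a tree before it is written.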
jlaura commented 3 years ago

@dharvey-consbio Perfect! That is awesome. I'll go ahead and get the changes into my code. Many thanks!