sshaw / ddex

DDEX metadata serialization for Ruby
https://metadatagui.com
52 stars 41 forks source link

Invalid XML generation (wrong elements order) [ERN V3.2] #16

Open ancorcruz opened 5 years ago

ancorcruz commented 5 years ago

ERN v3.2 XSD defines XML sequences for some elements (enforcing order of elements), I bet later versions of the standard do it as well.

DDEX.write generates XML with the elements in the wrong order, for example:

<DisplayArtist>
  <ArtistRole>MainArtist</ArtistRole>
  <PartyName>
    <FullName>Bruno Mars</FullName>
  </PartyName>
</DisplayArtist>

however, the valid XML is:

<DisplayArtist>
  <PartyName>
    <FullName>Bruno Mars</FullName>
  </PartyName>
  <ArtistRole>MainArtist</ArtistRole>
</DisplayArtist>

or

<ImageDetailsByTerritory>
  <TechnicalImageDetails>
    <TechnicalResourceDetailsReference>T12345</TechnicalResourceDetailsReference>
    <ImageCodecType>JPEG</ImageCodecType>
    <ImageHeight>800</ImageHeight>
    <ImageWidth>800</ImageWidth>
    <File>
      <FileName>ABCDEF_T12345.jpg</FileName>
      <FilePath>resources/</FilePath>
      <HashSum>
        <HashSum>aBcDeFgH12345XyZ</HashSum>
        <HashSumAlgorithmType>SHA1</HashSumAlgorithmType>
      </HashSum>
    </File>
  </TechnicalImageDetails>
  <TerritoryCode>Worldwide</TerritoryCode>
</ImageDetailsByTerritory>

while the valid XML is:

<ImageDetailsByTerritory>
  <TerritoryCode>Worldwide</TerritoryCode>
  <TechnicalImageDetails>
    <TechnicalResourceDetailsReference>T12345</TechnicalResourceDetailsReference>
    <ImageCodecType>JPEG</ImageCodecType>
    <ImageHeight>800</ImageHeight>
    <ImageWidth>800</ImageWidth>
    <File>
      <FileName>ABCDEF_T12345.jpg</FileName>
      <FilePath>resources/</FilePath>
      <HashSum>
        <HashSum>aBcDeFgH12345XyZ</HashSum>
        <HashSumAlgorithmType>SHA1</HashSumAlgorithmType>
      </HashSum>
    </File>
  </TechnicalImageDetails>
</ImageDetailsByTerritory>

The root of the issue is in classes that inherit from other classes that define xml_accessors like:

# file: "lib/ddex/v20100712/ddexc/display_artist.rb"
class DDEX::V20100712::DDEXC::DisplayArtist < DDEX::V20100712::DDEXC::PartyDescriptor
  ...
  xml_accessor :artist_roles, :as => [DDEX::V20100712::DDEXC::ArtistRole], :from => "ArtistRole", :required => false
  ...
end

# file: " lib/ddex/v20100712/ddexc/party_descriptor.rb"
class DDEX::V20100712::DDEXC::PartyDescriptor < Element
  ...
  xml_accessor :party_names, :as => [DDEX::V20100712::DDEXC::PartyName], :from => "PartyName", :required => false
  ...
end

# file: "lib/ddex/ern/v32/image_details_by_territory.rb"
class DDEX::ERN::V32::ImageDetailsByTerritory < DDEX::V20100712::DDEXC::ImageDetailsByTerritory
  ...
  xml_accessor :technical_image_details, :as => [DDEX::ERN::V32::TechnicalImageDetails], :from => "TechnicalImageDetails", :required => false
  ...
end

# file: "lib/ddex/v20100712/ddexc/image_details_by_territory.rb"
class DDEX::V20100712::DDEXC::ImageDetailsByTerritory < Element
  ...
  xml_accessor :excluded_territory_codes, :as => [], :from => "ExcludedTerritoryCode", :required => false
  xml_accessor :territory_codes, :as => [], :from => "TerritoryCode", :required => false
  ...
end

XML elements defined in the parent classes are written in the XML output after the elements defined in the child class, however this is not what the XSD defines.

I'm not quite sure how to fix this issue. The only idea I have right now is define each XML element in a single class without inheritance from other XML elements. Not an small change.

ancorcruz commented 5 years ago

@sshaw Do you have any ideas on how to tackle this issue?

sshaw commented 5 years ago

Hi, invalid XML. No good. Though after all this time this makes me think that no one is validating.

Anyways, this is an issue with ROXML but, adding this to DDEX::Element may fix things:

    def self.roxml_attrs
      super.reverse
    end

    def roxml_references
      super.reverse
    end

Let me know if that works for you. If so, I will create a release with it and open an issue or PR in ROXML.

ancorcruz commented 5 years ago

I have been using the gem for parsing and processing the DDEX ERN/ECHO files for a while. However, I'm half way with XML files creation for the outbound feed now and found this issue. Are you writing DDEX XML files with the gem?

Your suggested change flips the order of everything, it might require some tweaking. I'll check with the aggregator if they accepts the generated XML as valid before further investigation.

Thanks.