sshaw / ddex

DDEX metadata serialization for Ruby
https://metadatagui.com

Can't Open xml document #1

Closed germs12 closed 10 years ago

germs12 commented 10 years ago

Hi,

I'm working on parsing DDEX files and found your gem. It doesn't appear to work as the README.md states, but there is a ton of code and the majority of the tests pass. I would love to contribute to this if I could, but I can't seem to get going with the gem. Is there something I need to do to make it work? Thanks for your assistance; I hope I can help move this gem forward.

sshaw commented 10 years ago

What DDEX spec and version are you working with? Most of this work is for ERN v3.4. If you're dealing with ERN, what type of metadata are you trying to parse (music, video, software, ...)?

I think it could be ready to deal with ERN music metadata with a small amount of additional work, but only in a very basic (pre-release) manner, as there are some important underlying issues that need to be addressed. So far I've put these off just to get through the grunt work of creating the class mappings (though they should be addressed now, otherwise additional work could be counterproductive). They are:

DDEX versions

By this I mean different ERN versions (though support for all specs is the goal). Does one create a different namespace for each version?

doc = DDEX.read("metadata-4.0.xml")  # returns DDEX::ERN::V40::NewReleaseMessage
doc = DDEX.read("metadata-3.2.xml")  # returns DDEX::ERN::V32::NewReleaseMessage

Or just create a mapping that's capable of handling all versions in a single class?
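
As an illustration of the namespace-per-version option, here is a minimal sketch of how a reader could pick the class from the document itself. Everything in it is hypothetical: the `VersionDispatchSketch` module, the `message_class_for` method, and the assumption that the root element carries a `MessageSchemaVersionId` attribute ending in `ern/NN` are illustrative only and not the gem's actual API.

```ruby
# Hypothetical sketch: none of these names come from the ddex gem.
module VersionDispatchSketch
  module ERN
    module V32
      class NewReleaseMessage; end
    end

    module V40
      class NewReleaseMessage; end
    end
  end

  # Version string found in the document => namespace holding its mappings.
  VERSIONS = {
    "32" => ERN::V32,
    "40" => ERN::V40
  }.freeze

  # Assumption: the version appears as MessageSchemaVersionId="ern/32"
  # (optionally with a leading path) on the root element.
  VERSION_PATTERN = /MessageSchemaVersionId="[^"]*?ern\/(\d+)"/

  def self.message_class_for(xml)
    version = xml[VERSION_PATTERN, 1]
    raise ArgumentError, "no ERN version found" if version.nil?

    namespace = VERSIONS.fetch(version) do
      raise ArgumentError, "unsupported ERN version: #{version}"
    end
    namespace.const_get(:NewReleaseMessage)
  end
end

xml = '<ern:NewReleaseMessage MessageSchemaVersionId="ern/32"/>'
puts VersionDispatchSketch.message_class_for(xml)
# prints "VersionDispatchSketch::ERN::V32::NewReleaseMessage"
```

The lookup table keeps version detection in one place, at the cost of one full set of mapping classes per version.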

Turning objects into valid XML

DDEX uses XML Schema and its sequence directive to impose an order on the elements. How does one translate this order into the XML mapping? I'm using ROXML, which defaults to generating elements in the order in which they're declared. This is fine, but limits code reuse (or at least the approach to reusing code).

For example, ern:SoundRecording differs from ddexC:SoundRecording by 3 elements and 1 attribute. But if one makes ern:SoundRecording a subclass of ddexC:SoundRecording (which is currently the case), to_xml could generate invalid XML, as there's no guarantee that the schema will allow base class elements to come before elements in their subclasses. This is also a potential downside to the namespace-by-version approach.
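
The ordering problem can be reproduced without any XML library. The sketch below uses a hypothetical mini-DSL (`MappedElement`, `element_order`) that stands in for a mapper emitting elements in declaration order with inherited declarations first. The element names and the `XSD_SEQUENCE` are made up for illustration and are not taken from the DDEX schemas.

```ruby
# Hypothetical stand-in for a mapper that serializes in declaration order.
class MappedElement
  def self.xml_accessor(name)
    own_elements << name
    attr_accessor name
  end

  # Elements declared directly on this class (one list per class).
  def self.own_elements
    @own_elements ||= []
  end

  # Base-class elements come first, then this class's own declarations.
  def self.element_order
    inherited = superclass.respond_to?(:element_order) ? superclass.element_order : []
    inherited + own_elements
  end
end

class CommonSoundRecording < MappedElement
  xml_accessor :common_1
  xml_accessor :common_2
end

class ErnSoundRecording < CommonSoundRecording
  xml_accessor :ern_only
end

# Suppose the schema's sequence places the subclass element in the middle.
XSD_SEQUENCE = %i[common_1 ern_only common_2].freeze

p ErnSoundRecording.element_order                  # [:common_1, :common_2, :ern_only]
p ErnSoundRecording.element_order == XSD_SEQUENCE  # false
```

With inheritance alone, the subclass can only append to the base class's order, so any schema that interleaves the two sets of elements cannot be matched.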

In this case I think that instead of using modules/classes to define behavior and relationships one would have to use modules to define helper methods that were then called in XSD sequence order when mapping a class:

class SoundRecording < Element
  include DDEX::SoundRecording::Accessors

  # Declarations appear in XSD sequence order
  xml_accessor :x
  common_element_1
  common_element_2
  xml_accessor :y
end
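
A self-contained sketch of this helper-method idea follows. `SketchElement`, `RecordingAccessors`, and the element names are hypothetical, and `xml_accessor` here is a tiny stand-in that only records declaration order, not ROXML's real implementation. The sketch uses `extend` so the helpers are callable in the class body; an `included` hook that extends the class would achieve the same with `include`.

```ruby
# Hypothetical base class: records the order in which elements are declared.
class SketchElement
  def self.xml_accessor(name)
    (@element_order ||= []) << name
    attr_accessor name
  end

  def self.element_order
    @element_order || []
  end
end

# Shared declarations, packaged as class-level helper methods.
module RecordingAccessors
  def common_element_1
    xml_accessor :common_1
  end

  def common_element_2
    xml_accessor :common_2
  end
end

class SketchSoundRecording < SketchElement
  extend RecordingAccessors

  # Each call lands at its position in the (made-up) XSD sequence.
  xml_accessor :x
  common_element_1
  common_element_2
  xml_accessor :y
end

p SketchSoundRecording.element_order  # [:x, :common_1, :common_2, :y]
```

Shared elements are still defined once, but each class decides where they fall in its own sequence.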

Or, maybe it's better to define the XSD sequence order separately:

class SoundRecording
  ELEMENT_ORDER = %w[x y z]

  def to_xml
    e = super
    # need to account for Nokogiri and libxml elements
    e.sort_by { |node| ELEMENT_ORDER.index(node.name) }
    # etc...
  end
end
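
To make the sorting step concrete, here is a minimal sketch that uses a plain `Struct` in place of Nokogiri or libxml nodes. The `Node` struct and the `in_sequence_order` helper are hypothetical. Two details matter in practice: `Array#index` returns `nil` for names missing from the list, which would make `sort_by` raise, and Ruby's `sort_by` is not guaranteed to be stable, so repeated elements need a tie-breaker.

```ruby
# Hypothetical stand-in for an XML node; only the name matters for ordering.
Node = Struct.new(:name, :text)

ELEMENT_ORDER = %w[x y z].freeze

# Returns the nodes sorted by their position in `order`.
# Unknown names sort last; the original index keeps repeats in input order.
def in_sequence_order(nodes, order = ELEMENT_ORDER)
  nodes.each_with_index.sort_by do |node, position|
    [order.index(node.name) || order.size, position]
  end.map(&:first)
end

nodes = [
  Node.new("z", "3"),
  Node.new("x", "1"),
  Node.new("extra", "?"),
  Node.new("y", "2"),
  Node.new("x", "1b")
]

sorted = in_sequence_order(nodes)
puts sorted.map(&:name).join(" ")  # prints "x x y z extra"
```

Keeping the order in a constant leaves the class hierarchy free for code reuse, since serialization no longer depends on declaration order.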

So yeah, that's where I'm at. Help is definitely appreciated.

sshaw commented 10 years ago

Closing this as things are somewhat more stable (still not production ready, though).

I still haven't generated and added all the DDEX specs to the lib, but doing so should now be trivial (see jaxb2ruby and the generate Rake task). Let me know if you're using something that's not ERN and I'll try to add it.

germs12 commented 10 years ago

@sshaw Thanks for your reply. Sorry to have gone radio silent, but I am back on the DDEX bandwagon and should be working with your gem more frequently moving forward. I submitted a PR for an issue I encountered. Let me know if there is a flow you'd like me to follow. Thanks!