samvera / active_fedora

A Rails interface to the Fedora repository, akin to ActiveModel
Other
54 stars 63 forks source link

Mass assignment, Computed properties and Complex properties in RDF #75

Closed flyingzumwalt closed 11 years ago

flyingzumwalt commented 11 years ago

This ticket is a placeholder for UCSD work improving support for complex RDF content.

Mass assignment of Nodes from a Hash

(ie. update_attributes) All RDF properties accept hash arguments, allowing sub-hashes to cascade to children

params = {
        myResource: {
          topic: "Cosmology",
          temporal: "16th Century",
          personalName: {
            fullName: "Jefferson, Thomas",
            dateName: "1743-1826" 
          },
          personalName: {
            fullName: "Jefferson, Thomas",
            dateName: "1743-1826" 
          },
          corporateName: {
            name: "University of California, San Diego.",
            name: "University Library",  
          },
          complexSubject: {
            personalName: {
              fullName: "Jeffersion, Thomas",
              dateName: "1743-1826" 
            },
            topic: "Presidency", 
            temporal: "1801-1809" 
          },  

        }
      }
      @resource = DamsResource.new()
      @resource.attributes = (params[:myResource])

Note: NOT implementing update_attributes because, due to the nature of RDF graphs, we cannot support updating existing nodes. You would need a full RDF graph in that case (predicates, URIs, etc.). This work is only supporting mass-setting of the graph and mass updating the graph from Hashes of properties and values (as opposted to URIs, predicates, and values).

Computed properties (postponed)

\ Not implementing computed properties at this time. **

ie.value of mads:authoritativeLabel element is computed based on the other stuff in the parent node.

Formula for specifying computed fields:

flyingzumwalt commented 11 years ago

Code Review: Issues to Discuss

Behavior of TermProxy, .nodeset, array accessors, etc.

The changes in this commit aspire to match the behavior of OM dynamic nodes, but there is some dispute about what the actual behavior is/should be. The difference impact deployed code in Sufia and Bawstun, but it's not clear which code should be refactored. It mainly revolves around which methods should return Proxy/Dynamic objects and what should be returned by certain methods called on those Proxy/Dynamic objects. There are also some inconsistencies around methods with similar names and behaviors (ie. .value, .val, .values)

Q: How should each of these behave? What should they return?

@ds.personalName @ds.personalName[3] @ds.personalName(3) @ds.personalName.first @ds.personalName.last

Q: For each of those, what happens when you call

.value (implemented on RDFNode) .values (implemented on RDFNode::TermProxy) .val (implemented on OM::DynamicNode, not in AF::RDFNode) == .class .nodeset .proxy ???

Q: For each of those, what happens when you call it with the name of a property defined on the corresponding node?

ie.

@ds.personalName(3).elementList
@ds.personalName[2].elementList
@ds.personalName.last.elementList

Commits: *https://github.com/projecthydra/active_fedora/commit/a8299fdc06f8e9d54d9b36ec3032dbf1b23152e6

Insertion Operator <<

Q: Is this ok as it's implemented?

Complex Properties

Does any of this belong in ActiveFedora? Would it belong in an rdf-node gem or a complex-rdf-node gem?

retrive_node and retrieve_values

Accept an array of properties that lets you traverse the graph to retrieve a node or its value.

Q: Does this belong in AF? It's very similar to OM behavior.

default_write_point_for_values, value= & value

Allows you to Read+Write to/from default_write_point_for_values. This makes it possible for complex properties lie MADS::Topic to behave in the API as if they were simple RDF assertions.

Q: Should this be pushed to a separate gem that extends RDF::Node?

Commits

Sub-Topic: Defaulting to use of RDF.value when nothing specified

Q: Does it make sense for Nodes with a defined class to use RDF.value predicate to assert their value when no default value write point has been declared on the Node class? If so, what is the best place to inject the :value property into the class config?

Had to hack RDF module to allow you to use the RDF.value predicate. It seemed like a bug in rdf.rb the we weren't able to use that predicate. Is there a better workaround?

# Ugly hack to allow defining of properties in the RDF vocabulary's namespace (ie. RDF.type, RDF.value) within map_properties
module RDF
  # This enables RDF to respond_to? :value so you can make assertions with http://www.w3.org/1999/02/22-rdf-syntax-ns#value
  def self.value 
    self[:value]
  end
end

Set Graph or Insert to Graph from Hashes using attributes= and '<<'

Q: Should I be able to do this:

See https://github.com/projecthydra/active_fedora/blob/complex_rdf/spec/unit/rdf_complex_properties_spec.rb#L5-L118 for full Class tree that this relies on/assumes.

values_hash = {
          complexSubject: [{
            componentList: {
              personalName: {
                elementList: {
                  fullNameElement: "Callender, James T.",
                  dateNameElement: "1805" 
                }
              },
              topic: "Presidency", 
              temporal: "1801-1809"
            },
          },
          {
            componentList: {
              personalName: {
                elementList: {
                  fullNameElement: "Hemings, Sally",
                  dateNameElement: "1802" 
                }
              },
              topic: "Slavery"
            }
          }]
        }
        @ds.attributes = values_hash

Q: Does this functionality belong in ActiveFedora?

Applying Rails conventions to mass-assignment

Q: Would it be better/necessary to apply the Rails convention around delegating attributes across associations?

jcoyne commented 11 years ago
@ds.personalName
 #=>  [<DummyMADS::PersonalName:0x007f96d8a193b8 @graph=#<RDF::Repository:0x3fcb6c5556f0()>,
          "Hemings, Sally"]

          @ds.personalName.should == [<DummyMADS::ElementList:0x007f96d8a193b8 @graph=#<RDF::Repository:0x3fcb6c5556f0()>>, "Hemings, Sally"]

          @ds.personalName(0).elementList(0).fullNameElement.should == ["Jefferson, Thomas"]
          @ds.personalName(0).elementList(0).dateNameElement.should == ["1743-1826"]
          @ds.personalName(1).should == ["Hemings, Sally"]

Basically what you put in, is what you get back out. If personalName[0] is pointing at a ElementList node, then return that node. If personalName[1] is pointing at a string, return that. This does mean that the user needs to check to see if the result is a literal before trying to call a method (like elementList), but this wouldn't be the case if they used a consistent schema.

A: I think this does belong in active-fedora, but I'm also of the opinion that OM belongs in active-fedora too. I only know of one person who is using OM outside of the context of active-fedora.

A: I think what you are doing is pretty good. I've also stumbled onto that problem before.

A: yes, we ought to have this, I'd like the RDFObject to support acceptes_nested_attributes_for like ActiveRecord: http://api.rubyonrails.org/classes/ActiveRecord/NestedAttributes/ClassMethods.html

jcoyne commented 11 years ago

Here's my take on how we can support accepts_nested_attributes_for:

https://github.com/projecthydra/active_fedora/compare/rdf_nested_attributes

This approach mimics how ActiveRecord handels nested attributes and would be compliant with the fields_for view helper. http://apidock.com/rails/ActionView/Helpers/FormHelper/fields_for

I say that we leave out the expansion of terms (e.g. default_write_point_for_values), or at least encapsulate that into it's own module. I think that is a very small use case and I'd rather not have it in ActiveFedora if we don't need it. I think it makes the code more difficult to read due to the added indirection. It's not clear that when I update ds.personalName = "Hemings, Sally" that it actually created a new node at personalName.elementList.fullNameElement, especially since it would be legal in rdf to actually just put a string at ds.personalName. The mechanism for doing the transformation between the hash coming from the form and the hash sent to attributes= may even belong in the controller.

flyingzumwalt commented 11 years ago

Decision: use nested_attributes approach and move recursive-expansion to .build method

Based on discussions with @jcoyne I'll be using his rdf_nested_attributes approach. Regarding expansion of terms, (ie. when you pass a string to set a property with class_name in its convig), the set_values method (and therefore the = operator) will return an ArgumentError but, as an alternative, the .build method on all properties will accept an optional value parameter that can be a Node, String, or Hash.

In other words, the recursive-expansion behavior that's currently implemented in the complex_rdf branch on set_values will instead be available on the build method. This allows for clarity when using the set_values and the assignment operator. What you set as a property's value using =, the value you provided is what you will get back; it's never interpreted or transformed into something else. If the value you provided is something that isn't valid for insertion into the graph, you get an error.

mjgiarlo commented 11 years ago

Related: A few weeks ago I started a companion to the "Tame your XML with OM" guide that focuses on working with RDF metadata in Hydra:

https://github.com/projecthydra/active_fedora/wiki/Tame-your-rdf-metadata-with-activefedora

Those pages might be a good place for any documentation that comes out of this work you're doing on RdfNode and the more complex assertions, @flyingzumwalt.