microsoft / GraphEngine

Microsoft Graph Engine
MIT License
2.2k stars 329 forks source link

Trinity.TSL: consider overriding object.Equals and object.GetHashCode #59

Open yatli opened 7 years ago

yatli commented 7 years ago

When we first build the codegen, we provided overridden implementations for == and !=, but not Equals and GetHashCode. This brings trouble for expressions like if (a == null) (we have to write if ((object)a == null). The compiler gives warnings about this. This makes it easy for us to add the missing functions once we migrate to the new template-based codegen, as we can have the compiler warnings available to us even at the template stage. However, on the other hand, we have to make sure that this is not a breaking change, for that overriding these two function may change the behavior of the code.

TaviTruman commented 7 years ago

@yatli I think this will break my RDF API sitting atop of the GE generated code; as structural type checking and type-value checking is critical behavior changes get especially chancy with optional and dynamically injected Generic cells/fields in the TSL model. I am using typeof() and default() to discover, verify and initialize types projected onto the GE representation (RDF/RDFS/XML to TSL) of the RDF graph. I'll have to test more but does this change adversely effect something like Trinity.RDF? In my Materializer I'm using the generation of Generic Cells and injecting the type into a "Optional" List types - List of CellId, List of Type [Metadata], e.g. [GraphEdge] populated. I am using these methods quite a bit:

ICell.SelectFields() ICell.EnumerateValues()

I have to implement a custom IComparable for equality test so the behavior of object.equals and object.GetHashCode is critical.

yatli commented 7 years ago

@TaviTruman Currently the overridden == and != operators compare not only reference equality, but also the cell content if both accessors are valid (for cell structs it compares all the fields). But since we haven't overridden object.Equals, the behavior defaults to "reference equality" for accessors, "field-by-field equality" for the corresponding structs. Likewise for object.GetHashCode, accessors get "reference hashcode" and structs get "field-by-field hash code aggregation".

But if you are using ICell it would be a different story, because == on interfaces always means "reference equality", but Equals calls directly into the implementation, which defaults to "field-by-field equality" for codegen'ed structs, but "reference equality" if you implement a class adapter the implements ICell.

So I think we should definitely investigate and discuss more on this topic before we make a conclusion on how to approach it. :)

TaviTruman commented 7 years ago

Hello @yatli - that is exactly what I thought. This is an important aspect for me as TSL Schema like knowledge (RDFS/OWL) and taxon (SKOS) object information is gotten at run-time and the shape of the knowledge, information and data is not known ahead of time - this is mostly due to Ontology-based inferences materialized by our reasoners. I do use ICell generics a lot so as to model data that must be introduced into GE but knowledge and such schemata exist outside of the design-time TSL and resultant code gen. So part of what need to happen via Graph Exploration works well and reference equality is not an issue; however, as you have stated difficult with code-generated class that implement ICell. This should be a lot of fun :-)