CIM 100

derrickoswald commented 7 years ago

The CIMReader is currently coded for CIM17, or rather for one specific combination of CIM17 packages (iec61970cim17v34_iec61968cim13v12_iec62325cim03v17a.eap) that has been labeled CIM100.

For generality, the CIMReader needs to be able to:

- allow for two or more different sets of CIM model classes, organized by version, in the source code repository
- identify one set as the default version (softlink?)
- allow for static compilation of client code against a specific model
- identify the model version to be used on import from the CIM namespace in the file header, which may need a heuristic based on namespaces seen in the wild, e.g.:
  - xmlns:cim="http://iec.ch/TC57/2016/CIM-schema-cim17#"
  - xmlns:cim="http://iec.ch/TC57/2013/CIM-schema-cim16#" ← our original namespace
  - xmlns:cim="http://iec.ch/TC57/2012/CIM-schema-cim17#"
  - xmlns:cim="http://iec.ch/TC57/CIM100#"
- allow for dynamic programmatic selection of the model version classes based on the RDF header (perhaps by altering the Java classpath), so that client code need not be concerned with a specific version (i.e. import cim17.model vs. import cim16.model) unless it uses new or changed classes
- allow for older models (CIM16, CIM15, CIM14) in the same jar file
- allow for conversion between versions, either upgrade or downgrade
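
A rough sketch of what the namespace heuristic might look like, assuming only the two namespace shapes listed above. The object and method names (CimVersionHeuristic, versionOf, fromHeader) are made up for illustration and are not part of the CIMReader, and falling back to "CIM100" is just one possible default:

```scala
import scala.util.matching.Regex

// Hypothetical helper, not part of CIMReader: guesses the CIM version from the cim namespace.
object CimVersionHeuristic {
  // Namespaces shaped like http://iec.ch/TC57/2013/CIM-schema-cim16#
  private val SchemaStyle: Regex = """http://iec\.ch/TC57/\d{4}/CIM-schema-cim(\d+)#""".r
  // Namespaces shaped like http://iec.ch/TC57/CIM100#
  private val PlainStyle: Regex = """http://iec\.ch/TC57/CIM(\d+)#""".r
  // The xmlns:cim declaration as it appears in the RDF header text
  private val Declaration: Regex = """xmlns:cim\s*=\s*["']([^"']+)["']""".r

  /** Returns a label such as "CIM16" or "CIM100", or None if the namespace is not recognised. */
  def versionOf(namespace: String): Option[String] = namespace.trim match {
    case SchemaStyle(n) => Some("CIM" + n)
    case PlainStyle(n)  => Some("CIM" + n)
    case _              => None
  }

  /** Looks for the first xmlns:cim declaration in the header; uses the (assumed) default otherwise. */
  def fromHeader(header: String, default: String = "CIM100"): String =
    Declaration
      .findFirstMatchIn(header)
      .flatMap(m => versionOf(m.group(1)))
      .getOrElse(default)
}
```

This returns the label taken literally from the namespace; whether a cim17 namespace should then be mapped onto the CIM100 model is a separate decision that the sketch leaves open.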

derrickoswald commented 5 years ago

The CIM100 model has been created from CIMTool and committed to the master branch, so it is now the de facto default.

derrickoswald commented 4 years ago

One way to provide various CIM version models (where model == Java jar compiled from Scala source files) is to create a Maven package for each version.

So, the Maven coordinates for CIM100 might be:

ch.ninecode.cim.model:CIM100:2.11-2.4.5-4.1.4

(Note: the package was previously called simply ch.ninecode.model)

Ignoring the "conversion between versions" use case, the model to be used by the CIMReader, and other programs such as CIMExport, could be specified on the command line for spark-shell and spark-submit with the --packages ch.ninecode.cim.model:CIM100:2.11-2.4.5-4.1.4 option.

If none is specified, it could be treated as an error, but then the user would need to know the Maven coordinates and would have to determine the CIM version by looking at the CIM namespace within the file.

A better approach would be to have some sort of default, or better yet, to examine the namespace in the header of the file (or files) and use a heuristic mentioned in this issue description to load the correct jar.

Now, where to get the jar? One approach is to hard-code some repository locations, just like Spark does, but pointing directly at the specific jar. For example:

https://repo1.maven.org/maven2/ch/ninecode/cim/model/CIM100/2.11-2.4.5-4.1.4/CIM100-2.11-2.4.5-4.1.4.jar
https://dl.bintray.com/spark-packages/maven/ch/ninecode/cim/model/CIM100/2.11-2.4.5-4.1.4/CIM100-2.11-2.4.5-4.1.4.jar
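
The jar locations follow the standard Maven repository layout, so they could be derived from the coordinates rather than written out by hand. A minimal sketch, assuming a hypothetical ModelJarLocator helper (none of these names exist in CIMSpark) and a caller-supplied list of repository base URLs:

```scala
// Hypothetical helper: derives candidate jar URLs from Maven coordinates.
object ModelJarLocator {
  final case class Coordinates(group: String, artifact: String, version: String) {
    // Standard Maven layout: group/with/slashes/artifact/version/artifact-version.jar
    def jarPath: String =
      group.replace('.', '/') + "/" + artifact + "/" + version + "/" + artifact + "-" + version + ".jar"
  }

  /** Parses "group:artifact:version"; returns None for anything else. */
  def parse(gav: String): Option[Coordinates] = gav.split(":") match {
    case Array(g, a, v) if g.nonEmpty && a.nonEmpty && v.nonEmpty => Some(Coordinates(g, a, v))
    case _ => None
  }

  /** One candidate URL per repository base, in the order given. */
  def candidateUrls(c: Coordinates, repositories: Seq[String]): Seq[String] =
    repositories.map { repo =>
      val base = repo.stripSuffix("/") // tolerate a trailing slash on the base URL
      base + "/" + c.jarPath
    }
}
```

For example, parsing ch.ninecode.cim.model:CIM100:2.11-2.4.5-4.1.4 and passing https://repo1.maven.org/maven2 as the only repository yields the first URL shown above.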

derrickoswald commented 4 years ago

The CIMReader can (I think) fairly easily handle underlying class changes if CHIM.scala delegates the ClassInfo list (CHIM.classes) to the specific model module (something like a ClassList object), assuming that the Element, BasicElement and Unknown classes are retained in the CIMReader, since it uses the abstract methods on Element. Obviously, to avoid circular references, the CIMReader cannot depend on any of the version-specific CIM class modules.
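
To make the delegation idea concrete, here is a heavily simplified sketch. Element, Unknown and ClassInfo below are stand-ins with invented signatures, not the real CIMReader types, and Reader, ClassList and Cim100Classes are hypothetical names; the point is only that the reader side compiles without referring to any versioned class:

```scala
object ClassListSketch {
  // --- would stay in the reader module: no reference to any versioned model ---
  trait Element { def id: String }
  final case class Unknown(id: String, cls: String) extends Element

  // Simplified stand-in for ClassInfo: a class name plus a factory
  final case class ClassInfo(name: String, build: String => Element)

  // Each versioned model module would supply one implementation of this
  trait ClassList {
    def version: String
    def classes: Seq[ClassInfo]
  }

  // The reader is handed a ClassList instead of owning the list itself
  final class Reader(model: ClassList) {
    private val byName: Map[String, ClassInfo] =
      model.classes.map(c => c.name -> c).toMap

    // Falls back to Unknown when the model has no class of that name
    def create(cls: String, id: String): Element =
      byName.get(cls).map(_.build(id)).getOrElse(Unknown(id, cls))
  }

  // --- would live in a separate, version-specific module ---
  object Cim100Classes extends ClassList {
    final case class ACLineSegment(id: String) extends Element
    val version: String = "CIM100"
    val classes: Seq[ClassInfo] =
      Seq(ClassInfo("ACLineSegment", id => ACLineSegment(id)))
  }
}
```

With this shape the dependency points only from the model module to the reader, so there is no circular reference.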

The preprocessors (CIMAbout, CIMNormalize, CIMDeDup) only depend on Element, so they are CIM version independent by definition.

The difficulty then would be to produce CIM version independent post processors (CIMEdges, CIMNetworkTopologyProcessor, CIMJoin), which currently depend heavily on a specific version of the CIM classes. CIMExport, as a standalone module, also depends on a specific version of the CIM classes.

It would be easier if the CIM classes didn't need to extend Row which is Spark version dependent. I wonder if it's possible to get rid of that requirement.

derrickoswald / CIMSpark

CIM 100 #8