Closed martonnovak closed 4 years ago
Hi Martin,
Congratulations on getting the example running correctly.
Firstly, given that this information will help you load data into the Analysis Repository and not the Information Store, be really sure that this is what you would like to do. It may be worth reaching out to your IBM representative to see if a different approach could also match your needs.
If you decide that you need to continue, you need to map the external representations in your data to an i2 schema, which you can create and modify using schema designer.
A valid xsd file can be created via the toolkit:
On the i2 Analyze server, open a command prompt and navigate to the scripts directory of the i2 Analyze deployment toolkit.
Run the following command to generate a .jar file that contains the XML representation:
setup -t generateMappingJar -x i2analyze_schema -o definition_jar
Here, i2analyze_schema is the full name (including the path) of the schema file, and definition_jar is the full name (including the path) of the output .jar file.
Warning
If definition_jar specifies an existing file, the generateMappingJar task does not overwrite it. If you run the same command twice in succession, ensure that you move or delete the output between calls.
Inside the .jar file, the name of the XML file that you need is schema4.xsd.
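If it helps, the extraction step can be scripted. Here is a minimal Java sketch (the class and method names are our own, not part of the toolkit) that reads schema4.xsd out of the generated .jar, which is just a zip archive. For the demo it builds a stand-in archive first; in practice you would point readEntry at the file produced by generateMappingJar.

```java
import java.io.*;
import java.nio.file.*;
import java.util.zip.*;

public class ExtractXsd {

    // Reads a single entry (for example schema4.xsd) out of a .jar file.
    static String readEntry(Path jar, String entryName) throws IOException {
        try (ZipFile zf = new ZipFile(jar.toFile())) {
            ZipEntry entry = zf.getEntry(entryName);
            if (entry == null) throw new FileNotFoundException(entryName + " not in " + jar);
            return new String(zf.getInputStream(entry).readAllBytes());
        }
    }

    // Demo only: build a stand-in jar so the example is self-contained,
    // then pull schema4.xsd back out of it.
    static String demo() throws IOException {
        Path jar = Files.createTempFile("definition", ".jar");
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(jar))) {
            zos.putNextEntry(new ZipEntry("schema4.xsd"));
            zos.write("<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\"/>".getBytes());
            zos.closeEntry();
        }
        return readEntry(jar, "schema4.xsd");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```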
Hope this helps
Esther
Thank you for your fast reply!
However, we still couldn't work out how the underlying methods transform the data to the correct format. In the example, one can see that there are Actors and Actor entities in the data1.xml file. Can you provide or explain to us the function/method that transforms these entities into the existing items defined in the mentioned analyze schema? (We are using the law-enforcement schema.) That is, the way that Actor is converted to Person.
Thanks in advance! Marton
Hi Marton
As you have probably seen, the onyx-da-arload-filesystem-example project uses common code from another project that you load into your IDE, called onyx-da-example-common.
If you follow the “load” method from ExampleDataLoaderMain, you will see that it calls exampleDataLoader.load.
This load method calls createTransformedXmlSource with a variable that holds the name of the file containing the data to load. In this case, as you have already seen, it is a simple file called data1.xml. (On purpose, the data in this file is not in the same shape as the schema we are using, so that we can show one way of converting it to what we need.) This method then calls out to a method in …\SDK\sdk-projects\onyx-da-example-common\src\main\java\com\example\ExampleXmlTransformer.java and returns XML in the shape we want. (I will describe this below.)
In order for you to be able to load data into the AR, you have to give it to us as XML in a format that we understand.
When you ran generateMappingJar for your Analysis Repository schema, it created a set of compiled Java classes that know how to map from XML matching a simplified representation of the entities and links in that schema to our internal types. So that you do not have to guess what this XML should look like, we also create the XSDs that define what we are expecting from you; as Esther mentioned earlier, schema4.xsd holds all that you need from your perspective.
You can decide how you want to do this; we do not mind. All we require is that you give us XML that we understand.
Going back to the specific example you have been running: Our example uses an external data input file which is XML in a different shape than we need, as I have mentioned above. We have to convert this XML into our own XML format so that we can then build the correct internal classes for the schema.
The simplest way for us to change the external XML into our XML format is by using XSLT. (This is just a way to detect strings in the input XML, for example ‘Actor’, and write different XML as a result, for example ‘Person’.)
If you look in …\SDK\sdk-projects\onyx-da-example-common\fragment\WEB-INF\classes\dataToI2analyze.xslt you can see it is written to do exactly that, and will only work for a very specific input and creates a very specific output.
If you are doing a Proof of Concept where you want to ingest data from an XML file that the client gives you, then you could just alter this XSLT file so that it matches the XML they give you and can then be used to convert it.
We have also included an example wrapper class, \SDK\sdk-projects\onyx-da-example-common\src\main\java\com\example\ExampleXmlTransformer.java as mentioned above, with a transformSourceSystemXml method that takes two parameters: the incoming XML data file and the XSLT file to use in transforming it. This returns XML in the format we want, matching the XSD specification.
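To make the mechanism concrete, here is a self-contained sketch of the same idea. This is not the SDK code: the stylesheet is a toy stand-in for dataToI2analyze.xslt and the class name is invented, but it uses the standard javax.xml.transform API to rewrite Actor elements as Person elements, just as described above.

```java
import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;

public class XsltTransformDemo {

    // Toy stylesheet: copy everything unchanged, but rewrite <Actor> as <Person>.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "  <xsl:template match=\"@*|node()\">"
      + "    <xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "  </xsl:template>"
      + "  <xsl:template match=\"Actor\">"
      + "    <Person><xsl:apply-templates select=\"@*|node()\"/></Person>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    static final String INPUT = "<Actors><Actor><Name>Marton</Name></Actor></Actors>";

    // Applies the stylesheet to the input and returns the resulting XML.
    static String transform(String xslt, String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Prints <Actors><Person><Name>Marton</Name></Person></Actors>
        System.out.println(transform(XSLT, INPUT));
    }
}
```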
We do not expect that you would make a production system using the classes as they are, but hope that they give you an understanding of the basic mechanism that you can change and build upon.
As you can see, input and output in this example are tied together by the XSLT file, so any change you make to the input data, or to the i2 Analyze schema entity or link types you wish to map to, would require it to be altered.
If you change your schema for i2 Analyze, you need to re-generate the mapping jar, which in turn creates new mapping classes and XSDs.
Using an XSLT mapping mechanism is just one possible way to do this. If you are happier working with Java, then you can achieve the same thing by using its XML parsing functionality to read the file and then creating annotated Java classes that know how to write out XML in the correct format for the XSDs and our code.
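As a rough illustration of that Java-only route (again our own sketch, not SDK code, and the mapping table is a placeholder), the standard DOM API can do the renaming without any XSLT:

```java
import java.io.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;
import org.xml.sax.InputSource;

public class DomMapperDemo {

    // Hypothetical mapping from external element names to schema entity names.
    static String mapName(String external) {
        return external.equals("Actor") ? "Person" : external;
    }

    // Parses the input, renames every element via the mapping, serializes the result.
    static String mapDocument(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        rename(doc, doc.getDocumentElement());
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Depth-first walk, renaming each element according to the mapping table.
    static void rename(Document doc, Element e) {
        NodeList children = e.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                rename(doc, (Element) children.item(i));
            }
        }
        doc.renameNode(e, null, mapName(e.getTagName()));
    }

    public static void main(String[] args) throws Exception {
        // Prints <Actors><Person><Name>Marton</Name></Person></Actors>
        System.out.println(mapDocument("<Actors><Actor><Name>Marton</Name></Actor></Actors>"));
    }
}
```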
If your input comes from a database, or from data structures that are very different from the types you want to map to in the i2 Analyze schema, then you also have to decide whether you want to do all of the mapping and processing in one place. It might be better to store your incoming data in an intermediate format that breaks the task down into simpler stages. (This also helps you to manage changes to either side of that intermediate format.)
Depending on how much effort you want to put in, and whether you think that things will change, you can always make this more dynamic so that your code adapts as the input or output changes; but you have to weigh the cost of the extra development against the benefit it gives in your particular scenario.
Cheers
@TonyJon Thanks again for your fast response, unfortunately some other things came up for us.
We now understand how XML files are translated. However, when we tried to simply upload a custom XML file, it failed. The XML contains only one person (actor) and the corresponding security tags. We found in the bin folder that the default tags are:
<SecurityTagIds>
<SecurityTagId>UC</SecurityTagId>
<SecurityTagId>OSI</SecurityTagId>
</SecurityTagIds>
We have tried to add these tags to our custom xml, simply with the
One more thing: if we'd like to associate a person with an organization, how can we modify our XML to make the associations different? In the Intelligence Portal there are several different options, such as Member Of, Involved In, etc. In the examples we only saw the simple associations between Actors.
Thanks in advance!
Hi Marton
In our example, we decided that we would have the same security tags added for all of the entities and links. This means that we did not have to parse and translate incoming values from the data file; we just had to add the same static values to each entity or link using our XSLT. This is why, when you looked in data1.xml, there were no such values to be found.
For your information, if you look in dataToI2analyze.xslt in the same folder as the data file, you can see we have this element in our itemTemplate section.
<SecurityTagIds>
<SecurityTagId>UC</SecurityTagId>
<SecurityTagId>OSI</SecurityTagId>
</SecurityTagIds>
This is simply the format required to add default dimension values to each item (entity or link) that we parse into our XML format.
If you are using the default example-dynamic-security-schema.xml that we provide then you will see that these two values match a security dimension value from each of our two default security dimensions.
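If you end up building the output document in Java rather than XSLT, adding the same static tags to every item is straightforward. Here is a minimal sketch (our own class and method names, assuming the two default dimension values shown above) using the standard DOM API:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;

public class SecurityTagDemo {

    // Appends the static <SecurityTagIds> block to one entity or link element.
    static void addSecurityTags(Document doc, Element item) {
        Element tags = doc.createElement("SecurityTagIds");
        for (String id : new String[] { "UC", "OSI" }) {
            Element tag = doc.createElement("SecurityTagId");
            tag.setTextContent(id);
            tags.appendChild(tag);
        }
        item.appendChild(tags);
    }

    // Demo: build a single <Person> item, tag it, and serialize the document.
    static String demo() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element person = doc.createElement("Person");
        doc.appendChild(person);
        addSecurityTags(doc, person);

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```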
Thank you very much for your answers!
After trying out the data load direct example, I would like to create my own .xsd files and use them to upload custom entities to the database. In the given XSDs they used Actor tags, which got translated to Person items in Onyx.
My question would be: how can I create my own custom schema, and how can I map it to an already existing entity type in Onyx?
For example, I would like to define a new Car type; if I made a sound and valid .xsd file with a couple of Car entities, the data load script could map these entities to the Vehicle type entities in the system. The schema creation and the mapping are the key questions.