Informatievlaanderen / VSDS-Linked-Data-Interactions

https://informatievlaanderen.github.io/VSDS-Linked-Data-Interactions/
European Union Public License 1.2

Specify the usage of CreateVersionObjectProcessor #25

Closed xdxxxdx closed 1 year ago

xdxxxdx commented 1 year ago

The specification at https://github.com/Informatievlaanderen/VSDS-LDESWorkbench-NiFi#create-version-object describes the purpose of this processor as follows:

To support the creation of version objects, e.g. when transforming data in the [NGSI LD format](https://vloca-kennishub.vlaanderen.be/NGSI_(LD)) to LDES.

From what I understand, the create-version-object processor should be able to create versions for more than just NGSI-LD data. Please correct me if I am wrong.

However, the short description of this processor in NiFi says: "Converts NGSI-LD to LdesMembers and sends them to the next processor". This confuses me: is it only for NGSI-LD or not?

Also, I am confused about the configuration.

For example, I have a dataset in JSON-LD format (KBO.zip). Which data formats are acceptable for this processor?

I want to create versions of `https://kbopub.economie.fgov.be/kbo#Enterprise` entities. The ID of the entity is https://kbopub.economie.fgov.be/kbo#0416822559.

I would like to combine the ID with the generated time to have version control of the entity. What should I fill in for the required properties of the processor?

XD

rorlic commented 1 year ago

In general

The processor (and the underlying core implementation) started out as an NGSI-LD state to NGSI-LD version object creator. At this point in time, the functionality is no longer limited to NGSI-LD but applies to any linked data (LD) object. The core functionality of this component is to create a version LD object from a state LD object. We need to fix the documentation to state this clearly.

At this moment, the component expects a linked data object formatted as JSON-LD as input and will produce output formatted as given by the Data destination format property. Therefore some properties expect a JSON path to determine what value to use.
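To illustrate what "a JSON path to determine what value to use" can mean, here is a minimal sketch that resolves a simple dotted path against a JSON-LD document. It is not the component's actual implementation and not a full JSONPath engine: the `resolve_path` helper and the `modified` field are hypothetical, and the sketch assumes keys that contain no dots.

```python
def resolve_path(document, path):
    """Resolve a simple path such as "$.a.b" against nested dicts.

    Hypothetical helper for illustration only; keys containing dots
    (for example full IRIs) are not supported by this sketch.
    """
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    value = document
    for key in path[1:].split("."):
        if key:  # skip the empty segment before the first dot
            value = value[key]
    return value


state = {
    "@id": "https://kbopub.economie.fgov.be/kbo#0416822559",
    "modified": "2023-01-01T00:00:00Z",  # hypothetical timestamp field
}

# Select the value that would be used as the version timestamp.
timestamp_value = resolve_path(state, "$.modified")
print(timestamp_value)
```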

To convert a state object with ID <id> to a version object with ID <id-delimiter-timestamp>, the component does the following:

The VersionOf Property is used to refer from the newly created version object to the original state object with <id> (basically creating a triple <id-delimiter-timestamp> <versionof-property-value> <id>).

The GeneratedAtTime property is used to add a property indicating the timestamp value in a standard way (basically creating a triple <id-delimiter-timestamp> <generatedAtTime-property-value> <timestamp>).
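The conversion described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the processor's code: the function name `create_version_object`, the `/` delimiter, and the two property IRIs (Dublin Core `isVersionOf` and PROV `generatedAtTime`) are assumptions chosen for the example, since the thread only names the properties without giving their values.

```python
from datetime import datetime, timezone

# Assumed property IRIs; the thread only calls them isVersionOf and generatedAtTime.
IS_VERSION_OF = "http://purl.org/dc/terms/isVersionOf"
GENERATED_AT_TIME = "http://www.w3.org/ns/prov#generatedAtTime"


def create_version_object(state, timestamp, delimiter="/"):
    """Return a version object built from a JSON-LD state object (hypothetical helper)."""
    state_id = state["@id"]
    stamp = timestamp.isoformat()
    version = dict(state)  # shallow copy, the state object itself is left untouched
    # New ID: <id-delimiter-timestamp>
    version["@id"] = f"{state_id}{delimiter}{stamp}"
    # Triple: <id-delimiter-timestamp> <versionof-property-value> <id>
    version[IS_VERSION_OF] = {"@id": state_id}
    # Triple: <id-delimiter-timestamp> <generatedAtTime-property-value> <timestamp>
    version[GENERATED_AT_TIME] = {
        "@value": stamp,
        "@type": "http://www.w3.org/2001/XMLSchema#dateTime",
    }
    return version


state = {
    "@id": "https://kbopub.economie.fgov.be/kbo#0416822559",
    "@type": "https://kbopub.economie.fgov.be/kbo#Enterprise",
}
version = create_version_object(state, datetime(2023, 1, 1, tzinfo=timezone.utc))
print(version["@id"])
```

The state object keeps its original ID, so every version object produced from it points back to the same entity while carrying its own unique ID.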

Your use case

The CreateVersionObjectProcessor should be configured as follows (NiFi properties, top to bottom):

Please keep the other properties as configured in the screenshot above.

@xdxxxdx I will try the above myself and can share my solution with you, if needed.

rorlic commented 1 year ago

Attached you will find two solutions for your use case:

There seems to be an issue with the NiFi processor: the isVersionOf and generatedAtTime properties are not attached to the version object. However, this processor is being refactored into the LDTO repository, where this behavior is corrected, as shown by the LDTO workflow. The LDTO repository is still a work in progress and needs better documentation. We are aware of that.

xdxxxdx commented 1 year ago

Hello @Yalz,

Thanks for reaching out for feedback. I tried the solution provided by Ranko, and it works for the KBO dataset.

But as you can see, what I was asking for is: specify the usage of CreateVersionObjectProcessor.

This matters not only for the KBO dataset but for all users who want to use this processor: we should provide clear guidance so they can choose a suitable configuration for their own dataset.

For example, the README at https://github.com/Informatievlaanderen/VSDS-Linked-Data-Interactions/blob/main/ldi-core/README.md suggests that the create version processor is made for NGSI-LD. According to Ranko's explanation, however, this processor should work for all linked data.

I think the situation now is one of the following. If the create version processor should work for all linked data, then we should keep it open and document how it works for all linked data. If the create version processor is only for NGSI-LD, then we can close it with documentation stating that this processor creates version-based LDES entities with a generatedAtTime value only for NGSI-LD datasets.

Thanks, Xueying