F2I-Consulting / fesapi

API for ENERGISTICS™ data standards (mainly RESQML™), multi-languages (C++, Java, C#, Python)
Apache License 2.0

reading problem with energistics datasets #186

Closed untereiner closed 5 years ago

untereiner commented 5 years ago

What are the steps to reproduce this issue?

Loading data

What does happen?

When loading some Energistics FTP datasets (for example /upload/testingV2_0/total/alwynStep1), the relations between objects are missing: for example, {Tectonic/Genetic}Features have no FeatureInterpretations, and FeatureInterpretations have no Representations.

What were you expecting to happen?

The relations between objects should be loaded.

Any logs, error output, etc?

The content type application/x-resqml+xml;version=2.0;type=obj_EpcExternalPartReference should belong to eml and not to resqml since obj_EpcExternalPartReference is part of COMMON and not part of RESQML.

Any other comments?

What versions of fesapi are you using?

master branch

philippeVerney commented 5 years ago

Hi Lionel,

At first glance, this looks (again) like a problem in the file, not in the latest fesapi. Fesapi simply conforms to the standard more strictly now than in the past. If you open the XML of a fault interpretation, for example obj_FaultInterpretation_9fb7c7ea-23cf-4e1a-9cfa-059f74cb47f5.xml, you will notice on line 16 that a particular version is required for the feature referenced by this interpretation. However, if you open the corresponding feature, obj_TectonicBoundaryFeature_e1a5c332-9424-47b6-825f-3f6b35076c2c.xml, you will notice that the version is missing.

From the standard's point of view, this means that you are requesting a particular version of the feature which does not exist in this EPC document.

The error is clearly in the file on the FTP server, and this file should be fixed. I do not know whether that is easy (I don't remember where the original data are). I don't even remember why we referenced a particular version in the first place.

I have to think about whether to implement a workaround. If you really need to rely on this buggy file, staying on a 0.16 version seems to me the better solution.

untereiner commented 5 years ago

It is possible, then, that a large majority of the files on the FTP are broken (I tried a lot of them).

philippeVerney commented 5 years ago

Yes, all EPC documents produced by a fesapi version older than https://github.com/F2I-Consulting/fesapi/pull/67 (older than v0.12.0.0) should have this bug. It has been fixed in this PR in AbstractObject.cpp lines 571-574.
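To get a rough idea of which EPC documents are affected, one could list the XML parts that mention VersionString at all. This is only a sketch: the function name is made up for illustration, it treats the EPC as a plain zip archive, and a part listed here is only a candidate, since it does not check whether the VersionString sits inside a DOR.

```python
import zipfile


def parts_mentioning_version_string(epc):
    """Return the names of the XML parts of an EPC that contain 'VersionString'.

    `epc` is a path or a file-like object; the EPC is read as a zip archive.
    Hypothetical helper for illustration, not part of fesapi.
    """
    with zipfile.ZipFile(epc) as archive:
        return [
            name
            for name in archive.namelist()
            # Plain byte search: no XML parsing, so this is only a coarse filter.
            if name.lower().endswith(".xml")
            and b"VersionString" in archive.read(name)
        ]
```

An empty list means no part mentions VersionString, so the mismatch described above cannot come from that document.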

philippeVerney commented 5 years ago

Do you just want some similar data to test with? I should be able to provide some fixed EPC documents on the Energistics FTP. Or do you need exactly these particular data?

untereiner commented 5 years ago

I have multiple datasets (alwyn and others), modified by JFR with the explorer/validator, that I need to load. Is there a way to load from the old style and save in the new style?

philippeVerney commented 5 years ago

It should probably work if you deserialize and serialize with a fesapi version from 0.12 to 0.16. I would recommend trying with 0.16. The HDF5 file should not be modified at all by this operation.

The bug on loading has only been fixed in 1.0.0.0, which tries to take "version" into account at loading time. The bug on writing has been fixed in 0.12. If you do not have enough time, I'll work on that, but the priority will be low on my side (unless the people who rule my time decide otherwise).

philippeVerney commented 5 years ago

You may also be able to use Geosiris' editor to remove the versionString from all DORs (script or java program). But you know this possibility probably better than I do.

In the end, the fix is simply to remove all lines containing VersionString from all XML files of the EPC. Unzipping and running a "search and replace" on all XML files with Notepad++ should do the job efficiently as well. I am sure some Unix shell command would work too.
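For reference, the same cleanup can be sketched in Python without unzipping by hand. Assumptions: the EPC is a plain zip archive, its XML parts are UTF-8, and the function names are made up for illustration. Instead of deleting whole lines, this variant deletes each VersionString element with a regular expression, so it also works when the XML is not pretty-printed. Like the line-based approach, it removes every VersionString element, not only those inside DORs.

```python
import re
import zipfile

# A whole VersionString element (prefixed or not, self-closing or not),
# plus the indentation before it and the line break after it.
_VERSION_STRING = re.compile(
    r"[ \t]*<(?:\w+:)?VersionString\b[^>]*?"
    r"(?:/>|>.*?</(?:\w+:)?VersionString\s*>)[ \t]*\r?\n?",
    re.DOTALL,
)


def strip_version_strings(xml_text):
    """Return xml_text with every VersionString element removed."""
    return _VERSION_STRING.sub("", xml_text)


def rewrite_epc(src, dst):
    """Copy the EPC `src` to `dst`, stripping VersionString from XML parts.

    `src` and `dst` are paths or file-like objects. Non-XML parts are
    copied byte for byte. Assumes the XML parts are UTF-8 encoded.
    """
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename.lower().endswith(".xml"):
                text = data.decode("utf-8")
                data = strip_version_strings(text).encode("utf-8")
            zout.writestr(info, data)
```

Writing to a new file rather than modifying the EPC in place keeps the original around in case the result does not load.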

untereiner commented 5 years ago

Ha, I did not understand. Does loading work if all entities have a VersionString, or if none of them do? I thought this tag was mandatory now.

philippeVerney commented 5 years ago

In 1.0.0.0, loading works if the referenced entity (for example the feature of an interp) corresponds exactly to what is written in the DOR (DataObjectReference). In a DOR, you can optionally ask for a particular version of an entity, but you are not forced to. In an entity, you can also optionally indicate which particular version of this entity you have.

Generally, nobody uses version so versionString should not appear in DOR nor in entity (the attribute is called objectVersion in an entity). If you put a versionString in a DOR, then you should have an entity somewhere with the corresponding objectVersion.

fesapi pre v0.12.0.0 exported a versionString in all DORs but did not export the corresponding objectVersion in the corresponding entities. This mismatch now fails in 1.0.0.0.

If you just remove all versionString from all DORs, then you would reference an entity without "objectVersion" which is exactly what you already have.

The tag is still optional even in 1.0.0.0 because it is optional in the standard.
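The matching rule described above can be written down as a small sketch. This is not fesapi code: the class and function names are hypothetical, the version value is a placeholder, and treating "DOR without versionString, entity with objectVersion" as a mismatch is my reading of "corresponds exactly", not something taken from the implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """A data object; object_version is optional (hypothetical model)."""
    uuid: str
    object_version: Optional[str] = None


@dataclass(frozen=True)
class Dor:
    """A reference to a data object; version_string is optional."""
    uuid: str
    version_string: Optional[str] = None


def resolves(dor: Dor, entity: Entity) -> bool:
    """True if the entity is exactly what the DOR asks for.

    Assumption: both the UUID and the (possibly absent) version must match.
    """
    return (dor.uuid == entity.uuid
            and dor.version_string == entity.object_version)


# The feature written by an old fesapi has no objectVersion.
feature = Entity("e1a5c332-9424-47b6-825f-3f6b35076c2c")
# The old DOR asks for a version (placeholder value): mismatch.
old_dor = Dor(feature.uuid, version_string="some-version")
# The same DOR with the versionString removed: match.
fixed_dor = Dor(feature.uuid)
```

With these definitions, `resolves(old_dor, feature)` is false and `resolves(fixed_dor, feature)` is true, which is the effect of removing the versionString from the DORs.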

untereiner commented 5 years ago

To let you know, I wrote a Python script that erases the versionStrings in an EPC file.

philippeVerney commented 5 years ago

Feel free to upload the corrected files to the FTP server if you fixed them. But, as I understand it, they contain other changes ("modified by JFR with the explorer/validator"), so I guess you cannot. I will try to deprecate these files on the FTP server when I find time.