Closed luis100 closed 5 years ago
Phillip point out that it is reasonable to put the gml into separate files (and add links to siard) not directly into the siard.
Pros:
Cons:
What are your thoughts? Should we put all the gml-s into one folder. Then it is easier to transfer the files to QKIS for example?
I talked with our danish collegues and made some experiments with our own maps.
DBPTK should produce two versions of gml-data:
Our institutions use mostly estonian specific coordinate system (EPSG:3301 - Estonian Coordinate System of 1997), but it is useless for GPS devices or google maps. During the conversion it should convert the geometry to WGS84 (that is most used coordinate system).
So it seems that before the conversion user should specify two things in some dialog: current coordinate system and the limit for the big GML(s).
From my experience, it is best if DBPTK aims to save information in the most simple and documented way. This has a lot of advantages from a software development point of view.
I agree with Phillip:
Phillip point out that it is reasonable to put the gml into separate files (and add links to siard) not directly into the siard.
And the only "con":
- you can not search the coordinate via DBVTK (but I am not sure how much is this search needed anyway).
Is something that can be handled specifically for DBVTK and GML files if and when it is necessary.
The GMLs should be saved like other lobs, with the appropriate file extension and some documentation on the SIARD about it being a GML file.
I propose saving the small gml files for each table row. If a "source coordinate system" parameter is provided, then a coordinate system conversion will be done to (for example) WGS84.
An optional parameter that limits the GML file size can also be implemented. If DBPTK is about to create a GML file bigger than that, everything fails and the whole migration is aborted. This is ok since this is an optional parameter and by default DBPTK would save GML of any size.
To create the big gml file for the entire table (which seems to be a requirement, let me know if it is not) it would be better to do one of these:
Exported SDO_GEOMETRY cells are converted to a xml geometry that can be saved in the SIARD file as text. Multiple xml geometries can be combined into a .gml file.
When exporting with -egml <directory>
(SIARD2 parameter), a .gml file will be created for each table containing SDO_GEOMETRY cells. This file will include all the geometries present in said table, creating a "feature" for each row which includes the geometry, and uses the rest of the columns as metadata.
dev notes:
using -egml
injects a FilterModule that handles saving data in the gml file and forwards the cell values (without changes) to the siard export module.
I tested this with one of our institutions database which has SDO_GEOMETRY present.
I got following result in metadata.xml
<column>
<name>GEOMEETRIA</name>
<lobFolder>lob7</lobFolder>
<type>CHARACTER LARGE OBJECT</type>
<typeOriginal>SDO_GEOMETRY</typeOriginal>
<nullable>true</nullable>
</column>
But where is this lob folder lob7
? It is not under content.
And in Content -> table1.xml I have
<row><c1>*****</c1><c2>2017</c2><c3>******</c3><c4>0.394</c4><c5>Põllusaar</c5><c6>2016.11.02 08:46:32</c6><c7><gml:Polygon>
\u0020\u0020<gml:outerBoundaryIs>
\u0020\u0020\u0020\u0020<gml:LinearRing>
\u0020\u0020\u0020\u0020\u0020\u0020<gml:coordinates>
\u0020\u0020\u0020\u0020\u0020\u0020</gml:coordinates>
\u0020\u0020\u0020\u0020</gml:LinearRing>
\u0020\u0020</gml:outerBoundaryIs>
</gml:Polygon>
</c7></row><row><c1>*****</c1><c2>2017</c2><c3>*****</c3><c4>0.019</c4><c5>Põllusaar</c5><c6>2016.11.02 08:46:32</c6><c7><gml:Polygon>
\u0020\u0020<gml:outerBoundaryIs>
\u0020\u0020\u0020\u0020<gml:LinearRing>
\u0020\u0020\u0020\u0020\u0020\u0020<gml:coordinates>
\u0020\u0020\u0020\u0020\u0020\u0020</gml:coordinates>
\u0020\u0020\u0020\u0020</gml:LinearRing>
\u0020\u0020</gml:outerBoundaryIs>
</gml:Polygon>
</c7></row>
Where should be the coordinates?
Normally and according to the SIARD2 standard LOBs can be stored three places :
I think the standard of the DBPTK is to store the LOBS inside the SIARD table folder structure if nothing else has been declared. So you should be able to find the LOBs in the table folder for the specific table?
Thank you Phillip for these explanations. Problem is that the lob folder "lob7" is not present.
It seems like a bug then. Have you tried to use the parameters in DBPTK to store lobs externally outside the .siard-file?
(edit: it also looks like there is no schema0? It might be related to the schema handling?)
@Laurira can you attach/send me the log?
I sent you the logs and siard to the email.
@PhillipTommerholt by default, DBPTK stores small lobs in the XML, bigger lobs are stored in their own file (inside the specified folder). The folder is specified in the metadata because if a lob is saved in a file, that's where it will be. In this case all the LOBs were small, so they were all inlined in XML.
About the schema0
, in the data files: rows' counting starts at 1, columns count also starts at 1, and the specification does not state that schemas/tables must start at 0, so to avoid confusion DBPTK always "starts counting" at 1.
The schemaX
or tableX
folder is specified in the <folder>
element in the metadata (for schemas as well as tables).
@Laurira I have created a new issue for the bug, see #371
Agencies have geodata within columns of tables. Usually we have Oracle databases with SDO_GEOMETRY data type and we need to take this information as well but we would prefer to take it as GML.