Closed smrgeoinfo closed 9 years ago
When using http://schemas.usgin.org/validate/wfs one problem is that it is constructing the wrong wfs request based on what it is pulling out of the Get onlineResource element from the GetCapabilities doc. It is constructing the url as
http://uat-ngds.reisys.com:8080/geoserver/BoreholeTemperature/wfs?&service=WFS&version=1.0.0&request=GetFeature&typename=BoreholeTemperature:BoreholeTemperature&maxfeatures=1
but it should be
http://uat-ngds.reisys.com/geoserver-srv/BoreholeTemperature/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=BoreholeTemperature:BoreholeTemperature&maxFeatures=1
In the capabilities document for the service (e.g. http://uat-ngds.reisys.com/geoserver-srv/BoreholeTemperature/ows?service=WFS&version=1.0.0&request=GetCapabilities), the URLs for the online resource attributes for each request (xpath is /WFS_Capabilities/Capability/Request//DCPType/HTTP/Get/@onlineResource) need to have the http://uat-ngds.reisys.com/geoserver-srv/ host URL, not the http://uat-ngds.reisys.com:8080/geoserver host URL.
ALSO, the validation component needs to access the service URLs from the capabilities document @jessica-azgs can you see if this is actually the case?
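A minimal sketch of how a client could build the GetFeature request from the Get onlineResource attribute, assuming a WFS 1.0.0 capabilities document. The trimmed sample document and the function name `get_feature_url` are illustrative assumptions, not the validator's actual code; a real onlineResource value may already carry query parameters, which this sketch does not merge.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

WFS_NS = {"wfs": "http://www.opengis.net/wfs"}

# Hypothetical, trimmed-down WFS 1.0.0 capabilities document for illustration.
SAMPLE_CAPABILITIES = """<WFS_Capabilities xmlns="http://www.opengis.net/wfs" version="1.0.0">
  <Capability>
    <Request>
      <GetFeature>
        <DCPType>
          <HTTP>
            <Get onlineResource="http://uat-ngds.reisys.com/geoserver-srv/BoreholeTemperature/ows?"/>
          </HTTP>
        </DCPType>
      </GetFeature>
    </Request>
  </Capability>
</WFS_Capabilities>"""


def get_feature_url(capabilities_xml, type_name, max_features=1):
    """Build a GetFeature URL from the advertised Get onlineResource."""
    root = ET.fromstring(capabilities_xml)
    get = root.find(
        "wfs:Capability/wfs:Request/wfs:GetFeature/wfs:DCPType/wfs:HTTP/wfs:Get",
        WFS_NS,
    )
    if get is None:
        raise ValueError("no GetFeature Get endpoint in capabilities document")
    # Drop a trailing '?' or '&' so the separator is added exactly once.
    base = get.get("onlineResource").rstrip("?&")
    query = urlencode(
        {
            "service": "WFS",
            "version": "1.0.0",
            "request": "GetFeature",
            "typeName": type_name,
            "maxFeatures": str(max_features),
        },
        safe=":",  # keep the workspace:layer colon readable
    )
    separator = "&" if "?" in base else "?"
    return base + separator + query


url = get_feature_url(SAMPLE_CAPABILITIES, "BoreholeTemperature:BoreholeTemperature")
```

Because the host comes straight from the capabilities document, the request only reaches the public endpoint if the server advertises the public base URL there.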
POSSIBLE FIXES:
@smrazgs @jessica-azgs I've found a way to fix it. The geoserver does provide a proxy base url for this purpose.
Can you please re-test WFS validation ?
Thanks
@JihadMotii-REISys the WFS doesn't validate,
Error Element 'BoreholeTemperature:BoreholeTemperature', attribute 'fid': The attribute 'fid' is not allowed. Error Element 'BoreholeTemperature:wellname_': This element is not expected. Expected is one of ( BoreholeTemperature:WellName, BoreholeTemperature:APINo, BoreholeTemperature:HeaderURI ).
@smrazgs @jessica-azgs Regarding "Error Element 'BoreholeTemperature:wellname_': This element is not expected. Expected is one of ( BoreholeTemperature:WellName, BoreholeTemperature:APINo, BoreholeTemperature:HeaderURI )." => The problem was that the file I used as the resource has a header row "ObservationURI,WellName ,APINo,HeaderURI, ..." in which the WellName column name has a trailing space, and when I upload this file to the datastore, that space is converted to "_". http://uat-ngds.reisys.com/geoserver-srv/BoreholeTemperature/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=ckan:BoreholeTemperature&maxFeatures=10
@JihadMotii-REISys Can you manually remove the trailing spaces for now in Well Name and Production and continue on with the other issues in getting the WFS to validate? I think I'll need to make a tweak to usginmodels so a file which has extra spaces in field names won't validate on upload. This is an outlier case.
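A minimal sketch of the kind of upload check described above, assuming the CSV is available as text. The function name `find_untidy_headers` is hypothetical and is not part of usginmodels.

```python
import csv
import io


def find_untidy_headers(csv_text):
    """Return header names that carry leading or trailing whitespace."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return [name for name in header if name != name.strip()]


sample = "ObservationURI,WellName ,APINo,HeaderURI\nuri:1,Well A,123,uri:h1\n"
untidy = find_untidy_headers(sample)
print(untidy)  # ['WellName ']
```

Rejecting such a file at upload keeps the stray space from reaching the datastore, where it would otherwise end up in the column name.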
@jessica-azgs Sure Jessica, I've removed the trailing spaces and run the WFS validation, and there seem to be other issues, such as: Error Element 'BoreholeTemperature:ElevationDF': '' is not a valid value of the atomic type 'xs:double'. Error Element 'BoreholeTemperature:pH': '' is not a valid value of the atomic type 'xs:double'. Error Element 'BoreholeTemperature:CirculationDuration': '' is not a valid value of the atomic type 'xs:double'. ..... Are these issues for validation? It looks like they come from the content of the file, right?
@jessica-azgs @smrazgs @Lbookman Regarding the FID, it's an important field and it seems that I can't get rid of it in geoserver (http://osgeo-org.1560.x6.nabble.com/hide-the-field-FID-to-Openlayers-td3793997.html). However, I noticed that when I changed the WFS version from 1.0.0 to 1.1.0, the fid changed to gml:id and other structures changed as well, e.g.: http://uat-ngds.reisys.com/geoserver-srv/BoreholeTemperature/ows?service=WFS&version=1.1.0&request=GetFeature&typeName=ckan:BoreholeTemperature&maxFeatures=10 But when I tried to validate the WFS with version 1.1.0, the WFS validator application threw an exception. Also, other sources are using 1.1.0, e.g. http://geothermal.isgs.illinois.edu/ArcGIS/services/aasggeothermal/NJBoreholeLithIntervals/MapServer/WFSServer?request=GetFeature&service=WFS&TypeName=BoreholeLithInterval When I validated this one, http://geothermal.isgs.illinois.edu/ArcGIS/services/aasggeothermal/NJBoreholeLithIntervals/MapServer/WFSServer?request=GetCapabilities&service=WFS the validation succeeded.
@JihadMotii-REISys The list of fields you're getting as not having the valid type are all empty fields so likely PostgreSQL is making the type for these fields as string. Then Geoserver does not know that these fields (even though they are empty) must have a specific type (as stated in the schema http://schemas.usgin.org/files/borehole-temperature-observation/1.5/BoreholeTemperature.xsd) or the WFS won't validate. We need to find a way around this.
@JihadMotii-REISys This was addressed in the spring in https://github.com/ngds/ckanext-ngds/issues/377 but I don't think it was ever solved.
@smrazgs @jessica-azgs, regarding the fid error, I ran a validation of an existing WFS (using 1.0.0 instead of 1.1.0) http://geothermal.isgs.illinois.edu/ArcGIS/services/aasggeothermal/NJBoreholeLithIntervals/MapServer/WFSServer?request=GetCapabilities&service=WFS&version=1.0.0 and the output for v1.0.0 was the FID error too, but when I ran the validation again for the same WFS link with version=1.1.0, the validation succeeded. Is v1.1.0 compatible with the ckan dependencies/external APIs? If yes, shouldn't we use WFS v1.1.0 instead of 1.0.0?
@JihadMotii-REISys Yes, I think we should be using 1.1.0 since all the current WFS uses 1.1.0. I'm not sure how we got started with 1.0.0.
@jessica-azgs @smrazgs I'll change the version in code to 1.1.0. However, this change won't fix the empty-field errors mentioned in ngds/ckanext-ngds#377. We still need to find a way around those separately.
@smrazgs @jessica-azgs @Lbookman I have deployed the new code for using WFS 1.1.0 in UAT server. http://uat-ngds.reisys.com/dataset/indiana-borehole-temperatures-wfs-1-1-0 http://uat-ngds.reisys.com/dataset/well-log-ar-wfs-1-1-0-0
I ran a WFS validation only for WellLog and it succeeded. Here is the link: http://uat-ngds.reisys.com/geoserver-srv/WellLog/ows?service=WFS&version=1.1.0&request=GetCapabilities&typeName=WellLog:WellLog I guess the file used for WellLog has no empty fields; however, I ran this just to make sure that version 1.1.0 is the correct one.
same as #12
1/20 email: Jihad, Attached is a valid Borehole Temperature file, and the WFS service associated with it is http://services.azgs.az.gov/arcgis/services/aasggeothermal/AZBoreholeTemperatures/MapServer/WFSServer?request=GetCapabilities&service=WFS. But again, if the data types are not being populated correctly in PostGIS, the WFS services are NOT going to validate (unless the field types are guessed correctly, which can occasionally happen). The data types can be found in the schemas for the given models here: http://schemas.usgin.org/models/ and the BoreholeTemperatures model is here: http://schemas.usgin.org/files/borehole-temperature-observation/1.5/BoreholeTemperature.xsd where the data type is given by the xs type attribute of each field's element:
<xs:element name="BoreholeName" type="xs:string" minOccurs="0">
<xs:annotation>
<xs:documentation>The human-intelligible name of the borehole identified by the HeaderURI.</xs:documentation>
</xs:annotation>
</xs:element>
These are translated to string=text; double=decimal; dateTime=calendarDate in PostGIS.
As you requested, here is the free application which you can use to validate WFS services alongside the WFS validator at http://schemas.usgin.org/validate/wfs:
Free versions of XML Explorer and Notepad++ are available here: http://xmlexplorer.codeplex.com/, http://notepad-plus-plus.org/
a. Create a Get Feature Request: In the browser, change the WFS GetCapabilities URL by deleting “Capabilities” and replacing it with “Feature”, and adding the layer name to the end of the URL. The original URL and the changed URL are shown below:
http://services.azgs.az.gov/ArcGIS/services/aasggeothermal/AZActiveFaults/MapServer/WFSServer?request=GetCapabilities&service=WFS
http://services.azgs.az.gov/ArcGIS/services/aasggeothermal/AZActiveFaults/MapServer/WFSServer?request=GetFeature&service=WFS&TypeName=ActiveFault&MaxFeatures=2
c. Copy this URL. In XML Explorer “Open Url …”. Paste in the Url. Once loaded save the file. Open in Notepad++.
d. From http://schemas.usgin.org/models/ save the schema (.xsd) to validate the file against to a local file location.
e. In Notepad++:
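The URL change in step a can also be scripted. This is a minimal sketch using only the Python standard library; the function name `to_get_feature_url` and its defaults are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_get_feature_url(capabilities_url, type_name, max_features=2):
    """Turn a WFS GetCapabilities URL into a small GetFeature request."""
    parts = urlsplit(capabilities_url)
    replaced = ("request", "typename", "maxfeatures")
    # Keep every parameter except the ones this function sets itself.
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in replaced]
    params = (
        [("request", "GetFeature")]
        + kept
        + [("TypeName", type_name), ("MaxFeatures", str(max_features))]
    )
    return urlunsplit(parts._replace(query=urlencode(params, safe=":")))


feature_url = to_get_feature_url(
    "http://services.azgs.az.gov/ArcGIS/services/aasggeothermal/AZActiveFaults/MapServer/WFSServer?request=GetCapabilities&service=WFS",
    "ActiveFault",
)
```

Parameter names are matched case-insensitively when filtering, so a URL that already carries a typeName or maxFeatures value does not end up with duplicates.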
email 1/23 Maybe http://schemas.usgin.org/contentmodels.json is useful. It gives all of the models and their URIs. This is what usginmodels uses. The Readme for usginmodels states how to use the functions and gives example URIs as well. For example, to get an object with all the info about a particular model use usginmodels.get_model(uri). Does that help? -Jessica
@JihadMotii-REISys can you tell me where we are on this? Is there anything further ready for us to test? (Let's try to keep comments on GitHub if possible instead of emails.)
Email 2/5: Hi Christy I was able to upload and validate a custom Borehole CSV file (see attached). UAT link - http://uat-ngds.reisys.com/dataset/test-borehole Essentially I filled in the missing fields with new test values based on the schema (http://schemas.usgin.org/models/#boreholetemperature) Now that this file is validated … it confirms my theory about Postgres column types. There are a few ways to move forward on this
Great, thanks Yatin. Might the last 2 ways be problematic; at what point does that dummy data get removed? If it exists in the PostGIS database, won’t it be pushed on to GeoServer for publication? For the 3rd option, is this something that you need help from Jessica to work on, or do you feel like you can take that on? My preference is the first option, as it seems like it will be the most solid way to ensure that PostGIS always has the correct data types, aside from any changes that the pre-validator might undergo. What might be your estimate for how much longer this option would take? Thank you, Christy
For the 3rd option I was thinking of adding data in only the first row of empty columns, hopefully inconsequential data; for example, BoreHoleHeight can be 0.0000001. The advantage of this approach over the first one is that we won't be customizing CKAN's DataPusher extension, so we can keep getting support from upstream changes. For the 3rd option, since Jessica did most of the development for the validator, I was hoping she can work on that. Having said that, the first option is definitely more robust; I will try to explore whether we can extend the data-pusher extension without customizing it heavily. Regards, Yatin Khadilkar
@ykhadilkar-rei Great - let me know. It's looking like 1 might still be the best option. Be aware that in our NGDS specs, we replace the null number (double) values with "-9999" instead of "0.0000001". @smrazgs Could you comment on which of the 3 possible solutions Yatin indicates might make most sense to you?
@ykhadilkar-rei Steve and I spoke and agreed that 1 would be the best option - perhaps we can chat further about this on our weekly telecon tomorrow.
Notes from 20150210: WFS - extending the datapusher approach (option 1) - will be changing the stock CKAN extension. It is actually sloppy to hard-code that in, it can change a lot and will change a lot. It's not sustainable. The csv files are being modified anyway and that is probably the best choice. First validation, checks the fields. Updating the first row of numbers (double) and date fields. Will look into it and let us know.
@ykhadilkar-rei Jessica's code with this tool https://github.com/usgin/ExcelToNGDSServiceTool already does this to some extent - perhaps it would be helpful? @jessica-azgs
@ccaudill thanks for the tool url ... I was thinking of the same. I will start working on it next week.
any update on this task? thank you
@ccaudill looks like to install ExceltoNGDSServiceTool we need ArcGIS. @jessica-azgs is that correct? Is there any other way to develop and test?
I would contact @jessica-azgs, yes.
@ykhadilkar-rei The ExceltoNGDSServiceTool uses the usginmodels API and works with ArcGIS. I don't see why you'd need to look at this tool at all. What you need is in usginmodels. If you use the get_layer method with a specified schema it will return an object with all of the layer information, including the field type for each field. Once you know the field type you should be able to set default values. get_layer returns the information for a single layer but get_models will return all layers for all models.
layer = usginmodels.get_layer("http://stategeothermaldata.org/uri-gin/aasg/xmlschema/activefault/1.1")
@jessica-azgs thanks for the information. I will checkout usginmodels.
@ykhadilkar-rei Can you please give us an update here? Can you tell us what the hold-up is on finishing up this issue?
Decision is to update usginmodels.validate_file to insert a dummy row of data as the first row to force the correct data type inferencing when CKAN sends the csv to Postgres to create a table.
see https://github.com/usgin/usginmodels/issues/5
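A minimal sketch of the dummy-row idea, not the actual usginmodels.validate_file change. The function `insert_dummy_row`, the `field_types` mapping, and the dateTime and string placeholders are assumptions; only the -9999 value for doubles comes from the NGDS convention mentioned earlier in this thread. In practice the field types would come from the model schema.

```python
import csv
import io

# Placeholder per XSD type. -9999 follows the NGDS null convention for doubles
# mentioned in the thread; the other two values are illustrative assumptions.
PLACEHOLDERS = {
    "xs:double": "-9999",
    "xs:dateTime": "1900-01-01T00:00:00",
    "xs:string": "Missing",
}


def insert_dummy_row(csv_text, field_types):
    """Insert a typed placeholder row right after the header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    header = rows[0]
    # Columns without a known type are treated as strings.
    dummy = [PLACEHOLDERS[field_types.get(name, "xs:string")] for name in header]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows([header, dummy] + rows[1:])
    return out.getvalue()


# Hypothetical field types for three BoreholeTemperature columns.
types = {"WellName": "xs:string", "ElevationDF": "xs:double", "SpudDate": "xs:dateTime"}
result = insert_dummy_row("WellName,ElevationDF,SpudDate\nWell A,,\n", types)
```

With a typed value in the first data row, type inference on load has something to work with even when every real value in a column is empty.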
@jessica-azgs @dan-olaru-reisys @FuhuXia
http://uat-ngds.reisys.com/geoserver/get-ogc-services?url=http%3A%2F%2Fuat-ngds.reisys.com%2Fgeoserver-srv%2FALWellLog%2Fows%3Fservice%3DWFS%26version%3D1.1.0%26request%3DGetCapabilities%26typeName%3DALWellLog%3AWellLog&workspace=ALWellLog Published services finally validated using XML spy!! I think we're finally done with this issue.
new NGDS CKAN/GeoServer WFS for testing at http://uat-ngds.reisys.com/geoserver-srv/BoreholeTemperature/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=BoreholeTemperature:BoreholeTemperature&maxFeatures=10 the WFS response causes a validator 'unhandled exception' with the wfs validator at http://schemas.usgin.org/validate/wfs.
(with schema locations: xsi:schemaLocation="http://www.opengis.net/wfs http://schemas.opengis.net/wfs/1.0.0/wfs.xsd http://stategeothermaldata.org/uri-gin/aasg/xmlschema/boreholetemperature/1.5 http://schemas.usgin.org/files/borehole-temperature-observation/1.5/BoreholeTemperature.xsd "
Validation problems depend on the validation engine used. I'm not sure what engine the online WFS validation tool uses.
CRITICAL: Validating with Saxon-EE the only error that gets called out is BoreholeTemperature:wellname_ is an invalid element name. Should be BoreholeTemperature:WellName This suggests a problem in the CSV to PostGIS data loading.
Validation using Oxygen XML editor v14.2., with the Xerces validation engine...
validation errors are getting thrown on empty values that have data type xs:dateTime (e.g. SpudDate, ReleaseDate) or xs:double. Not sure why the empty xs:string elements are ok but not the empty xs:dateTime elements. Is there a way to have geoserver not insert the empty optional elements?
Schema problems (relative to normative schema for BoreholeTemperature at http://schemas.usgin.org/files/borehole-temperature-observation/1.5/BoreholeTemperature.xsd (this is what the validator uses I think; @jessica-azgs can you verify?). Exception may be due to validation problems. Here they are: gml:boundedBy -- gml:null is invalid element name, should be gml:Null -- its case sensitive. This may be a geoserver problem? Note that the saxon parser thinks gml:null (l.c.) is OK. Go figure...
'fid' attribute on BoreholeTemperature:BoreholeTemperature is not allowed. Might be a geoserver problem (some argument in calls to deploy service?)
BoreholeTemperature:wellname_ is an invalid element name. Should be BoreholeTemperature:WellName This suggests a problem in the CSV to PostGIS data loading.
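On the question of why empty xs:string elements pass while empty xs:double and xs:dateTime elements fail: the empty string is a valid xs:string value, but it is not in the lexical space of xs:double or xs:dateTime. A minimal sketch for listing the affected elements in a GetFeature response follows; the function name, the sample feature, and the `field_types` mapping are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# XSD types for which an empty element value is not a valid literal.
NON_EMPTY_TYPES = {"xs:double", "xs:dateTime"}

# Hypothetical single feature, trimmed for illustration.
SAMPLE_FEATURE = """<bt:BoreholeTemperature
    xmlns:bt="http://stategeothermaldata.org/uri-gin/aasg/xmlschema/boreholetemperature/1.5">
  <bt:WellName/>
  <bt:ElevationDF/>
  <bt:SpudDate></bt:SpudDate>
  <bt:pH>7.1</bt:pH>
</bt:BoreholeTemperature>"""


def invalid_empty_elements(feature_xml, field_types):
    """Return local names of empty elements whose type forbids an empty value."""
    root = ET.fromstring(feature_xml)
    problems = []
    for elem in root.iter():
        local = elem.tag.rsplit("}", 1)[-1]  # strip the '{namespace}' prefix
        is_empty = len(elem) == 0 and not (elem.text or "").strip()
        if is_empty and field_types.get(local) in NON_EMPTY_TYPES:
            problems.append(local)
    return problems


types = {
    "WellName": "xs:string",
    "ElevationDF": "xs:double",
    "SpudDate": "xs:dateTime",
    "pH": "xs:double",
}
problems = invalid_empty_elements(SAMPLE_FEATURE, types)
print(problems)  # ['ElevationDF', 'SpudDate']
```

Omitting such optional elements from the response, or serving them with a typed placeholder, would both avoid this class of error.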