Closed tomkralidis closed 11 years ago
Tom, makes sense to me. I’d adjust the ‘As it happens,…’ paragraph to say ‘query against all metadata, based on the typeNames schema, and return all results encoded using the outputSchema’. I’m pretty sure most implementations don’t behave this way, because to actually implement it, one has to map the query elements from the typeNames schema to each metadata schema used in the catalog, and has to transform records from whatever schema is stored in the DB to the output schema.
To address this issue, GeoPortal maps incoming metadata record elements to Lucene index elements and builds Lucene indexes for each queryable property (I think GeoNetwork does the same). GeoPortal doesn’t actually honor the outputSchema parameter. GeoNetwork provides output XSLTs to transform the XML blob from the DB into the requested outputSchema if it differs from the schema of the stored XML blob.
Another solution to the problem is used by Deegree: marshal harvested metadata to a relational DB schema, map incoming requests from whatever typeName schema is supported to SQL against the relational DB, and have routines to build XML output in any supported outputSchema.
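A minimal sketch of the relational approach described above, assuming a single shared table and a hand-written mapping from queryable names to columns. The table layout, the `QUERYABLES` mapping, and the `build_sql` helper are all hypothetical and are not taken from Deegree or pycsw; they only illustrate how a constraint expressed in one typeNames schema can search records harvested from any schema.

```python
import sqlite3

# Hypothetical mapping from queryable names in each typeNames schema
# to columns of one shared relational table (illustrative names only).
QUERYABLES = {
    "csw:Record": {"dc:title": "title", "dc:subject": "keywords"},
    "gmd:MD_Metadata": {"apiso:Title": "title", "apiso:Subject": "keywords"},
}


def build_sql(typename, queryable, pattern):
    """Translate one 'queryable LIKE pattern' constraint into SQL."""
    # Raises KeyError for typenames or queryables that are not supported.
    column = QUERYABLES[typename][queryable]
    # The column name comes from the fixed mapping above, never from user input.
    return f"SELECT identifier FROM records WHERE {column} LIKE ?", (pattern,)


def demo():
    """Query a toy store holding both DC and ISO records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records "
        "(identifier TEXT, source_schema TEXT, title TEXT, keywords TEXT)"
    )
    conn.executemany(
        "INSERT INTO records VALUES (?, ?, ?, ?)",
        [
            ("dc-1", "dc", "Soil map", "soil"),
            ("iso-1", "iso", "Soil survey", "soil"),
            ("iso-2", "iso", "Road network", "transport"),
        ],
    )
    # The constraint uses ISO queryable names, but the SQL has no
    # restriction on source_schema, so DC records are searched too.
    sql, params = build_sql("gmd:MD_Metadata", "apiso:Title", "Soil%")
    return sorted(row[0] for row in conn.execute(sql, params))
```

Because both schemas map their title queryable to the same column, the choice of typeNames only decides how the constraint is phrased, not which records are searched.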
As I interpret the CSW spec, if the capabilities list an outputSchema, then the server needs to be able to provide any record in its metadata store in that schema.
steve
From: Tom Kralidis [mailto:notifications@github.com]
Sent: Monday, February 11, 2013 10:51 AM
To: geopython/pycsw
Cc: Stephen Richard
Subject: [pycsw] GetRecords handling should not filter records based on typenames value (#105)
The current behaviour for handling GetRecords.typename is to filter records based on typename before applying any OGC filters to the query. Example:
As it happens, GetRecords queries (with filter or not) should always query against all metadata, in any advertised outputschema (which we already do). This is confirmed by comments from @smrAzGS (https://github.com/smrAzGS) as well as the CSW spec authors.
So in the codebase, we need to remove the part of the repository query which initially filters by typename so that the entire repository is searched and not filtered by typenames.
@rclark (https://github.com/rclark) / @smrAzGS (https://github.com/smrAzGS): does this make sense?
Thanks @smrAzGS. Updated. Will have this implemented by end of week.
Hi @smrAzGS, thanks for the additional implementation comments. FYI pycsw does it the deegree way, and we write to any outputschema in the same way (in Python, we refuse XSLT). We shred the XML into DB columns and keep the actual XML representation on hand, which is used (as an early out) if the requested outputschema is the same as the XML representation in the DB column and elementsetname=full.
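The early out described above could look roughly like the following. This is a sketch under assumptions, not pycsw's actual code: the `write_record` function and the record dictionary layout are hypothetical, and only a Dublin Core `csw:Record` writer with two elements is shown. The namespace URIs are the standard CSW 2.0.2, ISO 19139 gmd, and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
GMD_NS = "http://www.isotc211.org/2005/gmd"
DC_NS = "http://purl.org/dc/elements/1.1/"


def write_record(record, outputschema, elementsetname="summary"):
    """Serialize one stored record in the requested outputschema.

    `record` is a hypothetical dict holding the shredded columns
    ("identifier", "title"), the namespace of the stored XML ("schema"),
    and the stored XML itself ("xml").
    """
    # Early out: the stored XML is already in the requested schema and
    # the full record is wanted, so return it untouched.
    if record["schema"] == outputschema and elementsetname == "full":
        return record["xml"]

    # Otherwise build the output from the shredded columns. Only a
    # bare-bones Dublin Core writer is sketched here.
    if outputschema != CSW_NS:
        raise NotImplementedError(f"no writer for {outputschema}")

    root = ET.Element(f"{{{CSW_NS}}}Record")
    for name in ("identifier", "title"):
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = record[name]
    return ET.tostring(root, encoding="unicode")
```

With this shape, an ISO record requested as ISO with elementsetname=full costs one column read, while the same record requested as Dublin Core is rebuilt from columns without any XSLT step.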
FYI fixed in master and 1.4 branch.
The current behaviour for handling GetRecords.typename is to filter records based on typename before applying any OGC filters to the query. Example:
- typenames=csw:Record, catalogue A returns the 10 DC records
- typenames=gmd:MD_Metadata, catalogue A returns the 10 ISO records
- typenames=csw:Record,gmd:MD_Metadata, catalogue A returns the 10 DC and 10 ISO records for a total of 20 records
As it happens, GetRecords queries (with filter or not) should always query against all metadata, based on the typeNames schema, and return all results encoded using the outputSchema (which we already do). This is confirmed by @smrAzGS's comments as well as the CSW spec authors.
So in the codebase, we need to remove the part of the repository query which initially filters by typename so that the entire repository is searched and not filtered by typenames.
@rclark / @smrAzGS: does this make sense?
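To make the proposed change concrete, here is a toy model of the two behaviours. It is only a sketch: the in-memory `REPOSITORY`, the `get_records` function, and the `prefilter_by_typename` flag are hypothetical stand-ins for the real repository query, with the flag reproducing the current behaviour so the two can be compared.

```python
# Toy catalogue A: 10 Dublin Core records and 10 ISO records.
REPOSITORY = (
    [{"typename": "csw:Record", "title": f"DC record {i}"} for i in range(10)]
    + [{"typename": "gmd:MD_Metadata", "title": f"ISO record {i}"} for i in range(10)]
)


def get_records(repository, typenames, constraint=None, prefilter_by_typename=False):
    """Return matching records from a list-of-dicts repository.

    `constraint` is an optional callable standing in for an OGC filter.
    """
    candidates = repository
    if prefilter_by_typename:
        # Current behaviour: drop records of other typenames before
        # the filter is applied.
        candidates = [r for r in candidates if r["typename"] in typenames]
    # Proposed behaviour: typenames no longer narrows the search, so the
    # filter (if any) runs against the entire repository.
    if constraint is not None:
        candidates = [r for r in candidates if constraint(r)]
    return candidates
```

In this model, typenames=csw:Record yields 10 records with the pre-filter and all 20 without it; how each result is encoded is then left to the outputSchema handling.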