Esri / geoportal-server

Geoportal Server is a standards-based, open source product that enables discovery and use of geospatial resources including data and services.
https://gptogc.esri.com/geoportal
Apache License 2.0

Return more than 5000 records on CSW GetRecords POST request #193

Closed: revenger73 closed this issue 8 years ago

revenger73 commented 8 years ago

We're unable to return more than 5000 records from a single CSW GetRecords POST request, even when specifying a very large number (the maximum 32-bit integer, 2147483647) for maxRecords.

<csw:GetRecords xmlns:csw='http://www.opengis.net/cat/csw/2.0.2' version='2.0.2' service='CSW' resultType='results' startPosition='1' maxRecords='2147483647'>
  <csw:Query typeNames='csw:Record' xmlns:ogc='http://www.opengis.net/ogc' xmlns:gml='http://www.opengis.net/gml'>
    <csw:ElementSetName>full</csw:ElementSetName>
    <csw:Constraint version='1.1.0'>
      <ogc:Filter>
        <ogc:Or>
          <ogc:And>
            <ogc:PropertyIsEqualTo>
              <ogc:PropertyName>ScenarioResource</ogc:PropertyName>
              <ogc:Literal>false</ogc:Literal>
            </ogc:PropertyIsEqualTo>
          </ogc:And>
          <ogc:And>
            <ogc:PropertyIsNull>
              <ogc:PropertyName>ScenarioResource</ogc:PropertyName>
            </ogc:PropertyIsNull>
          </ogc:And>
        </ogc:Or>
      </ogc:Filter>
    </csw:Constraint>
    <ogc:SortBy>
      <ogc:SortProperty>
        <ogc:PropertyName>OwnerCode</ogc:PropertyName>
        <ogc:SortOrder>ASC</ogc:SortOrder>
      </ogc:SortProperty>
      <ogc:SortProperty>
        <ogc:PropertyName>Title</ogc:PropertyName>
        <ogc:SortOrder>ASC</ogc:SortOrder>
      </ogc:SortProperty>
    </ogc:SortBy>
  </csw:Query>
</csw:GetRecords>

The response contains only the first 5000 records:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<csw:GetRecordsResponse xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/" xmlns:dct="http://purl.org/dc/terms/" xmlns:gml="http://www.opengis.net/gml" xmlns:ows="http://www.opengis.net/ows" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<csw:SearchStatus timestamp="2016-01-12T08:45:17+01:00"/>
<csw:SearchResults elementSet="full" nextRecord="5001" numberOfRecordsMatched="15652" numberOfRecordsReturned="5000" recordSchema="http://www.opengis.net/cat/csw/2.0.2">

Is it possible to return all records in a single GetRecords request?

Kind Regards,

Remko van Ginneken

mhogeweg commented 8 years ago

Setting startPosition='5001' will get the next set of records.

Marten

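
A minimal sketch of the follow-up request this suggests, assuming the same filter and sort order as the original query; only startPosition and maxRecords differ. The value maxRecords='5000' is an assumption chosen to match the observed cap. With numberOfRecordsMatched="15652", requests starting at 1, 5001, 10001 and 15001 would cover the full result set. Keeping the SortBy clause identical across requests helps the pages line up.

```xml
<!-- Second page: startPosition taken from nextRecord="5001" in the first response. -->
<!-- maxRecords='5000' is an assumed page size, not a value given in the thread. -->
<csw:GetRecords xmlns:csw='http://www.opengis.net/cat/csw/2.0.2' version='2.0.2' service='CSW' resultType='results' startPosition='5001' maxRecords='5000'>
  <csw:Query typeNames='csw:Record' xmlns:ogc='http://www.opengis.net/ogc' xmlns:gml='http://www.opengis.net/gml'>
    <csw:ElementSetName>full</csw:ElementSetName>
    <csw:Constraint version='1.1.0'>
      <ogc:Filter>
        <ogc:Or>
          <ogc:And>
            <ogc:PropertyIsEqualTo>
              <ogc:PropertyName>ScenarioResource</ogc:PropertyName>
              <ogc:Literal>false</ogc:Literal>
            </ogc:PropertyIsEqualTo>
          </ogc:And>
          <ogc:And>
            <ogc:PropertyIsNull>
              <ogc:PropertyName>ScenarioResource</ogc:PropertyName>
            </ogc:PropertyIsNull>
          </ogc:And>
        </ogc:Or>
      </ogc:Filter>
    </csw:Constraint>
    <!-- Same sort order as the first request, so consecutive pages do not overlap. -->
    <ogc:SortBy>
      <ogc:SortProperty>
        <ogc:PropertyName>OwnerCode</ogc:PropertyName>
        <ogc:SortOrder>ASC</ogc:SortOrder>
      </ogc:SortProperty>
      <ogc:SortProperty>
        <ogc:PropertyName>Title</ogc:PropertyName>
        <ogc:SortOrder>ASC</ogc:SortOrder>
      </ogc:SortProperty>
    </ogc:SortBy>
  </csw:Query>
</csw:GetRecords>
```
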
revenger73 commented 8 years ago

Marten,

That means a fix in our production code. We have maxRecords set to a large number so that we get all records in one request instead of 4 separate requests.

I have never read or seen that Geoportal Server returns at most 5000 records.

If there is no setting to increase that limit, we will have to hotfix the problems we now have in production: a recent increase of 10000 metadata records has put us above 5000.

Kind regards

Remko van Ginneken

(In reply to Marten's message of 12 Jan 2016, 22:16.)

mhogeweg commented 8 years ago

Hi Remko,

Please include this in gpt.xml, inside the catalog element (there are others there already called lucene...):

<parameter key="lucene.maxrecords.threshold" value="100000"/>

This configuration parameter is used in com.esri.gpt.catalog.discovery.DiscoveryComponent to determine a max number of records to return. 5000 is the default max.
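
As a rough sketch of where that line would sit, the excerpt below shows the parameter nested inside the catalog element. Only the parameter line itself comes from this thread; the root element name gptConfig and the bare catalog tag are assumptions (a real gpt.xml carries attributes and many other children), so treat this as an illustration of placement rather than a copy of the actual file.

```xml
<!-- Hypothetical excerpt of gpt.xml; root element name and nesting are assumptions. -->
<gptConfig>
  <catalog>
    <!-- Existing lucene parameters stay as they are. -->
    <!-- Raises the cap on returned records from the default of 5000. -->
    <parameter key="lucene.maxrecords.threshold" value="100000"/>
  </catalog>
</gptConfig>
```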

revenger73 commented 8 years ago

Thanks! That is what we were looking for.

Remko

(In reply to Marten's message of 13 Jan 2016, 21:02.)

revenger73 commented 8 years ago

I added the setting to gpt.xml and restarted the server.

CSW GetRecords still returns only 5000 records, so the setting seems to be ignored by CSW GetRecords.

Is there anything else we can try with Geoportal Server?

Otherwise we have to start working on a hotfix in our software.

Remko

pandzel-zz commented 8 years ago

Fixed for the upcoming 1.2.7 release.