MaRDI4NFDI / python-zbMathRest2Oai

Read data from the zbMATH Open API https://api.zbmath.org/docs and feed it to the OAI-PMH server https://oai.portal.mardi4nfdi.de/oai/
GNU General Public License v3.0

Validate endpoint with OAI standard validation tool #101

Closed physikerwelt closed 1 month ago

physikerwelt commented 1 month ago

Ensure that the validation with

https://www.openarchives.org/Register/ValidateSite

works.

Currently, I see the following error:


Initial validation checks (step 1)

baseURL is http://oai.portal.mardi4nfdi.de/oai/OAIHandler
Validation only
Request logged from 10.92.111.58
Checking Identify response
REQUEST http://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=Identify GET
FAIL Server at base URL 'http://oai.portal.mardi4nfdi.de/oai/OAIHandler' failed to respond to Identify. The HTTP GET request with URL http://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=Identify received response code '301'. HTTP code 301 'Moved Permanently' is not widely supported by harvesters and is anyway inappropriate for registration of a service. If requests must be redirected then an HTTP response 302 may be used as outlined in the guidelines [http://www.openarchives.org/OAI/2.0/guidelines-repository.htm#LoadBalancing].
FAIL ABORT: Failed to get Identify response from server at base URL 'http://oai.portal.mardi4nfdi.de/oai/OAIHandler'.

The OAI-PMH data provider with base URL http://oai.portal.mardi4nfdi.de/oai/OAIHandler has failed initial validation. Problems reported must be corrected before validation can continue.
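
The 301 failure above can be reproduced locally before resubmitting to the validator. The following is a minimal sketch and not part of this repository: it requests `?verb=Identify` without following redirects and applies the rule quoted in the log (301 is rejected, 302 may be used). The names `NoRedirect`, `identify_status`, and `assess_status` are hypothetical helpers introduced for illustration.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow 3xx responses."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response
        # instead of silently following it.
        return None


def identify_status(base_url, timeout=10):
    """Return (status code, Location header) for an Identify request."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(f"{base_url}?verb=Identify", timeout=timeout) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def assess_status(status):
    """Classify a status code following the validator message quoted above."""
    if status == 200:
        return "ok"
    if status == 301:
        return "fail: 301 Moved Permanently is rejected by the validator"
    if status == 302:
        return "redirect: 302 may be used according to the guidelines"
    return f"fail: unexpected HTTP status {status}"
```

Calling `identify_status("http://oai.portal.mardi4nfdi.de/oai/OAIHandler")` performs a real network request; passing the returned status to `assess_status` then shows whether the validator would abort at step 1.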
physikerwelt commented 1 month ago

FAIL adminEmail element is empty!

physikerwelt commented 1 month ago

this is now very green

baseURL is https://oai.portal.mardi4nfdi.de/oai/OAIHandler
Validation only
Request logged from 10.92.110.57
Checking Identify response
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=Identify GET
PASS Administrator email address is 'maxence@zbmath.org'
PASS Correctly reports OAI-PMH protocol version 2.0
PASS baseURL supplied matches the Identify response
PASS Datestamp granularity is 'seconds'
PASS Extracted earliestDatestamp 2000-01-01T00:00:00Z

physikerwelt commented 1 month ago

much better now

OAI-PMH Data Provider Validation and Registration
Running validation checks (step 2)

baseURL is https://oai.portal.mardi4nfdi.de/oai/OAIHandler
Validation only
Request logged from 10.92.110.57
Checking Identify response
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=Identify GET
PASS Administrator email address is 'maxence@zbmath.org'
PASS Correctly reports OAI-PMH protocol version 2.0
PASS baseURL supplied matches the Identify response
PASS Datestamp granularity is 'seconds'
PASS Extracted earliestDatestamp 2000-01-01T00:00:00Z
Checking ListSets response
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListSets GET
PASS responseDate has correct format: 2024-10-11T16:00:50Z
FAIL Failed to extract any setSpec elements from ListSets but did not find an exception message. If sets are not supported by the repository then the ListSets response must be the noSetHierarchy error. See .
Checking ListIdentifiers response
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListIdentifiers&metadataPrefix=oai_dc GET
PASS responseDate has correct format: 2024-10-11T16:00:51Z
PASS Good ListIdentifiers response, extracted id '10.5072/38239' for use in future tests.
Checking ListMetadataFormats response
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListMetadataFormats&identifier=10%2E5072/38239 GET
PASS responseDate has correct format: 2024-10-11T16:00:53Z
PASS Good ListMetadataFormats response, includes oai_dc
PASS Data provider supports oai_dc metadataPrefix
Checking GetRecord response
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=GetRecord&identifier=10%2E5072/38239&metadataPrefix=oai_dc GET
PASS responseDate has correct format: 2024-10-11T16:00:53Z
PASS Datestamp in GetRecord response (2023-11-07T12:07:16Z) has the correct form for seconds granularity.
PASS Datestamp in GetRecord response (2023-11-07T12:07:16Z) matched the seconds granularity specified in the Identify response.
PASS Valid GetRecord response
Checking ListRecords response
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords&from=2023-11-07T12:07:16Z&until=2023-11-07T12:07:16Z&metadataPrefix=oai_dc GET
PASS responseDate has correct format: 2024-10-11T16:00:54Z
PASS Response is well formed
PASS ListRecords response correctly included record with identifier 10.5072/38239
Checking exception handling (errors)
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?junk GET
PASS Error response correctly includes error code 'badVerb'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=junk GET
PASS Error response correctly includes error code 'badVerb'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=GetRecord&metadataPrefix=oai_dc GET
PASS Error response correctly includes error code 'badArgument'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=GetRecord&identifier=10.5072/38239 GET
PASS Error response correctly includes error code 'badArgument'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=GetRecord&identifier=invalid"id&metadataPrefix=oai_dc GET
PASS Error response correctly includes error code 'idDoesNotExist'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListIdentifiers&until=junk GET
PASS Error response correctly includes error code 'badArgument'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListIdentifiers&from=junk GET
PASS Error response correctly includes error code 'badArgument'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListIdentifiers&resumptionToken=junk&until=2000-02-05 GET
PASS Error response correctly includes error code 'badArgument'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc&from=junk GET
PASS Error response correctly includes error code 'badArgument'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords&resumptionToken=junk GET
PASS Error response correctly includes error code 'badResumptionToken'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc&resumptionToken=junk&until=1990-01-10 GET
PASS Error response correctly includes error code 'badArgument'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc&until=junk GET
PASS Error response correctly includes error code 'badArgument'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords GET
PASS Error response correctly includes error code 'badArgument'
PASS All 13 error requests properly handled
Checking for version 2.0 specific exceptions
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc&from=2002-02-05&until=2002-02-06T05:35:00Z GET
PASS Error response correctly includes error code 'badArgument'
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc&until=1999-01-01T00:00:00Z GET
PASS Error response correctly includes error code 'noRecordsMatch'
Checking that HTTP POST requests are handled correctly
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler POST verb:Identify
PASS POST test 1 for Identify was successful
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler POST identifier:10.5072/38239 metadataPrefix:oai_dc verb:GetRecord
PASS POST test 2 for GetRecord was successful
Checking for correct use of resumptionToken (if used)
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc GET
NOTE Got resumptionToken rows=50@@searchMark=1032@@from=0001-01-01T00:00:00Z@@total=14745@@until=9999-12-31T23:59:59Z@@metadataPrefix=oai_dc
REQUEST https://oai.portal.mardi4nfdi.de/oai/OAIHandler?verb=ListRecords&resumptionToken=rows%3D50%40%40searchMark%3D1032%40%40from%3D0001-01-01T00%3A00%3A00Z%40%40total%3D14745%40%40until%3D9999-12-31T23%3A59%3A59Z%40%40metadataPrefix%3Doai_dc GET
PASS Resumption tokens appear to work
Summary - failure

Uses https URIs (not specified in protocol)
Total tests passed: 37
Total warnings: 0
Total error count: 1
Validation status: FAILED

Validation process complete Fri Oct 11 12:01:05 2024
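
The resumptionToken step in the log shows the token being sent back percent-encoded, with `verb` as the only other argument. As an illustration only, the follow-up URL can be rebuilt with the standard library; `resumption_url` is a hypothetical helper, and the token is copied from the NOTE line of the log.

```python
from urllib.parse import urlencode

BASE_URL = "https://oai.portal.mardi4nfdi.de/oai/OAIHandler"


def resumption_url(base_url, verb, token):
    """Build the follow-up request URL for a resumptionToken."""
    # resumptionToken is an exclusive argument: only verb may accompany it.
    # urlencode percent-encodes '=', '@' and ':' so the token stays one value.
    query = urlencode({"verb": verb, "resumptionToken": token})
    return f"{base_url}?{query}"


# Token taken verbatim from the validation log.
token = (
    "rows=50@@searchMark=1032@@from=0001-01-01T00:00:00Z"
    "@@total=14745@@until=9999-12-31T23:59:59Z@@metadataPrefix=oai_dc"
)
url = resumption_url(BASE_URL, "ListRecords", token)
print(url)
```

The printed URL should match the second REQUEST line of the resumptionToken check. Adding `metadataPrefix` or `until` next to the token is what triggers the `badArgument` responses seen earlier in the log.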

Mazztok45 commented 1 month ago

As a reminder, the last error can be seen in: https://www.openarchives.org/Register/ValidateSite?log=W0LNQK5A

The error: _FAIL_ Failed to extract any setSpec elements from ListSets but did not find an exception message. If the repository does not support Sets, then the ListSets response must be the noSetHierarchy error.

If we say we have no Sets, then perhaps we should have a look here:

https://github.com/ER-FIZKarlsruhe/fiz-oai-provider/blob/master/src/main/java/ORG/oclc/oai/server/catalog/AbstractCatalog.java

This method seems OK:

public abstract Map listSets() throws NoSetHierarchyException, OAIInternalServerError;

but I have doubts about this one:

public abstract Map listSets(String resumptionToken) throws BadResumptionTokenException, OAIInternalServerError;

I wonder whether listSets(String resumptionToken) should also throw NoSetHierarchyException.
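
The validator's ListSets rule (at least one setSpec, or else a noSetHierarchy error) can also be checked against a response body offline. This is a rough sketch under assumptions: `check_list_sets` is a hypothetical helper, and the sample bodies are hand-written, not actual output of the MaRDI server.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def check_list_sets(xml_text):
    """Return (passed, detail) for a ListSets response body."""
    root = ET.fromstring(xml_text)
    error = root.find("oai:error", OAI_NS)
    if error is not None:
        # Only noSetHierarchy is an acceptable error for ListSets here.
        code = error.get("code")
        return code == "noSetHierarchy", f"error code {code}"
    specs = [el.text for el in root.iterfind("oai:ListSets/oai:set/oai:setSpec", OAI_NS)]
    if specs:
        return True, f"{len(specs)} setSpec element(s)"
    return False, "no setSpec elements and no error element"


# Hand-written sample bodies (assumptions, not real server responses).
OPEN = '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
EMPTY = OPEN + "<ListSets/></OAI-PMH>"
NO_SETS = OPEN + '<error code="noSetHierarchy">No sets</error></OAI-PMH>'
WITH_SET = OPEN + (
    "<ListSets><set><setSpec>demo</setSpec><setName>Demo</setName></set></ListSets>"
    "</OAI-PMH>"
)

for body in (EMPTY, NO_SETS, WITH_SET):
    print(check_list_sets(body))
```

Only the first sample fails, which is the situation the FAIL line describes: a ListSets element with no setSpec entries and no error element.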

physikerwelt commented 1 month ago

We have sets; we just need to propagate those.

physikerwelt commented 1 month ago

I spent 10 minutes double-checking the documentation. During document insertion, one can set tags and sets as comma-separated values. However, it was not clear to me how those sets or tags relate to the entries listed here: https://oai.portal.mardi4nfdi.de/oai/listSets @Mazztok45 Can you clarify this with @stefanbozic?

Mazztok45 commented 1 month ago

The OAI-PMH test was successfully completed: https://www.openarchives.org/Register/ValidateSite?log=VBI196PU