Open mbarnett opened 4 years ago
https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&metadataPrefix=oai_etdms
gives a 500. logs:
I, [2020-08-04T11:30:51.030587 #9771] INFO -- : [87bb55b7-a89a-4d15-b444-914611409724] Rendered vendor/ruby/2.5.0/bundler/gems/oaisys-d0e3f515472b/app/views/oaisys/pmh/list_records.xml.builder within layouts/oaisys/application (Duration: 1690.9ms | Allocations: 243817)
I, [2020-08-04T11:30:51.030638 #9771] INFO -- : [87bb55b7-a89a-4d15-b444-914611409724] Rendered vendor/ruby/2.5.0/bundler/gems/oaisys-d0e3f515472b/app/views/layouts/oaisys/application.builder (Duration: 1691.1ms | Allocations: 243946)
I, [2020-08-04T11:30:51.030864 #9771] INFO -- : [87bb55b7-a89a-4d15-b444-914611409724] Completed 500 Internal Server Error in 2601ms (ActiveRecord: 2297.4ms | Allocations: 251064)
F, [2020-08-04T11:30:51.032026 #9771] FATAL -- : [87bb55b7-a89a-4d15-b444-914611409724]
[87bb55b7-a89a-4d15-b444-914611409724] ActionView::Template::Error (undefined method `first' for nil:NilClass):
[87bb55b7-a89a-4d15-b444-914611409724] 30: item.member_of_paths.each { |path| header.setSpec path.tr('/', ':') }
[87bb55b7-a89a-4d15-b444-914611409724] 31: end
[87bb55b7-a89a-4d15-b444-914611409724] 32: record.metadata do |metadata_xml|
[87bb55b7-a89a-4d15-b444-914611409724] 33: item.serialize_metadata(format: metadata_format, into_document: metadata_xml)
[87bb55b7-a89a-4d15-b444-914611409724] 34: end
[87bb55b7-a89a-4d15-b444-914611409724] 35: end
[87bb55b7-a89a-4d15-b444-914611409724] 36: end
[87bb55b7-a89a-4d15-b444-914611409724]
[87bb55b7-a89a-4d15-b444-914611409724] app/decorators/metadata/oai_etdms/thesis_decorator.rb:47:in `discipline'
Cannot assume `object.departments.first` is valid with prod data.
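The trace suggests `discipline` calls `.first` on a `departments` value that is `nil` for some production records. A minimal sketch of a nil-safe version follows; `Thesis` and `ThesisDecoratorSketch` are hypothetical stand-ins, not the actual decorator in `app/decorators/metadata/oai_etdms/thesis_decorator.rb`.

```ruby
# Hypothetical stand-in for the decorated thesis record.
Thesis = Struct.new(:departments)

# Hypothetical sketch of the decorator; the real class may differ.
class ThesisDecoratorSketch
  def initialize(object)
    @object = object
  end

  # Returns the first department, or nil when departments is nil or empty.
  # Array(nil) is [], so .first never raises NoMethodError here.
  def discipline
    Array(@object.departments).first
  end
end
```

With this shape, a record with no departments yields a `nil` discipline rather than a 500; whether the element should then be omitted from the output is a separate decision.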
We're missing the publisher field on each thesis... from https://www.bac-lac.gc.ca/eng/services/theses/Pages/universities.aspx:
| Publisher | Mandatory | The full name of the university that granted the degree. If possible, hard-code or standardize the field to prevent errors and variations in the university name. |
|---|---|---|
Some records, like the one below, are missing both the degree name and grantor:
<record>
<header>
<identifier>oai:era.library.ualberta.ca:5a85be9d-ebef-4f6d-81bf-20503f563bbd</identifier>
<datestamp>2020-07-20 22:46:28 UTC</datestamp>
<setSpec>db9a4e71-f809-4385-a274-048f28eb6814:f42f3da6-00c3-4581-b785-63725c33c7ce</setSpec>
</header>
<metadata>
<etd_ms:thesis xmlns:etd_ms="http://www.ndltd.org/standards/metadata/etdms/1.0/" xmlns:xsi2="http://www.w3.org/2001/XMLSchema-instance" xsi2:schemaLocation="http://www.ndltd.org/standards/metadata/etdms/1.0/ http://www.ndltd.org/standards/metadata/etdms/1-0/etdms.xsd">
<etd_ms:title>Induction of alcohol dehydrogenase, lactate dehydrogenase, and alanine aminotransferase gene expression of Arabidopsis thatliana exposed to hypoxia</etd_ms:title>
<etd_ms:creator>Spryland, Kathleen Anne.</etd_ms:creator>
<etd_ms:date>2020-07-20 22:46:28 UTC</etd_ms:date>
<etd_ms:type>Thesis</etd_ms:type>
<etd_ms:identifier>https://era-test.library.ualberta.ca/items/5a85be9d-ebef-4f6d-81bf-20503f563bbd</etd_ms:identifier>
<etd_ms:identifier>doi:10.7939/R3TP9M</etd_ms:identifier>
<etd_ms:identifier>https://era-test.library.ualberta.ca/items/5a85be9d-ebef-4f6d-81bf-20503f563bbd/view/82616ba5-cf77-444e-a7b2-b7bf3e8269ae/MQ21209.pdf</etd_ms:identifier>
<etd_ms:language>English</etd_ms:language>
<etd_ms:rights>This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.</etd_ms:rights>
</etd_ms:thesis>
</metadata>
</record>
from https://www.bac-lac.gc.ca/eng/services/theses/Pages/universities.aspx:
| Degree name | Mandatory | Name of the degree associated with the thesis. Abbreviations are preferred, and abbreviated parts consisting of more than a single letter should be separated by a space from the preceding or succeeding words or initials. |
|---|---|---|

| Degree grantor | Mandatory | Name of the institution that awarded the degree. Use the name of the university from the time the degree was granted. |
|---|---|---|
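To find affected records before harvesting, a check along these lines could flag theses missing any of the mandatory fields quoted above. This is only a sketch: the field keys and the `fields` hash are hypothetical and do not reflect how the application stores or serializes metadata.

```ruby
# Mandatory fields taken from the LAC requirements quoted in this thread.
# The symbol names are assumptions made for this sketch.
MANDATORY_ETDMS_FIELDS = %i[publisher degree_name degree_grantor].freeze

# Returns the mandatory fields that are nil, empty, or whitespace-only.
# `fields` is a hypothetical hash of already-extracted metadata values.
def missing_mandatory_fields(fields)
  MANDATORY_ETDMS_FIELDS.select do |name|
    fields[name].to_s.strip.empty?
  end
end
```

For example, `missing_mandatory_fields(publisher: 'Example University', degree_name: '')` would report `degree_name` and `degree_grantor` as missing.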
Results from the perl validator:
# RUNNING VALIDATION FOR https://era-app-stg-1.library.ualberta.ca/oai
### Checking Identify response
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=Identify GET
PASS: Administrator email address is 'eraadmi@ualberta.ca'
PASS: Correctly reports OAI-PMH protocol version 2.0
FAIL: baseURL supplied 'https://era-app-stg-1.library.ualberta.ca/oai' does not match the baseURL in the Identify response 'https://era.library.ualberta.ca/oai'. The baseURL you enter must EXACTLY match the baseURL returned in the Identify response. It must match in case (http://Wibble.org/ does not match http://wibble.org/) and include any trailing slashes etc.
PASS: Datestamp granularity is 'seconds'
PASS: Extracted earliestDatestamp 2018-06-22T13:06:50Z
### Checking ListSets response
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListSets GET
PASS: responseDate has correct format: 2020-08-07T19:28:36Z
PASS: Extracted 150 set names: { b41cdbfd-6af2-4a13-8ba6-59725565d445:adbab43d-b35d-4493-a3a5-bd228863cc36 560f321f-a8c7-4884-adb3-326433a61688:17bd1d5d-7d41-40ed-8ea2-97c2ec63896b f7766168-d234-491a-b27c-2c4a5eecbc99:d7e84f98-9931-435b-89fd-80f713d5ca47 ... }, will use setSpec &set=b41cdbfd-6af2-4a13-8ba6-59725565d445:adbab43d-b35d-4493-a3a5-bd228863cc36 in tests
### Checking ListIdentifiers response
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&set=b41cdbfd-6af2-4a13-8ba6-59725565d445:adbab43d-b35d-4493-a3a5-bd228863cc36 GET
PASS: responseDate has correct format: 2020-08-07T19:28:37Z
NOTE: Tried empty set, will look for set with items...
NOTE: Trying set &set=560f321f-a8c7-4884-adb3-326433a61688:17bd1d5d-7d41-40ed-8ea2-97c2ec63896b
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&set=560f321f-a8c7-4884-adb3-326433a61688:17bd1d5d-7d41-40ed-8ea2-97c2ec63896b GET
PASS: responseDate has correct format: 2020-08-07T19:28:37Z
PASS: Good ListIdentifiers response, extracted id 'oai:era.library.ualberta.ca:5a0aad85-bffa-4686-bae3-d589a64361dc' for use in future tests.
### Checking ListMetadataFormats response
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListMetadataFormats&identifier=oai%3Aera%2Elibrary%2Eualberta%2Eca%3A5a0aad85-bffa-4686-bae3-d589a64361dc GET
PASS: responseDate has correct format: 2020-08-07T19:28:37Z
PASS: Good ListMetadataFormats response, includes oai_dc
PASS: Data provider supports oai_dc metadataPrefix
### Checking GetRecord response
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=GetRecord&identifier=oai%3Aera%2Elibrary%2Eualberta%2Eca%3A5a0aad85-bffa-4686-bae3-d589a64361dc&metadataPrefix=oai_dc GET
FAIL: Server failed to respond to the GetRecord request (HTTP header values: status=404 Not Found, age=0, lifetime=0, is fresh:=)
FAIL: Can't complete datestamp check for GetRecord
FAIL: ABORT: Can't complete datestamp check for GetRecord
oops, validation didn't run to completion: ABORT: Can't complete datestamp check for GetRecord
## Validation status of data provider https://era-app-stg-1.library.ualberta.ca/oai is FAILED
Failures:
FAIL: baseURL supplied 'https://era-app-stg-1.library.ualberta.ca/oai' does not match the baseURL in the Identify response 'https://era.library.ualberta.ca/oai'. The baseURL you enter must EXACTLY match the baseURL returned in the Identify response. It must match in case (http://Wibble.org/ does not match http://wibble.org/) and include any trailing slashes etc.
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=GetRecord&identifier=oai%3Aera%2Elibrary%2Eualberta%2Eca%3A5a0aad85-bffa-4686-bae3-d589a64361dc&metadataPrefix=oai_dc GET
FAIL: Server failed to respond to the GetRecord request (HTTP header values: status=404 Not Found, age=0, lifetime=0, is fresh:=)
FAIL: Can't complete datestamp check for GetRecord
FAIL: ABORT: Can't complete datestamp check for GetRecord
The first failure will be resolved once this is on prod. The rest occur because get record does not find a matching record with that identifier, due to the `oai:era.library.ualberta.ca:` prefix on the identifiers.
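One way to handle the prefixed identifiers is to strip the prefix before looking the record up. The sketch below assumes the lookup key is the bare UUID and that the request parameter has already been URL-decoded; the helper name is hypothetical.

```ruby
# Prefix seen on the identifiers in the validator output.
OAI_IDENTIFIER_PREFIX = 'oai:era.library.ualberta.ca:'.freeze

# Hypothetical helper: returns the bare UUID whether or not the prefix
# is present, so both identifier forms resolve to the same record.
def uuid_from_oai_identifier(identifier)
  identifier.to_s.delete_prefix(OAI_IDENTIFIER_PREFIX)
end
```

`String#delete_prefix` leaves the string unchanged when the prefix is absent, so requests that already pass a bare UUID keep working.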
The script which went through all pages of list records for items succeeded.
The script which went through all pages of list records for theses succeeded once the departments/discipline issue was fixed. The only issues found were the missing degree name/grantor/publisher fields.
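The scripts themselves are not included in this thread. As a rough illustration of the approach, paging through list records means following each `resumptionToken` until the response no longer carries one. In the sketch below, `fetch` is a hypothetical callable that takes a query string and returns the response body (for instance, a wrapper around `Net::HTTP.get`).

```ruby
require 'uri'

# Walks every page of a ListRecords response by following resumption tokens.
# Yields each response body to the optional block and returns the page count.
def each_list_records_page(fetch, metadata_prefix)
  query = "verb=ListRecords&metadataPrefix=#{metadata_prefix}"
  pages = 0
  loop do
    body = fetch.call(query)
    pages += 1
    yield body if block_given?
    # Regex extraction keeps the sketch dependency-free; an XML parser is
    # the more robust choice. An empty or absent token ends the list.
    token = body[%r{<resumptionToken[^>]*>([^<]+)</resumptionToken>}, 1]
    break if token.nil?
    query = "verb=ListRecords&resumptionToken=#{URI.encode_www_form_component(token)}"
  end
  pages
end
```

Passing the fetcher in as an argument keeps the paging logic testable without a live endpoint.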
Summary of findings above:
Get record does not work fully right now. Issue: it can't find a matching record with that identifier because of the `oai:era.library.ualberta.ca:` prefix on the identifiers. It works when the identifier is just the UUID.
Identify is working fully.
List Identifiers is working fully.
List metadata formats is working fully.
List records works for the metadata prefix oai_dc but not oai_etdms. Issues with oai_etdms:
app/decorators/metadata/oai_etdms/thesis_decorator.rb:47:in `discipline'
List sets is working fully.
How should these issues be worked through? As new tickets or part of this one?
The Perl validator didn't run to completion previously, so some of the failing tests had never been run before. Two major issues were fixed where the date in get record and list records was not in the proper format. This is the result of the test after those were fixed:
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=Identify GET
PASS: Administrator email address is 'eraadmi@ualberta.ca'
PASS: Correctly reports OAI-PMH protocol version 2.0
FAIL: baseURL supplied 'https://era-app-stg-1.library.ualberta.ca/oai' does not match the baseURL in the Identify response 'https://era.library.ualberta.ca/oai'. The baseURL you enter must EXACTLY match the baseURL returned in the Identify response. It must match in case (http://Wibble.org/ does not match http://wibble.org/) and include any trailing slashes etc.
PASS: Datestamp granularity is 'seconds'
PASS: Extracted earliestDatestamp 2018-06-22T13:06:50Z
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListSets GET
PASS: responseDate has correct format: 2020-09-21T16:28:39Z
PASS: Extracted 150 set names: { b41cdbfd-6af2-4a13-8ba6-59725565d445:adbab43d-b35d-4493-a3a5-bd228863cc36 560f321f-a8c7-4884-adb3-326433a61688:17bd1d5d-7d41-40ed-8ea2-97c2ec63896b f7766168-d234-491a-b27c-2c4a5eecbc99:d7e84f98-9931-435b-89fd-80f713d5ca47 ... }, will use setSpec &set=b41cdbfd-6af2-4a13-8ba6-59725565d445:adbab43d-b35d-4493-a3a5-bd228863cc36 in tests
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&set=b41cdbfd-6af2-4a13-8ba6-59725565d445:adbab43d-b35d-4493-a3a5-bd228863cc36 GET
PASS: responseDate has correct format: 2020-09-21T16:28:40Z
NOTE: Tried empty set, will look for set with items...
NOTE: Trying set &set=560f321f-a8c7-4884-adb3-326433a61688:17bd1d5d-7d41-40ed-8ea2-97c2ec63896b
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&set=560f321f-a8c7-4884-adb3-326433a61688:17bd1d5d-7d41-40ed-8ea2-97c2ec63896b GET
PASS: responseDate has correct format: 2020-09-21T16:28:41Z
PASS: Good ListIdentifiers response, extracted id 'oai:era.library.ualberta.ca:2cb91317-d9e3-4b6a-b3a1-7f96aa812a6e' for use in future tests.
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListMetadataFormats&identifier=oai%3Aera%2Elibrary%2Eualberta%2Eca%3A2cb91317-d9e3-4b6a-b3a1-7f96aa812a6e GET
PASS: responseDate has correct format: 2020-09-21T16:28:42Z
PASS: Good ListMetadataFormats response, includes oai_dc
PASS: Data provider supports oai_dc metadataPrefix
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=GetRecord&identifier=oai%3Aera%2Elibrary%2Eualberta%2Eca%3A2cb91317-d9e3-4b6a-b3a1-7f96aa812a6e&metadataPrefix=oai_dc GET
PASS: responseDate has correct format: 2020-09-21T16:28:43Z
PASS: Datestamp in GetRecord response (2020-09-03T19:06:35Z) has the correct form for seconds granularity.
PASS: Datestamp in GetRecord response (2020-09-03T19:06:35Z) matched the seconds granularity specified in the Identify response.
PASS: Expected setSpec was returned in the response
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&from=2020-09-03T19:06:35Z&until=2020-09-03T19:06:35Z&metadataPrefix=oai_dc GET
PASS: responseDate has correct format: 2020-09-21T16:28:44Z
PASS: Response is well formed
PASS: ListRecords response correctly included record with identifier oai:era.library.ualberta.ca:2cb91317-d9e3-4b6a-b3a1-7f96aa812a6e
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?junk GET
PASS: Error response correctly includes error code 'badVerb'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=junk GET
PASS: Error response correctly includes error code 'badVerb'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=GetRecord&metadataPrefix=oai_dc GET
PASS: Error response correctly includes error code 'badArgument'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=GetRecord&identifier=oai:era.library.ualberta.ca:2cb91317-d9e3-4b6a-b3a1-7f96aa812a6e GET
PASS: Error response correctly includes error code 'badArgument'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=GetRecord&identifier=invalid"id&metadataPrefix=oai_dc GET
PASS: Error response correctly includes error code 'idDoesNotExist'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListIdentifiers&until=junk GET
PASS: Error response correctly includes error code 'badArgument'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListIdentifiers&from=junk GET
PASS: Error response correctly includes error code 'badArgument'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListIdentifiers&resumptionToken=junk&until=2000-02-05 GET
PASS: Error response correctly includes error code 'badResumptionToken'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&metadataPrefix=oai_dc&from=junk GET
WARN: Bad HTTP status code from server: 500
FAIL: Can't parse malformed response.
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&resumptionToken=junk GET
PASS: Error response correctly includes error code 'badResumptionToken'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&metadataPrefix=oai_dc&resumptionToken=junk&until=1990-01-10 GET
PASS: Error response correctly includes error code 'badResumptionToken'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&metadataPrefix=oai_dc&until=junk GET
FAIL: Exception/error response did not contain error code 'badArgument'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords GET
PASS: Error response correctly includes error code 'badArgument'
WARN: Only 11 out of 13 error requests properly handled
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2002-02-05&until=2002-02-06T05:35:00Z GET
FAIL: Error code badArgument not found in response but should be given to the request: verb=ListRecords&metadataPrefix=oai_dc&from=2002-02-05&until=2002-02-06T05:35:00Z The request has different granularities for the from and until parameters.
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&metadataPrefix=oai_dc&until=2017-06-22T13:06:50Z GET
PASS: Error response correctly includes error code 'noRecordsMatch'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai POST verb:Identify
FAIL: POST test 1 was unsuccessful. Server returned HTTP Status: '422 Unprocessable Entity'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai POST identifier:oai:era.library.ualberta.ca:2cb91317-d9e3-4b6a-b3a1-7f96aa812a6e metadataPrefix:oai_dc verb:GetRecord
FAIL: POST test 2 was unsuccessful. Server returned HTTP Status: '422 Unprocessable Entity'
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&metadataPrefix=oai_dc GET
NOTE: Got resumptionToken vOT6kOeC8cqfv1zljKAHbj
REQUEST: https://era-app-stg-1.library.ualberta.ca/oai?verb=ListRecords&resumptionToken=vOT6kOeC8cqfv1zljKAHbj GET
PASS: Resumption tokens appear to work
Those tests that are failing are edge cases and can probably be put off for now. They should probably be addressed at some point though.
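Three of the list records failures involve the `from` and `until` arguments: `from=junk` returns a 500, `until=junk` does not produce `badArgument`, and mixed granularities are accepted. A possible pre-check is sketched below; the function names are hypothetical, and this is not how oaisys currently validates arguments.

```ruby
# Accepted OAI-PMH datestamp shapes: day (YYYY-MM-DD) or seconds
# (YYYY-MM-DDThh:mm:ssZ) granularity. These patterns check shape only,
# not calendar validity (e.g. month 13 would still match).
DAY_FORMAT     = /\A\d{4}-\d{2}-\d{2}\z/
SECONDS_FORMAT = /\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\z/

# Returns :day, :seconds, or nil for anything else (e.g. "junk").
def datestamp_granularity(value)
  return :day if DAY_FORMAT.match?(value)
  return :seconds if SECONDS_FORMAT.match?(value)

  nil
end

# Hypothetical check: returns 'badArgument' when from/until are malformed
# or use different granularities, and nil when the range is acceptable.
# Absent (nil) arguments are skipped, since both are optional.
def date_range_error(from, until_date)
  granularities = [from, until_date].compact.map { |v| datestamp_granularity(v) }
  return 'badArgument' if granularities.include?(nil)
  return 'badArgument' if granularities.uniq.size > 1

  nil
end
```

Running a check like this before querying would let the three failing requests return an OAI-PMH error response instead of a 500 or an unexpected success.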
I believe it is ready for metadata to verify the OAI output.
Looks good. I agree that those look like error-handling edge cases that we can put off for now, but maybe just open a ticket and record them there so that we don't lose track of them.
Once data has been migrated to the new Staging environment, we'll need to test the new OAI implementation to find any errors that may appear with Production-quality data.
Steps