daitss / core

DAITSS: Dark Archive In The Sunshine State
GNU General Public License v3.0

Descriptor fails daitss aip xml validation on Ripple after OS updates and patches #779

Closed szanati closed 8 years ago

szanati commented 8 years ago

I got the following error on Ripple, on a package that has archived many times before, after OS updates and patches were applied:

descriptor fails daitss aip xml validation (805 errors)

51,37: TargetNamespace.1: Expecting namespace 'info:lc/xmlns/premis-v2', but the target namespace of the schema document is 'http://www.loc.gov/premis/v3'.
55,77: cvc-elt.4.2: Cannot resolve 'representation' to a type definition for element 'object'.
[the same 51,37 TargetNamespace.1 error repeats many more times]
83,77: cvc-elt.4.2: Cannot resolve 'representation' to a type definition for element 'object'.

The patches list is below:

Updated: clamav-db-0.99-3.el5.x86_64
Updated: tzdata-java-2015g-1.el5.x86_64
Updated: kernel-headers-2.6.18-408.el5.x86_64
Updated: tzdata-2015g-1.el5.x86_64
Updated: glibc-common-2.5-123.el5_11.3.x86_64
Updated: glibc-2.5-123.el5_11.3.x86_64
Updated: openssl-0.9.8e-37.el5_11.x86_64
Updated: nspr-4.10.8-2.el5_11.x86_64
Updated: nss-3.19.1-2.el5_11.x86_64
Updated: openldap-2.3.43-29.el5_11.x86_64
Updated: 1:java-1.7.0-openjdk-1.7.0.91-2.6.2.1.el5_11.x86_64
Updated: clamav-0.99-3.el5.x86_64
Updated: 30:bind-libs-9.3.6-25.P1.el5_11.5.x86_64
Updated: libsndfile-1.0.17-7.el5.x86_64
Updated: libvolume_id-095-14.33.el5_11.x86_64
Updated: kpartx-0.4.7-64.el5_11.x86_64
Updated: device-mapper-multipath-0.4.7-64.el5_11.x86_64
Updated: 30:bind-utils-9.3.6-25.P1.el5_11.5.x86_64
Updated: clamd-0.99-3.el5.x86_64
Updated: 1:java-1.7.0-openjdk-devel-1.7.0.91-2.6.2.1.el5_11.x86_64
Updated: openldap-clients-2.3.43-29.el5_11.x86_64
Updated: nss-tools-3.19.1-2.el5_11.x86_64
Updated: nscd-2.5-123.el5_11.3.x86_64
Updated: udev-095-14.33.el5_11.x86_64
Updated: glibc-headers-2.5-123.el5_11.3.x86_64
Updated: openssl-devel-0.9.8e-37.el5_11.x86_64
Updated: clamav-devel-0.99-3.el5.x86_64
Updated: glibc-devel-2.5-123.el5_11.3.x86_64
Updated: openldap-devel-2.3.43-29.el5_11.x86_64
Installed: kernel-2.6.18-408.el5.x86_64
Installed: kernel-devel-2.6.18-408.el5.x86_64
Updated: glibc-2.5-123.el5_11.3.i686
Updated: openssl-0.9.8e-37.el5_11.i686
Updated: nspr-4.10.8-2.el5_11.i386
Updated: nss-3.19.1-2.el5_11.i386
Updated: openldap-2.3.43-29.el5_11.i386
Installed: pcre-6.6-9.el5.i386
Updated: clamav-0.99-3.el5.i386
Updated: libvolume_id-095-14.33.el5_11.i386
Updated: clamav-devel-0.99-3.el5.i386
Updated: openssl-devel-0.9.8e-37.el5_11.i386
Updated: glibc-devel-2.5-123.el5_11.3.i386
Updated: perl-Git-1.8.2.1-2.el5.x86_64
Updated: git-1.8.2.1-2.el5.x86_64

cchou commented 8 years ago

Looks like LOC updated their PREMIS schema. We will need to change those schema references in DAITSS.
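The TargetNamespace.1 errors mean the AIP descriptor still declares the PREMIS v2 namespace (info:lc/xmlns/premis-v2) while the schema now served from www.loc.gov declares targetNamespace http://www.loc.gov/premis/v3. A minimal illustrative fragment (the element layout and schemaLocation below are assumptions, not copied from an actual descriptor):

    <premis xmlns="info:lc/xmlns/premis-v2"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="info:lc/xmlns/premis-v2
                                http://www.loc.gov/standards/premis/premis.xsd">
      <!-- xsi:type="representation" resolves against the declared v2 namespace -->
      <object xsi:type="representation"> ... </object>
    </premis>

Because the document fetched from that schemaLocation URL now has targetNamespace http://www.loc.gov/premis/v3, the v2 references fail with TargetNamespace.1, and 'representation' can no longer be resolved to a type definition (cvc-elt.4.2).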

lydiam commented 8 years ago

The version of premis.xsd on darchive is an old one (version 2.2):

[lydiam@fclnx30 ~]$ ls -l /var/daitss/webcache/www.loc.gov/standards/premis/premis.xsd

-rw-r--r-- 1 daitss daitss 65130 May 27 2014 /var/daitss/webcache/www.loc.gov/standards/premis/premis.xsd

I saved off a copy in my /home/lydiam directory, and spoke with Bill K. about disabling Squid schema updates until we can resolve this situation.

I think that if we can replace the premis.xsd on ripple with the darchive version we’ll be OK. But, of course, we need to deal with DAITSS.

szanati commented 8 years ago

Thanks Carol.

lydiam commented 8 years ago

I had assumed that premis.xsd had been updated on ripple, but webcache on ripple still has the old version. It’s looking like DAITSS isn’t using webcache for premis.xsd:

-rw-rw-r-- 1 daitss daitss 65130 Oct 3 2013 /var/daitss/webcache/www.loc.gov/standards/premis/premis.xsd

The version of premis.xsd in webcache is still the old one, yet validation is failing against the new schema, so DAITSS can't be reading it from webcache.

Squid is running on ripple, but somehow AIP validation against premis.xsd is using the current version.

This makes no sense: why is there an old version in webcache, and why isn't DAITSS using that version?


lydiam commented 8 years ago

On darchive there is a script, /opt/fda/etc/daitss.daily/reload-squid-cache, that is "cronned" from the daitss.daily directory; premis.xsd is among the schemas it refreshes. From the script's comments:

Force squid to reload its cache of schemas. We do this by using curl against the proxy; it sets a 'Pragma: no-cache' header, which will force squid to reload these (it's possible to configure squid to ignore the no-cache headers - don't do that). We generated the list of schemas by mining the XmlResolution logs over the course of 5 months, which reports on the retrieved schemas - here's the command I used:

    grep Retrieved xmlresolution.log \
      | sed -e 's/.Retrieved //' -e 's/ for document file.//' \
      | sort -u

Typically we'll run this script every Wednesday, since we've configured squid to keep its cache for seven days; thus we'll be sure to have access to them over the weekend.

Since we run this in cron, we'll get emailed any error conditions. Normally it is silent.
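A minimal sketch of the kind of refresh those comments describe, assuming a local squid proxy and an abbreviated schema list (both are placeholders, not the actual script contents):

    #!/bin/sh
    # Refetch each schema through the proxy; the 'Pragma: no-cache' header
    # forces squid to reload its cached copy. The real script builds its URL
    # list from the XmlResolution logs as described above.
    PROXY=http://localhost:3128
    for url in \
        http://www.loc.gov/standards/premis/premis.xsd \
        http://www.loc.gov/standards/mods/v3/mods-3-4.xsd
    do
        curl --silent --show-error --output /dev/null \
             --proxy "$PROXY" --header 'Pragma: no-cache' "$url" \
            || echo "failed to refresh $url" >&2
    done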

The premis schema was updated on www.loc.gov on January 18:

 From: Digital-Preservation Announcement and Information List [mailto:DIGITAL-PRESERVATION@JISCMAIL.AC.UK] On Behalf Of Peter McKinney
 Sent: Monday, January 18, 2016 2:48 AM
 To: DIGITAL-PRESERVATION@JISCMAIL.AC.UK
 Subject: FW: PREMIS v3 schema finalised

 Dear PREMIS Implementers (with apologies for cross posting),

 The PREMIS 3.0 XML schema is now official, and is available at 

http://www.loc.gov/standards/premis/premis.xsd

  It is also available at both
  http://www.loc.gov/standards/premis/v3/premis.xsd   and
  http://www.loc.gov/standards/premis/v3/premis-v3-0.xsd 
  as it is our practice to maintain both the most recent minor version (in this case 3.0) for each major version within the directory for that major version (in this case, V3), as well as a copy of each minor version with explicit version number; in this case they are identical because this is the first minor version for major version V3. 

  Many thanks to Ray Denenberg at the Library of Congress for carrying out this work. 

  Best wishes,
  Peter

  Peter McKinney | Digital Preservation Policy Analyst | Information and Knowledge Services
  National Library of New Zealand Te Puna Mātauranga o Aotearoa

I don't quite understand the mining of the XMLResolution logs in the context of the Squid script, and how this will affect the retrieval of the premis.xsd schema. I also wonder where Squid's cache is.

In an old email from Carol regarding a schema question I found the following statement:

"We need to copy mod-3-3.xsd and mix20.xml to snow." so it may be that Squid's cache is kept on snow for darchive.

lydiam commented 8 years ago

I did find the premis.xsd schema on the snow server:

-rw-r--r--+ 1 cchou daitssdev 65130 Oct 2 2013 /www/schema/gov/www.loc.gov/standards/premis/premis.xsd

cchou commented 8 years ago

I remember both production and ripple are configured to use the squid cache; that is probably in /var/spool/squid? Squid has some setup rules that determine when to expire and refresh the schemas.

During the government shutdown in 2013, we copied the LOC schemas to schema.fcla.edu (the snow server?) and Dallas changed squid's configuration to redirect LOC schema requests to snow until the government shutdown was over.

LOC has updated their PREMIS schema to v3. I would suggest asking the sysadmins to look into the squid cache to determine which version of the PREMIS schema it holds. If ripple has been updated, production may be updated soon too. DAITSS may need to be redirected to snow until the AIP descriptor can generate PREMIS v3-conformant XML.
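One way to check which version the cache is serving (a sketch; the proxy host and port are assumptions) is to fetch the schema through squid and look at its target namespace:

    # fetch premis.xsd via the local squid proxy and report its targetNamespace
    curl --silent --proxy http://localhost:3128 \
         http://www.loc.gov/standards/premis/premis.xsd | grep targetNamespace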

lydiam commented 8 years ago

Squid on darchive did update the premis schema yesterday afternoon. On Friday Bill and I were attempting to figure out how/where Squid kept schemas, and we did not get to the point of determining that Squid cached schemas in /var/spool/squid. Today I looked at /var/spool/squid with Darryl’s assistance, and we found that the premis 3.0 schema was cached in a binary file with a numeric name. That would explain why we could not find a v3 premis.xsd schema on ripple. We’ve found it now, in /var/spool/squid/00/00/0000000D. So now we know what Squid is doing.
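For reference, squid's ufs store records the request URL inside each cache object, so one plausible way to locate the file (the cache path and pattern here are assumptions) is:

    # find which squid cache object holds the PREMIS schema
    grep -rl 'www.loc.gov/standards/premis/premis.xsd' /var/spool/squid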

What we don’t know is how to redirect Squid to look to snow for the PREMIS v2 schema. The previous version of the PREMIS schema can be found at http://www.loc.gov/standards/premis/v2/premis-v2-3.xsd. (The text file version of the PREMIS schema on our servers is v.2.2, the previous version. For whatever reason the webcache directory still exists on our servers and contains the other version of the schema, but it’s likely that it is no longer used.)

So the urgent question at hand is how to redirect Squid to go to snow for the v.2-3 version of premis.xsd. I see that snow has a v.2-1 of premis.xsd at: /www/schema/gov/www.loc.gov/standards/premis/premis.xsd. I can retrieve v2-3 and provide it.

Bill – who can work on Squid? I know you’re out sick today, and Darryl is unfamiliar with Squid.

Today we’re running fscks on /var/daitss on darchive, so that gives us a little bit of time to resolve this. Once the fscks are finished we will be able to process new packages to the point of AIP descriptor validation, at which point they’ll all error out. So production won’t be entirely halted, but individual packages won’t be able to run to completion.

Lydia


lydiam commented 8 years ago

I want to add that today I also had Darryl stop Squid on ripple, to see if possibly DAITSS would use the cached premis schema in ripple: /var/daitss/webcache/www.loc.gov/standards/premis/premis.xsd, but it did not. It worked for a while, and then came back with the same premis v3 error as before, so if Squid isn’t running perhaps DAITSS eventually goes out to the LOC site for the schema. (If it doesn’t do that I have no idea where DAITSS is getting the premis v3 schema.)

Lydia


lydiam commented 8 years ago

I will take a look shortly, and see what I can find.

Bill

Sent from my phone.


szanati commented 8 years ago

I just submitted a new benchmark package on Ripple after squid was stopped on ripple and got a different message:

bad status http://xmlresolution.ripple.fcla.edu/ieids/E6MP1TBMK_JANOW8: 500

Phusion Passenger

Ruby (Rack) application could not be started

The application has exited during startup (i.e. during the evaluation of
config/environment.rb).

    The error message can be found below. To solve this problem, please
    follow any instructions in the error message.

Error message:

Proxy connection error. Resolver service will abort.

Application root:

        /opt/web-services/sites/xmlresolution/current

Backtrace:

#   File    Line    Location
0   /opt/web-services/sites/xmlresolution/releases/20140603142939/app.rb    56  in `abort'
1   /opt/web-services/sites/xmlresolution/releases/20140603142939/app.rb    56  in `rescue in setTestProxy'
2   /opt/web-services/sites/xmlresolution/releases/20140603142939/app.rb    46  in `setTestProxy'
3   /opt/web-services/sites/xmlresolution/releases/20140603142939/app.rb    75  in `block in '
4   /opt/web-services/sites/xmlresolution/shared/bundle/ruby/1.9.1/gems/sinatra-1.4.5/lib/sinatra/base.rb   1402    in `configure'
5   /opt/web-services/sites/xmlresolution/shared/bundle/ruby/1.9.1/gems/sinatra-1.4.5/lib/sinatra/base.rb   1982    in `block (2 levels) in delegate'
6   /opt/web-services/sites/xmlresolution/releases/20140603142939/app.rb    62  in `'
7   config.ru   2   in `require'
8   config.ru   2   in `block in '
9   /opt/web-services/sites/xmlresolution/shared/bundle/ruby/1.9.1/gems/rack-1.5.2/lib/rack/builder.rb  55  in `instance_eval'
10  /opt/web-services/sites/xmlresolution/shared/bundle/ruby/1.9.1/gems/rack-1.5.2/lib/rack/builder.rb  55  in `initialize'
11  config.ru   1   in `new'
12  config.ru   1   in `'
13  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/rack/application_spawner.rb    225     in `eval'
14  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/rack/application_spawner.rb    225     in `load_rack_app'
15  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/rack/application_spawner.rb    75  in `block (2 levels) in spawn_application'
16  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/utils.rb   563     in `report_app_init_status'
17  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/rack/application_spawner.rb    73  in `block in spawn_application'
18  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/utils.rb   470     in `safe_fork'
19  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/rack/application_spawner.rb    64  in `spawn_application'
20  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/spawn_manager.rb   264     in `spawn_rack_application'
21  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/spawn_manager.rb   137     in `spawn_application'
22  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/spawn_manager.rb   275     in `handle_spawn_application'
23  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/abstract_server.rb     357     in `server_main_loop'
24  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/lib/phusion_passenger/abstract_server.rb     206     in `start_synchronously'
25  /opt/ruby-1.9.3-p545/lib/ruby/gems/1.9.1/gems/passenger-3.0.21/helper-scripts/passenger-spawn-server    99  in `

trace

/opt/web-services/sites/core/releases/20141119194610/lib/mixin/curb.rb:10:in `error'
/opt/web-services/sites/core/releases/20141119194610/lib/daitss/service/xmlres.rb:12:in `put_collection'
/opt/web-services/sites/core/releases/20141119194610/lib/daitss/proc/wip/preserve.rb:45:in `block in preserve'
/opt/web-services/sites/core/releases/20141119194610/lib/daitss/proc/wip/journal.rb:16:in `step'
/opt/web-services/sites/core/releases/20141119194610/lib/daitss/proc/wip/preserve.rb:43:in `preserve'
/opt/web-services/sites/core/releases/20141119194610/lib/daitss/proc/wip/ingest.rb:33:in `ingest'
/opt/web-services/sites/core/releases/20141119194610/lib/daitss/proc/wip/process.rb:82:in `block in spawn'
/opt/web-services/sites/core/releases/20141119194610/lib/daitss/proc/wip/process.rb:66:in `fork'
/opt/web-services/sites/core/releases/20141119194610/lib/daitss/proc/wip/process.rb:66:in `spawn'
/opt/web-services/sites/core/current/bin/pulse:161:in `block in start_wips'
/opt/web-services/sites/core/current/bin/pulse:158:in `each'
/opt/web-services/sites/core/current/bin/pulse:158:in `start_wips'
/opt/web-services/sites/core/current/bin/pulse:194:in `block in '
/opt/web-services/sites/core/current/bin/pulse:192:in `loop'
/opt/web-services/sites/core/current/bin/pulse:192:in `

cchou commented 8 years ago

I don't think /var/daitss/webcache/ is used by DAITSS anymore; we changed DAITSS to use squid a long time ago.

Dallas did the LOC URI redirect in squid to the snow server, so perhaps look into the squid configuration for how to redirect LOC. I found Dallas's note in my email.

"This should now be fixed, modulo any missing older .xsd files which weren't in our schema backups.

Notes for future reference:

DAITSS squid daemons run with the default cache configuration, which should be:

cache_dir ufs /var/spool/squid 100 16 256

This specifies a "ufs" (standard squid disk storage) format, 100 megabytes max disk storage, spread across 16+256 directories, using its standard hashing algorithm. The file<->URI map is written into /var/spool/squid/swap.state, but from what I've read is more of a memory-map image used by the squid process. Thus, attempting to restore /var/spool/squid including the swap.state from backup may not result in a usable cache.

Thus, we took an alternate route, and enabled squid "redirection" of URIs.

redirect_program /usr/local/sbin/squid-redirect.pl

This is a simple perl program which rewrites any URI containing http://www.loc.gov to a new URI on http://schema.fcla.edu. For some reason, it didn't like being rewritten to web.archive.org URIs -- possibly due to the high number of 302 redirects they use.

Once LOC is functional again, we need to comment out the redirect_program from the squid config on fclnx30 and ripple, then restart the squid daemons.

Dallas"
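For illustration, a redirect helper of the kind the note describes might look like the following sketch. This is not the actual squid-redirect.pl (which is Perl); it is a shell sketch of the squid redirector protocol, and the assumption that LOC paths map unchanged onto schema.fcla.edu is mine:

    #!/bin/sh
    # Squid passes one request per line on stdin ("URL client ident method")
    # and reads the possibly-rewritten URL back on stdout.
    while read url rest; do
        case "$url" in
            http://www.loc.gov/*)
                # rewrite LOC schema requests to the mirror on schema.fcla.edu
                echo "http://schema.fcla.edu${url#http://www.loc.gov}" ;;
            *)
                echo "$url" ;;
        esac
    done

Such a helper is wired in with the redirect_program directive shown above.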

szanati commented 8 years ago

We found the redirect_program on ripple, and Darryl uncommented the redirect to snow. I was able to archive the package on ripple.

lydiam commented 8 years ago

I’ve copied the MODS v3-5 and v3-6 schemas onto snow. The current MODS schemas on snow are: mods-3-0.xsd, mods-3-2.xsd, mods-3-3.xsd, mods-3-4.xsd, mods-3-5.xsd, and mods-3-6.xsd. That’s probably all we’ll need, because MODS v3-6 is the most recent version.

Lydia

szanati commented 8 years ago

The redirect_program has been uncommented on production to redirect to snow. I tested several packages, and they archived.

cchou commented 8 years ago

A code enhancement to make DAITSS compatible with PREMIS v3 has been committed, tested, and rolled out to production.