nulib / avalon

Variations-on-Video Hydra app
Apache License 2.0

Remediate Chambers List/ master file move #376

Closed: carrickr closed this 6 years ago

carrickr commented 6 years ago

Subtask of #373

Chicago_Chamber_Musicians_Archive has a comma in the file name that ends up stored escaped (as %2C) in Solr and breaks the aws s3 cp command. Deal with this.
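To make the mismatch concrete, here is a minimal sketch in plain Ruby. The bucket and file name are hypothetical, and it assumes the object key in S3 carries the literal comma while Solr holds the percent-encoded form:

```ruby
# Hypothetical value as it might be stored in Solr (comma percent-encoded).
stored = 's3://example-bucket/Chicago_Chamber_Musicians_Archive/Trio%2C_No_1.mov'

# Assumption: the real S3 key has a literal comma, so the stored value
# must be decoded before it is handed to `aws s3 cp`.
actual = stored.gsub('%2C', ',')

# Any other percent-escape would cause the same kind of mismatch; list them.
other_escapes = actual.scan(/%\h\h/).uniq

puts actual
puts "remaining escapes: #{other_escapes.inspect}"
```

Scanning for leftover `%XX` sequences is a cheap way to spot other characters that may need the same treatment.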

carrickr commented 6 years ago

This code copies the Chambers list over and removes any commas in the filenames. It turns out both the collection names and the files have commas in them:

s = RSolr.connect url: 'http://solr.repo.vpc.rdc.library.northwestern.edu/solr//avalon'
r = (s.get 'select', params: {q: 'has_model_ssim:MasterFile', rows: 9999999})['response']['docs']

fix_list = []
r.each do |mf|
  fix_list << mf if mf['migrated_from_ssim'].blank?
end

chamber_list = []
fix_list.each do |i|
  chamber_list << i if i['file_location_ssi'].include? 'Chicago_Chamber_Musicians_Archive'
end

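chamber_list.each do |i|
  # Solr stores the comma escaped as %2C; decode it to get the source key.
  # Non-destructive gsub leaves the Solr document's own string untouched.
  old_location = i['file_location_ssi'].gsub('%2C', ',')
  new_location = "s3://preservation-cfx3wj9/avalon-masterfiles/#{i['isPartOf_ssim'].first}/#{i['file_location_ssi'].split('/').last}"
  # Strip both escaped and literal commas from the destination path.
  new_location = new_location.gsub('%2C', '').gsub(',', '')
  # Argument-list form of system: the paths are never parsed by a shell.
  system('aws', 's3', 'cp', old_location, new_location) unless FileLocator.new(new_location).exists?
end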

The copy is currently running.
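Before starting a bulk copy like this, it can help to print the commands for review first. A minimal sketch, assuming hypothetical source and destination paths and a hypothetical `copy_args` helper, that builds each command as an argument list and renders it with Ruby's standard Shellwords for a dry run:

```ruby
require 'shellwords'

# Hypothetical source/destination pairs; real values would come from Solr.
pairs = [
  ['s3://example-source/Chicago_Chamber_Musicians_Archive/Trio, No 1.mov',
   's3://example-dest/collection-1/Trio No 1.mov']
]

# Build the argument list for one copy; no shell is involved at this point.
def copy_args(source, dest)
  ['aws', 's3', 'cp', source, dest]
end

dry_run = true
pairs.each do |source, dest|
  args = copy_args(source, dest)
  if dry_run
    # shelljoin escapes each argument so the printed line is safe to paste.
    puts args.shelljoin
  else
    # Argument-list form: the paths are passed to aws without shell parsing.
    system(*args)
  end
end
```

Keeping the command as an array until the last moment means spaces or other special characters in a key cannot split it into extra arguments.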

carrickr commented 6 years ago

Code for doing the master file update and deleting the old copy:

chamber_list.each do |i|
  # Decode the escaped comma to get the source key that will be deleted.
  old_location = i['file_location_ssi'].gsub('%2C', ',')
  new_location = "s3://preservation-cfx3wj9/avalon-masterfiles/#{i['isPartOf_ssim'].first}/#{i['file_location_ssi'].split('/').last}"
  # Strip both escaped and literal commas from the destination path.
  new_location = new_location.gsub('%2C', '').gsub(',', '')
  # Only touch records whose copy exists at the new location.
  if FileLocator.new(new_location).exists?
    mf = MasterFile.find(i['id'])
    mf.file_location = new_location
    mf.masterFile = new_location
    mf.save
    puts "Updated #{i['id']}"
    # Remove the old copy; argument-list form avoids shell parsing of the path.
    system('aws', 's3', 'rm', old_location)
    mf.update_index
  end
end

carrickr commented 6 years ago

These look to be done. Checking for any that still point to the old location (other unescaped characters, etc.).
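One way such a check could look, as a rough sketch rather than what was actually run: the docs below are hypothetical stand-ins for a fresh Solr query, `needs_review?` is a made-up helper, and the prefix is the destination used in the scripts above.

```ruby
new_prefix = 's3://preservation-cfx3wj9/avalon-masterfiles/'

# Hypothetical docs standing in for a fresh Solr query result.
docs = [
  { 'id' => 'a1', 'file_location_ssi' => "#{new_prefix}collection-1/Trio_No_1.mov" },
  { 'id' => 'a2', 'file_location_ssi' => 's3://example-old/Chicago_Chamber_Musicians_Archive/Duo%27s.mov' },
  { 'id' => 'a3', 'file_location_ssi' => "#{new_prefix}collection-1/Quartet%2C_No_2.mov" }
]

# A doc needs another look if it sits outside the new prefix or still
# carries a percent-escape in its stored location.
def needs_review?(doc, prefix)
  location = doc['file_location_ssi'].to_s
  !location.start_with?(prefix) || location.match?(/%\h\h/)
end

leftovers = docs.select { |doc| needs_review?(doc, new_prefix) }
puts leftovers.map { |doc| doc['id'] }.inspect
```

With these sample docs, `a2` is flagged for pointing at the old location and `a3` for a leftover escape, while `a1` passes.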