Closed carrickr closed 6 years ago
This code copies the chambers list over and removes any comma (,) in the filename. It turns out both the collection names and the files have commas in them:
s = RSolr.connect url: 'http://solr.repo.vpc.rdc.library.northwestern.edu/solr//avalon'
r = (s.get 'select', params: {q: 'has_model_ssim:MasterFile', rows: 9999999})['response']['docs']
fix_list = []
r.each do |mf|
  fix_list << mf if mf['migrated_from_ssim'].blank?
end
chamber_list = []
fix_list.each do |i|
  chamber_list << i if i['file_location_ssi'].include? 'Chicago_Chamber_Musicians_Archive'
end
chamber_list.each do |i|
  old_location = i['file_location_ssi']
  old_location.gsub! '%2C',','
  new_location = "s3://preservation-cfx3wj9/avalon-masterfiles/#{i['isPartOf_ssim'].first}/#{i['file_location_ssi'].split('/').last}"
  new_location.gsub! '%2C',''
  new_location.gsub! '%2C',','
  cmd = "aws s3 cp #{old_location} #{new_location}"
  system(cmd) unless FileLocator.new(new_location).exists?
end
The copy is currently running.
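A side note on the copy loop, as a minimal sketch rather than what was actually run: old_location is the same String object that sits in the Solr doc, so gsub! also rewrites i['file_location_ssi'] before new_location is built from it. The example below shows that aliasing with a made-up doc, plus a hypothetical s3_cp_argv helper that returns the command as an argument list. Passing a list to system skips the shell, so spaces and commas in keys need no quoting. The bucket and file names are placeholders.

```ruby
# Made-up Solr doc; the bucket and key are placeholders.
doc = { 'file_location_ssi' => 's3://example-bucket/dir/Trio%2C Op 1.wav'.dup }

# gsub! edits the String held by the hash, because both names point at it.
aliased = doc['file_location_ssi']
aliased.gsub!('%2C', ',')
mutated_value = doc['file_location_ssi'] # now contains a literal comma

# gsub (no bang) returns a new String and leaves the doc untouched.
fresh_doc = { 'file_location_ssi' => 's3://example-bucket/dir/Trio%2C Op 1.wav'.dup }
decoded = fresh_doc['file_location_ssi'].gsub('%2C', ',')

# Hypothetical helper: argument list for `aws s3 cp`.
def s3_cp_argv(source, destination)
  ['aws', 's3', 'cp', source, destination]
end

argv = s3_cp_argv(decoded, 's3://example-bucket/new/Trio, Op 1.wav')
puts argv.inspect
# To run it for real: system(*argv)
```

Whether the destination key should keep the comma is a separate decision; the sketch only shows how to avoid the in-place edit and the shell quoting.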
Code for doing the master file update and deleting the old copy:
chamber_list.each do |i|
  old_location = i['file_location_ssi']
  old_location.gsub! '%2C',','
  new_location = "s3://preservation-cfx3wj9/avalon-masterfiles/#{i['isPartOf_ssim'].first}/#{i['file_location_ssi'].split('/').last}"
  new_location.gsub! '%2C',''
  new_location.gsub! '%2C',','
  if FileLocator.new(new_location).exists?
    mf = MasterFile.find(i['id'])
    mf.file_location = new_location
    mf.masterFile = new_location
    mf.save
    puts "Updated #{i['id']}"
    cmd = "aws s3 rm #{old_location}"
    system(cmd)
    mf.update_index
  end
end
These look to be done. Checking for any that still point to the old location (other unescaped characters, etc.).
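One way that check could look, as a rough sketch only: given docs shaped like the Solr results above, list the ones whose file_location_ssi does not start with the new prefix, and separately the ones that still contain a percent-escape. The helper names not_yet_moved and still_escaped are hypothetical, and the sample docs are made up. In practice the docs would come from a fresh Solr query.

```ruby
# Prefix of the new location used in the scripts above.
NEW_PREFIX = 's3://preservation-cfx3wj9/avalon-masterfiles/'

# Hypothetical helper: docs whose location is not under the new prefix.
def not_yet_moved(docs, new_prefix = NEW_PREFIX)
  docs.reject { |doc| doc['file_location_ssi'].to_s.start_with?(new_prefix) }
end

# Hypothetical helper: docs whose location still holds a %XX escape.
def still_escaped(docs)
  docs.select { |doc| doc['file_location_ssi'].to_s.match?(/%\h\h/) }
end

# Made-up sample docs for illustration.
sample_docs = [
  { 'id' => 'a1', 'file_location_ssi' => "#{NEW_PREFIX}coll1/file one.wav" },
  { 'id' => 'b2', 'file_location_ssi' => 's3://example-old-bucket/dir/file%28two%29.wav' },
  { 'id' => 'c3', 'file_location_ssi' => "#{NEW_PREFIX}coll1/file%2Cthree.wav" }
]

puts "Not moved: #{not_yet_moved(sample_docs).map { |d| d['id'] }.inspect}"
puts "Escaped:   #{still_escaped(sample_docs).map { |d| d['id'] }.inspect}"
```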
Subtask of #373
Chicago_Chamber_Musicians_Archive has a comma (,) in the file name that ends up stored escaped in Solr and messed up the aws s3 cp command. Deal with this.
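For illustration, a minimal sketch of decoding every percent-escape in a stored location instead of handling %2C alone. It assumes the stored value is ordinary percent-encoding. The helper name decode_location and the example path are hypothetical. URI.decode_www_form_component also turns "+" into a space, so the sketch protects literal plus signs first.

```ruby
require 'uri'

# Hypothetical helper: decode all %XX escapes in a stored location.
# A stray '%' not followed by two hex digits raises ArgumentError.
def decode_location(location)
  # Keep literal '+' characters; the decoder would otherwise make them spaces.
  URI.decode_www_form_component(location.gsub('+', '%2B'))
end

# Placeholder path, not a real object.
puts decode_location('s3://example-bucket/dir/Trio%2C%20Op%201.wav')
```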