instructure / canvas-lms

The open LMS by Instructure, Inc.
https://github.com/instructure/canvas-lms/wiki
GNU Affero General Public License v3.0
5.71k stars · 2.52k forks

Move Canvas LMS files from tmp/files to Amazon S3 #1313

Open EugeneWHZ opened 6 years ago

EugeneWHZ commented 6 years ago

Hi Canvas Developers!

I need help with migrating from local file storage to Amazon S3. It looks like the file structure on Amazon S3 differs from the file structure in tmp/files.

Please advise how to convert from the tmp/files layout to the S3 bucket structure. I saw a similar request at https://groups.google.com/forum/#!topic/canvas-lms-users/Vh_xYj2J8uk with no solution.

Thanks! Eugene

amulgarg commented 6 years ago

Hi

We did something like this, with two environment variables set:

  MIGRATION_SOURCE_DIRECTORY = "tmp/files"
  MIGRATION_DESTINATION_DIRECTORY = "YOUR DESIRED DIRECTORY"

  # Copies each attachment (and its thumbnail, if present) from the
  # local tmp/files layout into a directory tree matching the S3 key
  # structure: <namespace>/attachments/<id>/<filename>.
  def self.create_folder_structure
    attachments = Attachment.all
    path_prefix = "#{ENV['MIGRATION_DESTINATION_DIRECTORY']}/"
    attachments.each do |attachment|
      src = attachment.filename
      # Local storage pads the attachment id with leading zeros.
      padded_attachment_id = attachment.id.to_s.rjust(4, '0')
      source = "#{ENV['MIGRATION_SOURCE_DIRECTORY']}/#{padded_attachment_id}/#{src}"
      puts "source #{source}"
      attachment_path = "#{path_prefix}#{attachment.namespace}/attachments/#{attachment.id}/"
      puts "destination #{attachment_path}"
      if File.exist?(source)
        puts "FILE"
        FileUtils.mkdir_p(attachment_path)
        FileUtils.cp(source, attachment_path)
      end
      thumb_name = Thumbnail.find_by_parent_id(attachment.id).filename rescue nil
      if thumb_name
        thumb_source = "#{ENV['MIGRATION_SOURCE_DIRECTORY']}/#{padded_attachment_id}/#{thumb_name}"
        puts "THUMBNAIL"
        # User-context attachments may instead use:
        #   thumbnail_path = "#{path_prefix}/thumbnails/#{attachment.id}/"
        thumbnail_path = "#{path_prefix}#{attachment.namespace}/public/thumbnails/#{attachment.id}/"
        if File.exist?(thumb_source)
          FileUtils.mkdir_p(thumbnail_path)
          FileUtils.cp(thumb_source, thumbnail_path)
        end
      end
    end
    puts "Finished creation of folder structure"
  end
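
To make the mapping concrete, here is the transformation the script performs, in isolation (a minimal sketch with hypothetical helper names and example values; real values come from Attachment records, and note that some installs partition tmp/files into two folder levels, e.g. 0000/0020, as later comments in this thread show):

```ruby
# Local storage keys a file by a zero-padded attachment id; the
# S3-style layout keys it by namespace and the raw id.
def local_path(root, id, filename)
  "#{root}/#{id.to_s.rjust(4, '0')}/#{filename}"
end

def s3_style_path(prefix, namespace, id, filename)
  "#{prefix}#{namespace}/attachments/#{id}/#{filename}"
end

puts local_path('tmp/files', 304, 'report.pdf')
# tmp/files/0304/report.pdf
puts s3_style_path('migrated/', 'account_1', 304, 'report.pdf')
# migrated/account_1/attachments/304/report.pdf
```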

EugeneWHZ commented 6 years ago

Hi! Can you please explain for a non-developer how to run this? Thank you!

RitaBu commented 6 years ago

Did anyone manage to solve this? @EugeneWHZ?

rizkyekoputra commented 5 years ago

Hi, any update on this? How can it be solved?

chewbakartik commented 5 years ago

I had a slightly different approach that I took to get the files copied over that hopefully will help someone else in a bind.

WARNING: This method doesn't retain the thumbnails for images (in the file explorer view). For us, that wasn't a big deal to lose. If you're concerned about that, it might be best to sort out how to run the Ruby script above.

  • First, I downloaded and installed the aws-cli application for the server. For that, follow the instructions here (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html).
  • Second, you need to configure aws-cli so that it has the proper credentials for the AWS account that holds your S3 bucket.
  • Third, you need to set up the correct folders in your bucket, so navigate to S3, open your bucket, and add the folder path account_1/attachments.
  • Fourth, navigate to your file directory, [canvas-home]/tmp/files.

The next piece is the actual command for copying the files. It's important to note that you're going to have to go into each of the folders inside the files folder to run it for the next batch. It's a simple bash command that loops over each folder and pushes its contents to your bucket.

So first go into 0000, then execute: for i in *; do aws s3 cp $i s3://{CHANGE_TO_YOUR_BUCKET}/account_1/attachments/"$i" --recursive; done

Next go into 0001, then execute: for i in *; do aws s3 cp $i s3://{CHANGE_TO_YOUR_BUCKET}/account_1/attachments/1"$i" --recursive; done

It's REALLY important to notice the difference between these two commands: the last digit of your file folder becomes the prefix for the S3 key the attachment is stored under. Note the 1 at the end of the folder name 0001; there is also a 1 right after the attachments/ piece. For each subsequent folder you go into, you need to modify that number to match the number of your file folder.

For 0002 it becomes: for i in *; do aws s3 cp $i s3://{CHANGE_TO_YOUR_BUCKET}/account_1/attachments/2"$i" --recursive; done

I hope this helps someone else out in the future. The Ruby script above helped inspire the pieces of this command, and there was some trial and error along the way, but I was really happy to have a bash command I could execute.

I usually run this under screen, so that my session ending won't cause this to stop working.
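
The manual per-folder loop above could also be scripted. Below is a sketch in Ruby (the bucket name and the account_1 namespace are placeholders taken from the commands above) that walks a two-level tmp/files-style tree and prints, rather than runs, the corresponding aws-cli commands so they can be reviewed first:

```ruby
require 'pathname'
require 'fileutils'

# Build (local_folder, s3_destination) pairs for every two-level
# attachment folder under root. The attachment id is the two folder
# names concatenated, with leading zeros stripped.
def s3_copy_plan(root, bucket)
  plan = []
  Pathname.new(root).children.select(&:directory?).sort.each do |outer|
    outer.children.select(&:directory?).sort.each do |inner|
      id = (outer.basename.to_s + inner.basename.to_s).sub(/\A0+/, '')
      plan << [inner.to_s, "s3://#{bucket}/account_1/attachments/#{id}"]
    end
  end
  plan
end

# Dry run against a small demo tree: print the commands for review
# instead of executing them.
FileUtils.mkdir_p('demo_files/0001/0020')
s3_copy_plan('demo_files', 'my-bucket').each do |src, dest|
  puts "aws s3 cp #{src} #{dest} --recursive"
end
# aws s3 cp demo_files/0001/0020 s3://my-bucket/account_1/attachments/10020 --recursive
```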

schrink commented 4 years ago

Thank you @chewbakartik. Just one small correction.

> So first go into 0000, then execute: for i in *; do aws s3 cp $i s3://{CHANGE_TO_YOUR_BUCKET}/account_1/attachments/"$i" --recursive; done

This first one will not actually work, as it copies the folders with their leading zeros kept (i.e. 0001 when it should be just 1).

This is how I did the first one:

for i in *; do rclone copy $i aws:saebelgrade/account_1/attachments/$(echo $i | sed 's/^0*//'); done

The other two seem to be good.

HusamAjour commented 4 years ago

> Thank you @chewbakartik. Just one small correction. […] The other two seem to be good.

This was very helpful. Thank you!

HusamAjour commented 4 years ago

> I had a slightly different approach that I took to get the files copied over that hopefully will help someone else in a bind. […]

Saved my life man. Thanks a lot!

xzykz22 commented 4 years ago

> Hi
>
> We did something like this: MIGRATION_SOURCE_DIRECTORY = "tmp/files" […]

Has anyone managed to run this script?

happydiass commented 3 years ago

> I had a slightly different approach that I took to get the files copied over that hopefully will help someone else in a bind. […]

Thank you! I used this loop and it works perfectly.

For tmp/files/0000/0020/file.txt:

  for i in *; do aws s3 cp $i s3://mybucket/account_1/attachments/$(echo $i | sed 's/^0*//') --recursive; done

Result: s3://mybucket/account_1/attachments/20/file.txt

For tmp/files/0001/0020/file.txt:

  for i in *; do aws s3 cp $i s3://mybucket/account_1/attachments/1"$i" --recursive; done

Result: s3://mybucket/account_1/attachments/10020/file.txt

For tmp/files/0020/0999/file.txt:

  for i in *; do aws s3 cp $i s3://mybucket/account_1/attachments/20"$i" --recursive; done

Result: s3://mybucket/account_1/attachments/200999/file.txt
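
The pattern in these three examples reduces to one rule: the S3 key segment is the attachment id, obtained by concatenating the two folder names and dropping leading zeros. A quick Ruby check of the examples above (the helper name is hypothetical):

```ruby
# The S3 path segment is the two folder levels joined, minus leading
# zeros; i.e. the numeric attachment id as a string.
def attachment_id(outer_folder, inner_folder)
  (outer_folder + inner_folder).sub(/\A0+/, '')
end

puts attachment_id('0000', '0020')  # 20
puts attachment_id('0001', '0020')  # 10020
puts attachment_id('0020', '0999')  # 200999
```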