aws / aws-cli

Universal Command Line Interface for Amazon Web Services

Why aws s3 cp does not accept multiple sources? #1542

Open quiver opened 9 years ago

quiver commented 9 years ago

This is a feature request.

It would be great if s3 cp command accepts multiple sources just like bash cp command. For example

$ aws s3 cp a b s3://BUCKET/
upload: ./a to s3://BUCKET/a
upload: ./b to s3://BUCKET/b
$ aws s3 cp a* s3://BUCKET/
upload: ./a1 to s3://BUCKET/a1
upload: ./a2 to s3://BUCKET/a2
rayluo commented 9 years ago

We can take this into consideration. Until then, your workaround would be to use a shell script to achieve a similar effect.
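
For example, a minimal sketch of such a shell loop, using the two files from the example above:

for f in a b; do
  aws s3 cp "$f" s3://BUCKET/
done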

ejoncas commented 8 years ago

This would be cool, especially if the CLI could handle those copies in parallel. For example, at the moment I want to copy 13K files from different S3 locations. They are all in the same bucket, but they are not in the same folder, so I have to write one 'aws s3 cp' command per file, and it takes a lot of time to run.

The commands that I'm running are something like this:

aws s3 cp s3://example-bucket/0-200M/A.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/1000M-1500M/B.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/another-dir/C.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/0-200M/D.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/1000M-1500M/E.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/another-dir/F.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/another-dir/H.json.gz s3://example-bucket/output-dir/
... 13K lines more with the same command, just changing the input s3 file..

This approach takes a lot of time. Is there any workaround for this kind of issue? If not, I think the tool should support a batch cp where you can specify a list (or maybe a file) of all the files that you want to copy.
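
One shell-level workaround in the meantime is to list the source URIs in a file (a hypothetical files_to_copy.txt, one s3:// URI per line) and fan the copies out with xargs, which also gives the parallelism mentioned above:

# -P 8 runs up to 8 aws s3 cp processes in parallel (GNU or BSD xargs)
xargs -P 8 -I {} aws s3 cp {} s3://example-bucket/output-dir/ < files_to_copy.txt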

greyhammer commented 7 years ago

I agree. I am doing something identical to @ejoncas, and while this isn't bad, the timeout in between each cp task makes this a several-hour process.

taha commented 7 years ago

Any updates on this?

yyolk commented 7 years ago

Besides scripting a loop for aws s3 cp, I've used aws s3 sync to accomplish this:

aws s3 sync --exclude=* --include=a* s3://bucket/

You can provide multiple --exclude and --include flags. Above, I'm excluding everything and then including what I want.

aviatorBeijing commented 7 years ago

I'd say that without support for copying multiple files, the CLI is seriously crippled. There are literally no justifiable reasons for not supporting this; it is merely laziness on the part of the AWS engineers and bad project management of the AWS CLI. No excuses! Shame on you, AWS CLI folks.

ozbillwang commented 7 years ago

To fix the aws: error: too few arguments issue in @yyolk's command:

Suppose you need to sync the current folder to the S3 bucket; add . as the source.

aws s3 sync --exclude=* --include=a* . s3://bucket/
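
It is also safer to quote the filter patterns so the shell never tries to expand them (zsh, for example, aborts on an unmatched glob by default):

aws s3 sync . s3://bucket/ --exclude "*" --include "a*"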
ASayre commented 6 years ago

Good Morning!

We're closing this issue here on GitHub, as part of our migration to UserVoice for feature requests involving the AWS CLI.

This will let us get the most important features to you, by making it easier to search for and show support for the features you care the most about, without diluting the conversation with bug reports.

As a quick UserVoice primer (if not already familiar): after an idea is posted, people can vote on the ideas, and the product team will be responding directly to the most popular suggestions.

We’ve imported existing feature requests from GitHub - Search for this issue there!

And don't worry, this issue will still exist on GitHub for posterity's sake. As it’s a text-only import of the original post into UserVoice, we’ll still be keeping in mind the comments and discussion that already exist here on the GitHub issue.

GitHub will remain the channel for reporting bugs.

Once again, this issue can now be found by searching for the title on: https://aws.uservoice.com/forums/598381-aws-command-line-interface

-The AWS SDKs & Tools Team

This entry can specifically be found on UserVoice at: https://aws.uservoice.com/forums/598381-aws-command-line-interface/suggestions/33168382-why-aws-s3-cp-does-not-accept-multiple-sources

jamesls commented 6 years ago

Based on community feedback, we have decided to return feature requests to GitHub issues.

oren-icx commented 5 years ago

Any update on this? Would still be great.

sp2410 commented 5 years ago

I think sync and cp are like a sword and a needle: they have different use cases. @ejoncas, your case was similar to mine; it's a use case for copy, not sync. Store the paths of the individual files, one per line, in a separate file called file_with_all_paths.txt.

Something like this:

s3://example-bucket/0-200M/A.json.gz
s3://example-bucket/1000M-1500M/B.json.gz
s3://example-bucket/another-dir/C.json.gz
...

A bash loop can read that file line by line and run the cp command for each path:

for f in $(cat ~/path_to_the_file/file_with_all_paths.txt); do echo "Now moving file $f"; aws s3 cp "$f" s3://example-bucket/output-dir/; done
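
If any of the paths contain spaces, a while/read loop is a safer variant of the same idea (reading the same file as above):

while IFS= read -r f; do
  echo "Now moving file $f"
  aws s3 cp "$f" s3://example-bucket/output-dir/
done < ~/path_to_the_file/file_with_all_paths.txt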

Although I am also a beginner, I did write a blog on how I accomplished it. Check it out here: http://www.onceaday.today/subjects/15/posts/152. If it helps someone, great!

FlorinAndrei commented 5 years ago

--include and --exclude are cute and useful, but only when there's a discernible pattern to the file names. If it's just a random-looking list of names, they're useless.

What would help tremendously would be the ability to read a list of source files from a file, or to just accept multiple source files as arguments - but reading the whole list from a file would be much more powerful.

aws s3 cp --source-files long_list.txt s3://bucket_name/

This needs to work with source files that are either local or in a bucket.

The CLI would then absolutely need to do batch copies, if the API allows it.

sunnytambi commented 5 years ago

My suggestions:

  1. aws s3 cp --source-files long_list.txt s3://bucket_name/
  2. aws s3 cp "file1.xls,file2.jpg,file3.txt,file4.html" s3://bucket_name/
kahnchana commented 4 years ago

Has there been an update enabling this yet?

brianngo333 commented 4 years ago

Thanks a lot @ejoncas for your answer, which helped me solve my problem! <3

tim-finnigan commented 2 years ago

I'd also like to see something like a --source-files parameter, but I found this one-line bash loop to be a useful workaround for now:

for file in $(cat filenames.txt); do aws s3 cp "$file" s3://bucket-name; done

(very similar to what @sp2410 mentioned earlier)