kpfaulkner / azurecopy

copy blobs between azure, s3 and local storage
Apache License 2.0

Error Copying Large Number of Files #7

Closed nightwallaby closed 7 years ago

nightwallaby commented 8 years ago

Hi,

I've tried copying ~1.5TB worth of S3 files to an Azure target and received the debug error below. There are ~6 million files in it, so maybe it was too much to handle? I saw the process use up to ~4GB of memory.

If I target a much smaller virtual directory it works.

PS C:\admin\azurecopy-1.1.5\azurecopy> .\azurecopy.exe -i https://s3.amazonaws.com/media/photos/ -o https://media.blob.core.windows.net/photos -v -db -blobcopy
GetHandler start
GetHandler retrieved azurecopy.S3Handler
GetHandler start
GetHandler retrieved azurecopy.AzureHandler
Unknown error generated. Please report to Github page https://github.com/kpfaulkner/azurecopy/issues . Can view underlying stacktrace by adding -db flag.
   at Amazon.Runtime.Internal.Unmarshaller.Unmarshall(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.S3.Internal.AmazonS3ResponseHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Signer.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CredentialsRetriever.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.S3.Internal.AmazonS3KmsHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointResolver.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.S3.Internal.AmazonS3PostMarshallHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Marshaller.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.S3.Internal.AmazonS3PreMarshallHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.S3.Internal.AmazonS3ExceptionHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.MetricsHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RuntimePipeline.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.AmazonServiceClient.Invoke[TRequest,TResponse](TRequest request, IMarshaller`2 marshaller, ResponseUnmarshaller unmarshaller)
   at Amazon.S3.AmazonS3Client.ListObjects(ListObjectsRequest request)
   at azurecopy.S3Handler.ListBlobsInContainer(String containerName, String blobPrefix, Boolean debug)
   at azurecopycommand.Program.GetSourceBlobList(IBlobHandler inputHandler)
   at azurecopycommand.Program.DoNormalCopy()
   at azurecopycommand.Program.Process(Boolean debugMode)
   at azurecopycommand.Program.Main(String[] args)

Cheers, Simon

kpfaulkner commented 8 years ago

Hi

I certainly haven't tried with volumes that large. Hmmm, looking at the stack trace you supplied, it's the Amazon client library that is dying when trying to get the list of S3 blobs. The only thing I can suggest is to perform multiple copies, each targeting a subdirectory.

i.e. if you have /media/photogroup1/.... and /media/photogroup2/...., then try 2 separate commands, one on each virtual directory.
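As a rough illustration of that workaround, the sketch below builds one azurecopy command line per virtual directory. It is only a sketch under stated assumptions: the helper name `build_commands`, the base URLs, and the destination layout are hypothetical, and only the `-i` and `-o` flags that appear in this thread are used.

```python
def build_commands(source_base, dest_base, prefixes):
    """Return one azurecopy argument list per virtual directory.

    Hypothetical helper: it only assembles command lines, it does not run them.
    Every URL ends with "/" so that it is treated as a directory.
    """
    commands = []
    for prefix in prefixes:
        name = prefix.strip("/")
        commands.append([
            "azurecopy.exe",
            "-i", f"{source_base.rstrip('/')}/{name}/",
            "-o", f"{dest_base.rstrip('/')}/{name}/",
        ])
    return commands


if __name__ == "__main__":
    # Directory names are taken from the example above; the base URLs are assumed.
    for cmd in build_commands(
        "https://s3.amazonaws.com/media",
        "https://media.blob.core.windows.net/photos",
        ["photogroup1", "photogroup2"],
    ):
        print(" ".join(cmd))
```

Each printed line can then be run on its own, so a failure in one directory does not affect the others.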

I can investigate if there is an alternative way to get the list of blobs (maybe I could segment it within azurecopy itself), but that may take a little while to sort out.
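One way such segmenting might look is sketched below. S3 list requests return a limited page of keys at a time and flag whether the listing was truncated, so a caller can process one page before asking for the next instead of holding millions of entries in memory. This is not azurecopy's code: `fetch_page` and `make_fake_store` are hypothetical stand-ins for the real S3 call, and the thread does not establish that paging alone would have avoided the failure above.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# One page of a listing: (keys in this page, whether more keys remain).
Page = Tuple[List[str], bool]


def iter_key_batches(
    fetch_page: Callable[[str, Optional[str]], Page], prefix: str
) -> Iterator[List[str]]:
    """Yield keys one page at a time, resuming after the last key seen."""
    marker: Optional[str] = None
    while True:
        keys, truncated = fetch_page(prefix, marker)
        if keys:
            yield keys
        if not truncated or not keys:
            return
        # Resume the next request after the last key of this page.
        marker = keys[-1]


def make_fake_store(all_keys: List[str], page_size: int):
    """Hypothetical in-memory stand-in for a paged listing service."""
    ordered = sorted(all_keys)

    def fetch_page(prefix: str, marker: Optional[str]) -> Page:
        matching = [
            k for k in ordered
            if k.startswith(prefix) and (marker is None or k > marker)
        ]
        return matching[:page_size], len(matching) > page_size

    return fetch_page


if __name__ == "__main__":
    store = make_fake_store([f"photos/{i:04d}.jpg" for i in range(25)], 10)
    for batch in iter_key_batches(store, "photos/"):
        # A real tool would copy this batch here before fetching the next.
        print(len(batch), batch[0], batch[-1])
```

Because each batch is handed to the caller as soon as it arrives, memory use depends on the page size rather than on the total number of objects.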

Do you think performing multiple commands, each targeting a directory, will work for you?

Thanks

Ken

nightwallaby commented 8 years ago

Thanks Ken.

I've been looking at this since posting the issue. The problem I'm now facing is that if I do something like:

azurecopy.exe -i https://s3.amazonaws.com/media/photos/2016 -o https://media.blob.core.windows.net/photos/2016

..I get the following stack trace:

GetHandler start
GetHandler retrieved azurecopy.S3Handler
GetHandler start
GetHandler retrieved azurecopy.AzureHandler
Unknown error generated. Please report to Github page https://github.com/kpfaulkner/azurecopy/issues . Can view underlying stacktrace by adding -db flag.
   at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[T](RESTCommand`1 cmd, IRetryPolicy policy, OperationContext operationContext)
   at Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob.StartCopyFromBlob(Uri source, AccessCondition sourceAccessCondition, AccessCondition destAccessCondition, BlobRequestOptions options, OperationContext operationContext)
   at azurecopy.AzureBlobCopyHandler.StartCopy(BasicBlobContainer origBlob, String DestinationUrl, DestinationBlobType destBlobType)
   at azurecopycommand.Program.DoNormalCopy()
   at azurecopycommand.Program.Process(Boolean debugMode)
   at azurecopycommand.Program.Main(String[] args)

I think this is failing because the virtual directory destination '2016' doesn't exist, but am continuing to test.

Cheers, Simon

kpfaulkner commented 8 years ago

Hi

Try: azurecopy.exe -i https://s3.amazonaws.com/media/photos/2016/ -o https://media.blob.core.windows.net/photos/2016/

The trailing / tells it that the URL should be considered a directory/container etc. and should be traversed. Otherwise it thinks it's a blob.
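The rule described above can be sketched in a few lines. This is an illustration of the convention only, not azurecopy's actual implementation; the function names `is_container_url` and `as_container_url` are hypothetical.

```python
def is_container_url(url: str) -> bool:
    """Treat a URL ending in "/" as a directory/container, anything else as a blob."""
    return url.endswith("/")


def as_container_url(url: str) -> str:
    """Append the trailing "/" if it is missing, so the URL reads as a directory."""
    return url if url.endswith("/") else url + "/"


if __name__ == "__main__":
    for url in (
        "https://s3.amazonaws.com/media/photos/2016",
        "https://s3.amazonaws.com/media/photos/2016/",
    ):
        kind = "directory" if is_container_url(url) else "blob"
        print(f"{url} -> {kind}")
```

Normalizing both the `-i` and `-o` URLs this way before invoking the tool is one way to avoid accidentally addressing a directory as a single blob.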

Does that work for you?

Cheers

Ken

nightwallaby commented 8 years ago

Ah forgot about that S3 trailing slash, thanks.

It looks like the Azure blobcopy needs at least that first virtual directory to already exist, otherwise it fails. It seems to work fine without specifying the blobcopy option.

Cheers, Simon

kpfaulkner commented 8 years ago

Hi

Ahhh. So it's literally just that first vdir that it's having an issue with? I think I'll need to investigate virtual directories and the blobcopy flag a bit more, to see what is possible and what's problematic.

Is this a blocker for you?

Ken

kpfaulkner commented 8 years ago

Hi

I have a partial fix for this. It's not an issue with blobcopy but merely how I use it for this particular scenario. Should be able to get you a patch later tonight.

Cheers

Ken

nightwallaby commented 8 years ago

Great, thanks Ken.

kpfaulkner commented 8 years ago

Hi

I've made the 1.1.6 pre-release. I've tested a few cases locally and they all work fine for me.

In this version I can do commands such as:

azurecopy -i https://xxxx.s3.amazonaws.com/newdir/dir2/ -o https://ddddd.blob.core.windows.net/temp/dir4/ -blobcopy

In this case "dir4" on Azure doesn't exist yet. It will copy all of the content in the dir2 vdir on S3 to the dir4 vdir on Azure.
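As a rough illustration of why the destination vdir doesn't have to exist beforehand: in Azure Blob Storage (without a hierarchical namespace) a virtual directory is only a "/"-delimited prefix in the blob name, so "creating" dir4 amounts to writing blobs whose names start with dir4/. The sketch below is not azurecopy's actual code. The function name `map_blob_name`, the example keys, and the prefixes "newdir/dir2/" and "dir4/" are assumptions made up for illustration, loosely based on the command above.

```python
def map_blob_name(source_key: str, source_prefix: str, dest_prefix: str) -> str:
    """Re-root a source key under the destination prefix (hypothetical helper).

    Only the part of the key below source_prefix is kept, so any nested
    virtual directories are carried across unchanged.
    """
    if not source_key.startswith(source_prefix):
        raise ValueError(f"{source_key!r} is not under {source_prefix!r}")
    return dest_prefix + source_key[len(source_prefix):]


# Hypothetical S3 keys under the dir2 vdir; real keys would come from a listing.
example_keys = ["newdir/dir2/a.jpg", "newdir/dir2/sub/b.jpg"]

for key in example_keys:
    # The resulting name is what the destination blob would be called in the
    # target container; no separate "create directory" step is involved.
    print(key, "->", map_blob_name(key, "newdir/dir2/", "dir4/"))
```

The trailing "/" on both prefixes matters here: without it, a key such as "newdir/dir20/x.jpg" would also match the source prefix.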

Hope that helps.

Ken

kpfaulkner commented 8 years ago

Any luck?

nightwallaby commented 8 years ago

Sorry for the delay. Yes, that has worked, thanks.

kpfaulkner commented 8 years ago

Hi Simon,

Cool, glad to hear it!

Ken
