aws / aws-sdk-go

AWS SDK for the Go programming language.
http://aws.amazon.com/sdk-for-go/
Apache License 2.0

Panic in upload.go #1023

Closed · mmacdermaid closed this issue 7 years ago

mmacdermaid commented 7 years ago

https://github.com/aws/aws-sdk-go/blob/master/service/s3/s3manager/upload.go#L514

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x6b8cf7]
goroutine 1976971578 [running]:
panic(0x875ae0, 0xc4200100f0)
    /opt/golang/go1.7/src/runtime/panic.go:500 +0x1a1
github.com/aws/aws-sdk-go/service/s3/s3manager.(*multiuploader).upload(0xcbe5dfdaa0, 0xb38fa0, 0xc73728d0e0, 0x500000, 0x0, 0x0)
    /go/src/github.com/aws/aws-sdk-go/service/s3/s3manager/upload.go:514 +0x1d7
github.com/aws/aws-sdk-go/service/s3/s3manager.(*uploader).upload(0xc5939d3590, 0xc5939d3590, 0x0, 0xc420042c00)
    /go/src/github.com/aws/aws-sdk-go/service/s3/s3manager/upload.go:368 +0x528
github.com/aws/aws-sdk-go/service/s3/s3manager.Uploader.Upload(0x500000, 0x5, 0x0, 0x2710, 0xb4a5c0, 0xc478de6b68, 0xc64672d520, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/aws/aws-sdk-go/service/s3/s3manager/upload.go:336 +0x129
xibz commented 7 years ago

Hello @mmacdermaid, thank you for reaching out to us. It looks like something funky is going on on the service side. Can you enable logging to see exactly what is being sent back?

uploader.S3 = s3.New(session.New((&aws.Config{
    // Config params
}).WithLogLevel(aws.LogDebugWithHTTPBody | aws.LogDebug | aws.LogDebugWithSigning)))
mmacdermaid commented 7 years ago

Is there a better way to investigate this? It is not a 100% occurrence; it seems to happen after server uptime hits around 3-4 days. With the amount of data running through a single box, those log flags produce over 50 MB/s of logs. They also appear to contain auth/security keys, which I would prefer not to post in a GitHub ticket.

xibz commented 7 years ago

@mmacdermaid, I agree that you definitely don't want to post that sensitive information here. However, all we need is the part that pertains to the panic. I would remove any sensitive information from the logs and then post that portion here.

What I need minimally is:

With that I think we can at least diagnose what is going on.

mmacdermaid commented 7 years ago

I will see what I can do to get you the data, but it might take a few days given that the panic seems sporadic. I won't be able to provide the POST body, since I don't believe I can share it under our company data restrictions. I can tell you it's an LZOP archive, roughly 25 MB in size.

xibz commented 7 years ago

Okay @mmacdermaid, can you also check whether the body contains the correct data, e.g. whether it is missing any fields? That panic is the result of a field not being populated, so my guess is that the body may be empty.
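
For context on that claim: the trace points at a plain nil dereference of an unpopulated response field. A rough illustration of the failure mode follows; the call and field here are stand-ins (a hypothetical *s3.S3 client svc and its input params), not the actual upload.go:514 source:

  // Illustrative only: if the service returns a 200 with an empty or
  // malformed body, pointer fields on the response struct can be nil.
  resp, err := svc.CreateMultipartUpload(params)
  if err != nil {
      return err
  }
  // Dereferencing an unpopulated field without a guard produces
  // exactly the SIGSEGV seen in the traces in this thread.
  uploadID := *resp.UploadId // panics if resp.UploadId is nil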

mmacdermaid commented 7 years ago

That panic is a result of a field not being populated.

// Relevant fields on our wrapper type, and the upload call:
uploader *s3manager.Uploader
body     io.Reader

    ui := &s3manager.UploadInput{
        Bucket: &bucketName,
        Key:    &key,
        Body:   body,
    }

    _, err := u.uploader.Upload(ui)

You mean like Body: missing?

I have no reason to believe body is empty, nil, or zero-length. We have alerts and logs around the callers of this section to make sure that doesn't happen, and if it does we know about it. (I have no logs of that kind during the panic.)

The body is non-specific zipped data; I wouldn't be able to confirm it's "correct" beyond it being valid JSON, and that wouldn't be an easy task given the log size.

xibz commented 7 years ago

Sorry, I was talking about the response body from the service being missing.

We need something to narrow down the possibilities. You have the output from the panic; could you not just look at the last thing that was logged? If that isn't an option, perhaps use a custom logger that logs to a buffer and clears the buffer when a new request starts, via the Build handler. If a panic occurs, write the buffer to a file. Does that sound feasible?
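
Something along these lines might work. This is only a rough sketch: the bufferLogger type, its Reset/Dump helpers, and the output path are hypothetical; only aws.Logger, Config.WithLogger/WithLogLevel, and the Build handler list come from the SDK. It assumes the usual imports ("bytes", "fmt", "io/ioutil", "sync", plus the aws, request, s3, and session packages).

  // bufferLogger implements aws.Logger, capturing debug output in memory.
  type bufferLogger struct {
      mu  sync.Mutex
      buf bytes.Buffer
  }

  func (l *bufferLogger) Log(args ...interface{}) {
      l.mu.Lock()
      defer l.mu.Unlock()
      fmt.Fprintln(&l.buf, args...)
  }

  // Reset clears the buffer; called at the start of each request.
  func (l *bufferLogger) Reset() {
      l.mu.Lock()
      defer l.mu.Unlock()
      l.buf.Reset()
  }

  // Dump writes whatever was captured for the current request to disk.
  func (l *bufferLogger) Dump(path string) error {
      l.mu.Lock()
      defer l.mu.Unlock()
      return ioutil.WriteFile(path, l.buf.Bytes(), 0600)
  }

  logger := &bufferLogger{}
  svc := s3.New(session.New(aws.NewConfig().
      WithLogger(logger).
      WithLogLevel(aws.LogDebugWithHTTPBody)))

  // Clear the buffer whenever a new request begins.
  svc.Handlers.Build.PushFront(func(r *request.Request) {
      logger.Reset()
  })

  // Around the upload call: if a panic occurs, persist the buffer
  // for the last request before re-panicking.
  defer func() {
      if p := recover(); p != nil {
          logger.Dump("/tmp/aws-panic.log") // hypothetical path
          panic(p)
      }
  }()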

Please let me know.

mmacdermaid commented 7 years ago

Is there a way I could log only the body of the returned response, and not the outgoing request body? (LogDebugWithHTTPBody looked like it was logging both.)

xibz commented 7 years ago

@mmacdermaid, instead of enabling logging, you can use the handlers.

  // Attach this to the S3 client the uploader uses. Requires the
  // "fmt", "net/http/httputil", and SDK "request" packages.
  svc.Handlers.Send.PushBack(func(r *request.Request) {
      if r.HTTPResponse == nil {
          return // no response to dump (e.g. connection error)
      }
      dumpedBody, err := httputil.DumpResponse(r.HTTPResponse, true)
      if err != nil {
          fmt.Println("Error:", err)
          return
      }
      fmt.Println(string(dumpedBody))
  })
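
If the upload goes through s3manager, a minimal way to wire this up is to attach the handler to the client before building the Uploader, so every part request of the multipart upload gets dumped. (NewUploaderWithClient is the SDK constructor; dumpResponse is a hypothetical named version of the func above.)

  svc := s3.New(session.New())
  svc.Handlers.Send.PushBack(dumpResponse)
  uploader := s3manager.NewUploaderWithClient(svc)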

Let me know if that helps!

mmacdermaid commented 7 years ago

I will try that when I get a chance.

To give you an update, @xibz: I won't get to this for a little while. Since the panic exists and is affecting current customers (and future deploys), I need to write some code that uses the standard HTTP POST to AWS and stop using the Go uploader.

After I get that code deployed I will stick a test version of the SDK back on a single box and help you debug this panic.

Thanks for your help. Talk to you soon.

xibz commented 7 years ago

Hello @mmacdermaid, I am going to go ahead and close this. If you are still having issues, please feel free to reopen this. Cheers!

disq commented 7 years ago

We also encountered this today. Latest AWS SDK (1.7.6), Go 1.7, linux/amd64.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4a8ba7]

goroutine 203 [running]:
panic(0x83b700, 0xc4200100a0)
    /usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/peakgames/s5cmd/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager.(*multiuploader).upload(0xc4206158c0, 0xafa060, 0xc4205df470, 0x500000, 0x0, 0x0)
    /home/user/workspace/src/github.com/peakgames/s5cmd/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager/upload.go:514 +0x1d7
github.com/peakgames/s5cmd/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager.(*uploader).upload(0xc4202b74a0, 0xc4202b74a0, 0xc420336000, 0x0)
    /home/user/workspace/src/github.com/peakgames/s5cmd/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager/upload.go:368 +0x528
github.com/peakgames/s5cmd/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager.Uploader.Upload(0x500000, 0x5, 0x0, 0x2710, 0xb09d80, 0xc4203ea028, 0xc42019a1a0, 0xc420c3ff50, 0x1, 0x1, ...)
    /home/user/workspace/src/github.com/peakgames/s5cmd/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager/upload.go:336 +0x129
github.com/peakgames/s5cmd/core.(*Job).Run.func11(0xc4202205a0, 0xc42038a000, 0xc4207740c0, 0xc420615860, 0x500000)
    /home/user/workspace/src/github.com/peakgames/s5cmd/core/job.go:707 +0x2ce
created by github.com/peakgames/s5cmd/core.(*Job).Run
    /home/user/workspace/src/github.com/peakgames/s5cmd/core/job.go:711

This is how it was called: https://github.com/peakgames/s5cmd/blob/40b0fbe6115b5cfce73682870b0570e69784d473/core/job.go#L700-L707

xibz commented 7 years ago

Hello @disq, thank you for reaching out to us. Is this pretty consistent? Enabling logging may give us a better idea of what is going on.

uploader.S3 = s3.New(session.New((&aws.Config{
    // Config params
}).WithLogLevel(aws.LogDebugWithHTTPBody | aws.LogDebug | aws.LogDebugWithSigning)))
disq commented 7 years ago

It's not consistent, and we were not able to reproduce it. I'll try enabling logging for now.