Open buchanae opened 6 years ago
Thanks for pointing me here from #579 -- It seems my tables are created, but I am still getting an error in some code that has to do with creating tables. Pardon the formatting, this is copied from the CloudWatch error log:
19:51:03 panic: runtime error: invalid memory address or nil pointer dereference
19:51:03 [signal SIGSEGV: segmentation violation code=0x1 addr=0xb8 pc=0xce7e0c]
19:51:03 goroutine 1 [running]:
19:51:03 github.com/ohsu-comp-bio/funnel/server/dynamodb.(*DynamoDB).tableIsAlive(0xc42008e700, 0x1f37b00, 0xc420290660, 0xc420296090, 0xb, 0xc420288750, 0x16)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/server/dynamodb/util.go:202 +0x1ac
19:51:03 github.com/ohsu-comp-bio/funnel/server/dynamodb.(*DynamoDB).waitForTables(0xc42008e700, 0x0, 0x0)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/server/dynamodb/util.go:213 +0xc4
19:51:03 github.com/ohsu-comp-bio/funnel/server/dynamodb.(*DynamoDB).createTables(0xc42008e700, 0x7ffca21dde43, 0x6)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/server/dynamodb/util.go:189 +0x1c5c
19:51:03 github.com/ohsu-comp-bio/funnel/server/dynamodb.NewDynamoDB(0x7ffca21dde43, 0x6, 0x0, 0x0, 0x7ffca21dde20, 0x9, 0x0, 0x0, 0x0, 0x0, ...)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/server/dynamodb/new.go:43 +0x3c3
19:51:03 github.com/ohsu-comp-bio/funnel/cmd/worker.NewWorker(0x1f37a80, 0xc420050480, 0xc42004e460, 0x1, 0x1, 0x7ffca21dde05, 0x8, 0x14e97cc, 0x5, 0x14ea0c7, ...)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/cmd/worker/run.go:53 +0xbf4
19:51:03 github.com/ohsu-comp-bio/funnel/cmd/worker.Run(0x1f37a80, 0xc420050480, 0xc42004e460, 0x1, 0x1, 0x7ffca21dde05, 0x8, 0x14e97cc, 0x5, 0x14ea0c7, ...)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/cmd/worker/run.go:22 +0x82
19:51:03 github.com/ohsu-comp-bio/funnel/cmd/worker.newCommandHooks.func2(0xc4203db8c0, 0xc42027e1e0, 0x0, 0xa, 0x0, 0x0)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/cmd/worker/worker.go:71 +0x2ca
19:51:03 github.com/ohsu-comp-bio/funnel/vendor/github.com/spf13/cobra.(*Command).execute(0xc4203db8c0, 0xc42027e640, 0xa, 0xa, 0xc4203db8c0, 0xc42027e640)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/vendor/github.com/spf13/cobra/command.go:746 +0x475
19:51:03 github.com/ohsu-comp-bio/funnel/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x1fd3460, 0x1fd4d20, 0xc4204fd200, 0x1fd4420)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/vendor/github.com/spf13/cobra/command.go:831 +0x30e
19:51:03 github.com/ohsu-comp-bio/funnel/vendor/github.com/spf13/cobra.(*Command).Execute(0x1fd3460, 0xc4204c9f70, 0x1071f3e)
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/vendor/github.com/spf13/cobra/command.go:784 +0x2b
19:51:03 main.main()
19:51:03 /Users/strucka/go/src/github.com/ohsu-comp-bio/funnel/main.go:11 +0x2d
I think this is a bug caused by an ignored error check: https://github.com/ohsu-comp-bio/funnel/blob/master/database/dynamodb/util.go#L202
It should be fixed in #580.
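For anyone who hits a similar trace, here is a minimal Go sketch of this class of bug. It is not Funnel's actual code: `tableDescription`, `describeTable`, `isAliveUnchecked`, and `isAlive` are hypothetical names made up for illustration. The assumption is that a lookup returns a nil result together with an error, and the caller discards the error before reading a field on the result.

```go
package main

import (
	"errors"
	"fmt"
)

// tableDescription is a hypothetical stand-in for a database client's
// "describe table" response.
type tableDescription struct {
	Status string
}

// describeTable is a hypothetical lookup. On failure it returns a nil
// description together with an error.
func describeTable(name string) (*tableDescription, error) {
	if name == "" {
		return nil, errors.New("describe table: empty table name")
	}
	return &tableDescription{Status: "ACTIVE"}, nil
}

// isAliveUnchecked discards the error. When the lookup fails, desc is nil
// and reading desc.Status panics with a nil pointer dereference.
func isAliveUnchecked(name string) bool {
	desc, _ := describeTable(name)
	return desc.Status == "ACTIVE"
}

// isAlive checks the error first and returns it instead of panicking.
func isAlive(name string) (bool, error) {
	desc, err := describeTable(name)
	if err != nil {
		return false, fmt.Errorf("checking table %q: %w", name, err)
	}
	return desc.Status == "ACTIVE", nil
}

func main() {
	// The checked version surfaces the underlying problem as an error.
	alive, err := isAlive("")
	fmt.Println(alive, err)

	alive, err = isAlive("task")
	fmt.Println(alive, err)
}
```

With the checked version, the process reports the real cause (for example, a missing table or bad credentials) rather than crashing with SIGSEGV.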
OK, has that been merged in, and then do I need to check out the latest and build from source?
Thanks!
Yeah that will be quickest. Instructions to build from source can be found here: https://ohsu-comp-bio.github.io/funnel/download/
Still getting the error after I re-made the compute environment, job queue and job-def, with latest code. Is there something I can do to check my DynamoDB tables, if they are incorrectly set up??
Also, I had to modify the job definition because there was an "unknown flag EventWriter" error until I edited out the last part of the command that referenced "--EventWriter".
Thanks for pointing out that typo.
The compute environment and job queue should not be impacted by which version of funnel you are running. For your job definition you should just need to update your image reference (if anything).
You are still getting the same panic error as before?
To debug dynamo I'd suggest the following:
Set up a local environment for your funnel server configured to use "local" compute, dynamodb as its database and try running a simple hello world task (funnel run --sh 'echo hello'). The funnel server should be the process that creates the dynamodb tables so I don't necessarily think #579 is a bug.
I don't know what typo you are referring to, since I had to totally remove the flag and the value. I removed this text from the end of the job definition to get it working: " --EventWriter dynamodb --EventWriter log"
OK, that command (funnel run --sh 'echo hello') runs fine locally, with DynamoDB backend. I also see rows for my failed runs in DynamoDB tables, but only the successful (locally run) task in the "stdout" table.
Thanks so much!!
The typo I was referring to was in our code. The generated command should have had --EventWriters rather than --EventWriter.
If you're interested in using AWS Batch as a backend you may want to consider creating a custom AMI, since the default one provided by Amazon only has a few GB of storage available. I suggest either creating an AMI with a large fixed disk attached or one that enables dynamic mounting of EBS volumes to VMs. I played around with the latter approach in https://github.com/adamstruck/ebsmount/tree/master/resources/funnel.
Let us know if you have any other issues! We can also be reached on gitter (https://gitter.im/ohsu-comp-bio/funnel).
Deployments of Funnel sometimes need to go through a few iterations of testing, including:
1) Curl (or funnel task list) some endpoint, get a 500
2) Check the funnel logs, or decipher the http response body
3) possibly repeat a few times
It would be nice to make this simple and documented, so that someone deploying funnel has a surefire approach to deployment. I think there are a couple of improvements:
1) Better docs. A more complete guide to deployment. Our docs are sort of spread out. A more in-depth guide per environment (e.g. AWS + ELB + Dynamo + Batch) could be useful
2) A health check endpoint, which can be requested with curl. We can run a few checks and return user friendly logs and hints. Once these checks pass, the user can be confident their server is working.
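To make the health check idea concrete, here is a rough Go sketch of what such an endpoint could look like. Everything in it is an assumption for discussion, not existing Funnel code: the `/health` path, the `check`, `result`, and `report` types, the `passing` and `failing` placeholder probes, and the choice to return 503 when any check fails. Real checks would probe the database, the compute backend, and so on.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// check is one named probe. Hint is returned to the user only on failure.
type check struct {
	Name string
	Hint string
	Run  func() error
}

// result is the outcome of a single check.
type result struct {
	Name  string `json:"name"`
	OK    bool   `json:"ok"`
	Error string `json:"error,omitempty"`
	Hint  string `json:"hint,omitempty"`
}

// report is the full response body; OK is true only if every check passed.
type report struct {
	OK      bool     `json:"ok"`
	Results []result `json:"results"`
}

// runChecks runs every check in order and collects the outcomes.
func runChecks(checks []check) report {
	rep := report{OK: true}
	for _, c := range checks {
		res := result{Name: c.Name, OK: true}
		if err := c.Run(); err != nil {
			res.OK = false
			res.Error = err.Error()
			res.Hint = c.Hint
			rep.OK = false
		}
		rep.Results = append(rep.Results, res)
	}
	return rep
}

// healthHandler serves the report as JSON: 200 if all checks pass, 503 otherwise.
func healthHandler(checks []check) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		rep := runChecks(checks)
		body, err := json.Marshal(rep)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		if !rep.OK {
			w.WriteHeader(http.StatusServiceUnavailable)
		}
		// A write error here means the client went away; nothing more to do.
		_, _ = w.Write(body)
	}
}

// passing and failing are placeholder probes standing in for real checks.
func passing() error { return nil }

func failing(msg string) func() error {
	return func() error { return errors.New(msg) }
}

func main() {
	checks := []check{
		{Name: "database", Hint: "verify the tables exist and credentials are set", Run: passing},
		{Name: "compute", Hint: "verify the job queue name in the config", Run: failing("job queue not found")},
	}

	// Exercise the handler in-process instead of starting a real server.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	healthHandler(checks)(rec, req)
	fmt.Println(rec.Code)
	fmt.Println(rec.Body.String())
}
```

In a real server the handler would be registered on the existing HTTP mux, so a deployer could curl the endpoint and read the per-check errors and hints directly instead of digging through logs.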