benchflow / collectors

Collectors microservices utilised to collect relevant data from containers
Other
1 stars 1 forks source link

Solve stats "Done" channel error #23

Open Cerfoglg opened 8 years ago

Cerfoglg commented 8 years ago

In the stats collector, we are using this golang API for connecting to Docker and retrieving the stats: https://github.com/fsouza/go-dockerclient

In the code, we use this function to retrieve the stats: https://godoc.org/github.com/fsouza/go-dockerclient#Client.Stats

The function is blocking, meaning that once started the routine can't be exited unless the function is stopped. It's possible to stop the function by signalling on the Done channel, which can be passed to the function inside the StatsOpts structure. Theoretically, sending a boolean to the channel should interrupt the function, however when attempting to do so an error is returned that states the channel is closed. It's possible this is an issue with the API itself.

What needs to be done:

Related issues: https://github.com/benchflow/collectors/issues/1, https://github.com/benchflow/collectors/issues/4, https://github.com/benchflow/monitors/issues/1,

Cerfoglg commented 8 years ago

It's very likely that the API is the issue here. I've tested it in the simplest way possible, and it still gives me the error. I reported this on the github of the API: https://github.com/fsouza/go-dockerclient/issues/423

VincenzoFerme commented 8 years ago

@Cerfoglg since they are not replying lets proceed with a different approach to stop the execution of the blocking go routine. We then update this issue and keep it open waiting for the response of for future investigations by our side. Maybe here we found some hints: http://blog.labix.org/2011/10/09/death-of-goroutines-under-control

Cerfoglg commented 8 years ago

@VincenzoFerme It appears that the done channel does work in stopping the function, but the Stats function returns an error when stopped instead of quitting gracefully. It's easy to work around that, so the problem is solved now, although it would be nice to not have the function give that error when exiting. I added a comment on the API's issue I opened.

VincenzoFerme commented 8 years ago

@Cerfoglg ok, lets wait for an answer and see if it can be improved. It seems enough for now, as solved in https://github.com/benchflow/monitors/pull/10