Make startup problems more debugging friendly

robertbindar commented 3 years ago

We should try to print better errors in the logs when either the MariaDB client or the server fails to start. There are corner cases when a user passes some configuration options in MariaDB .my.cnf file and the server for instance just fails to start. It's really hard to debug such scenarios with the current error logs, we should aim to include the output from the server/client as well.

JohhnyKinuthia commented 3 years ago

Hi @robertbindar, I would like to solve and complete this issue.

robertbindar commented 3 years ago

Sure @JohhnyKinuthia, thanks for being interested in this task! Let me know if you have questions or you need guidance through the codebase.

jonakarl commented 3 years ago

Hi @robertbindar and @JohhnyKinuthia :

In our development for a (for now in house) solution for #19 we took a look at the code for starting the test server and we noticed some possible improvements.

The code polls for a specific string and if found it sets a internal variable to indicate that the server has started.
Similar when the server is stopped.

I think a better approach would be the check if the process is still alive, eg:

if self.server.poll() is None:
   # server is alive
else
  # server is dead

This will also take care if someone outside the (eg by the terminal) kill the mysld process or that dies for other reasons. There are still some outstanding questions, like if multiple notebooks is started, only the last kills the process, is there a counter or similar for this? We will try to submit a short PR during the day for this specific task .

robertbindar commented 3 years ago

Hey @jonakarl! I will review your PR today, but I will first answer the question here as it make more sense.

Polling whether the process is still alive is a great idea, much cleaner than mine, so we should definitely go with this. Polling for the string was used to check if the MariaDB server is in a clean enough state to accept client connections when starting, and for stopping it doesn't make sense (I guess it was easier to just call the _wait_server function again, I don't remember a good reason for why I did that). Unfortunately polling for the string is still needed even though we poll for process still alive with .poll, but I don't like it, I will try to find a smarter way to see if the server is in a READY state. But this won't affect your PR, this is some followup work I will do. .poll() for shutting down the server is for sure the way to go.

About multiple notebooks. Because a new kernel instance is created for each notebook by jupyter, the server_up variable is True only for the kernel that starts the MariaDB server, so that is the only kernel instance responsible for killing the server. This was my thinking when I wrote that part of code. But you are right! A corner case could be: a student opens 2 notebooks. The first one starts the server, the second one connects to the already spinning server. But the student closes the first notebook, which kills the server. Thus the second notebooks remains hanging. This is a great find, thanks!

We most probably don't have a way to share state between kernel instances from multiple notebooks, but an idea would be to store such state within the server itself. I will need to think a bit more about this.

robertbindar commented 3 years ago

Most of this issue was solved by @jonakarl already, I'll add some followup things shortly and afterwards it's fine to close this.

robertbindar commented 3 years ago

Added "debug" option in ClientConfig to allow the user to change the logging level to DEBUG when in trouble (0dddfe5cd8c3c9ad80687fed05401cc5313e6d86). With this included, the issue can be safely closed.

MariaDB / mariadb_kernel

Make startup problems more debugging friendly #17