wandb / server

W&B Server is the self-hosted version of Weights & Biases.
MIT License

Failed to call FetchSessionByID(1) #124

Closed wongxinjie closed 1 year ago

wongxinjie commented 1 year ago

I encountered the following error when deploying the local Wandb server using Docker (tag: 0.36.1), which appears to be a bug in the program.

"program":"gorilla","source":"mnt/ramdisk/core/services/gorilla/logging_gen.go:3900","pid":1061},"data":{"id":1},"message":"Failed to call FetchSessionByID(1): Error 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rror in your SQL syntax; check the manual that corresponds to your MySQL server ' at line 1",

wongxinjie commented 1 year ago

The issue occurs when inviting users via email and they accept the invitation.

umakrishnaswamy commented 1 year ago

@wongxinjie Sorry to hear you're encountering this. Could you send over the debug bundle associated with this instance? You can get it by going to your System Settings page > top right corner > debug bundle.

Additionally, does this occur with upgrades of the instance? 0.36.1 is an older server release, so I am wondering if this behavior is occurring for you with any of the newer ones as well.

sydholl commented 1 year ago

Uma Krishnaswamy commented: Hi @wongxinjie,

We wanted to follow up with you regarding your support request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

sydholl commented 1 year ago

Uma Krishnaswamy commented: Hi @wongxinjie, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!

wongxinjie commented 1 year ago

I have upgraded the image to version 0.41.0, but the issue persists. I deployed on k8s with an external MySQL instance, version 5.7.29. To reproduce: add a new user from the admin account, then enter the Docker container and run "/usr/local/bin/local password" to set a password for the new user. After logging in successfully with the new user account, log out and log in again with the admin account; the bug then occurs. Alternatively, using the new user's API_KEY to report metrics also triggers this error. I suspect it is related to the account's session.

wongxinjie commented 1 year ago

Is it because the local version of wandb cannot use an external MySQL instance?

umakrishnaswamy commented 1 year ago

Hey @wongxinjie, a few questions to help me dig into this:

umakrishnaswamy commented 1 year ago

@wongxinjie We do support running your own MySQL database. Documentation can be found here

wongxinjie commented 1 year ago

This bug has also occurred during the normal registration process. When we send an invitation email to a user, they can create an account and log in normally, but an error occurs when they enter the page. In some cases, we do not want to send an email to the user, but instead want to set the password directly for them. For example, we add the user "debug@it.com" to the dashboard, and then use "/usr/local/bin/local password debug@it.com" to set the password for "debug@it.com" as "it1234567". Then, when we log in with the "debug@it.com" account in the same browser, the issue can be reproduced.

wongxinjie commented 1 year ago

There are two issues. One is that the UI crashes after adding a new user, but it recovers after about 30 minutes. The other is a new one that we encountered today: after configuring an external Redis instance, reporting metrics results in errors like the following:

{"error":"Error 1105 (HY000): argument mismatch, need 1 but got 0"}
500 response executing GraphQL.
{"error":"Error 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '%d _in_at,\n\t\t\t\t\tuser_info,\n\t\t\t\t\thide_teams_from_public,\n\t\t\t\t\tonboarding_steps,\n\t' at line 1"}
500 response executing GraphQL.
{"error":"Error 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '%d _in_at,\n\t\t\t\t\tuser_info,\n\t\t\t\t\thide_teams_from_public,\n\t\t\t\t\tonboarding_steps,\n\t' at line 1"}
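When triaging output like this, it can help to pull the MySQL error code and SQLSTATE out of each JSON line so that distinct failures (1105 versus 1064 here) are easy to tell apart. The following is a minimal sketch, assuming the responses keep the `{"error": "Error <code> (<SQLSTATE>): <message>"}` shape shown above; the function name `parse_mysql_error` is hypothetical and is not part of W&B Server or its client.

```python
import json
import re

# Matches the "Error <code> (<SQLSTATE>): <message>" prefix seen in the responses above.
MYSQL_ERROR = re.compile(r"Error (\d+) \(([0-9A-Z]{5})\): (.*)", re.DOTALL)


def parse_mysql_error(line):
    """Return (code, sqlstate, message) for a JSON error line, or None otherwise.

    Hypothetical helper for illustration only.
    """
    try:
        payload = json.loads(line)
    except json.JSONDecodeError:
        return None  # e.g. the "500 response executing GraphQL." lines
    if not isinstance(payload, dict):
        return None
    match = MYSQL_ERROR.match(str(payload.get("error", "")))
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)


sample = '{"error":"Error 1105 (HY000): argument mismatch, need 1 but got 0"}'
print(parse_mysql_error(sample))
```

Grouping log lines by the returned code makes it easier to see whether one underlying problem or several are being reported.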

wongxinjie commented 1 year ago

Issue resolved. It was due to a problem with the version of the MySQL database.
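For anyone hitting the same symptom, a quick pre-deployment check of the external database version may save some debugging. Below is a minimal sketch, assuming the version string comes from running `SELECT VERSION();` on the MySQL instance. The thread does not say which version resolved the problem, so `ASSUMED_MINIMUM` is a placeholder and both helper names are hypothetical; check the W&B Server documentation for the actual requirement of your release.

```python
import re

# Placeholder assumption: the thread never states which MySQL version fixed the problem.
# Replace this with the minimum listed in the W&B Server documentation for your release.
ASSUMED_MINIMUM = (8, 0, 0)


def parse_mysql_version(text):
    """Extract (major, minor, patch) from strings such as '5.7.29' or '8.0.28-log'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized MySQL version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(version_text, minimum=ASSUMED_MINIMUM):
    """Return True if the reported server version is at least the assumed minimum."""
    return parse_mysql_version(version_text) >= minimum


# 5.7.29 is the external MySQL version reported earlier in this thread.
print(meets_minimum("5.7.29"))
```

Comparing tuples rather than raw strings avoids mistakes such as treating "5.10.0" as older than "5.7.0".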