matiasdelellis / facerecognition

Nextcloud app that implements a basic facial recognition system.
GNU Affero General Public License v3.0
507 stars, 46 forks

SQLSTATE[HY000] MySQL server has gone away on "CreateClustersTask" #546

Open isaacolsen94 opened 2 years ago

isaacolsen94 commented 2 years ago

Not sure how to proceed in troubleshooting this issue. It happened twice for me and I verified my database is running. Everything else that relies on the database seems fine except this. Would you be able to help me diagnose this?

Expected behaviour

Background task should complete without errors

Actual behaviour

The task has failed multiple times at step 6/10, "CreateClustersTask".

Server configuration

matiasdelellis commented 2 years ago

Hi @isaacolsen94, it seems you have a lot of faces and some queries are slow. 😞 Try: https://stackoverflow.com/questions/7942154/mysql-error-2006-mysql-server-has-gone-away

isaacolsen94 commented 2 years ago

That did the trick! I also doubled the values of a few of the timeouts/cache settings and it was able to complete properly. Thank you so much for your help!!!

JojoDevel commented 2 years ago

@isaacolsen94 Do you remember which settings you changed?

JojoDevel commented 2 years ago

Adding this to my MariaDB config solved the issue:

innodb_log_file_size=512M

isaacolsen94 commented 2 years ago

I didn't remember which options I changed, but I will give this one a try! Thanks for sharing your fix!

I think I went through the recommendations in the linked Stack Overflow question and doubled each resource value it called out. That probably wasn't the best move, but it seemed to work. I haven't verified it recently, though.

ftrentini commented 2 years ago

This can be solved by adjusting the following variable in mariadb config:

wait_timeout=86400

With this setting, MySQL keeps an idle connection open for up to 24 h before closing it, so a long-running task does not lose its connection while it is busy computing. Increase at will (up to 31536000, i.e. one year). Just to illustrate, I'm testing my server right now, and it has taken 8 h so far to process 150K faces! 😆
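
For reference, here is a minimal sketch of how the two settings posted in this thread might look together in a MariaDB/MySQL option file. The `[mysqld]` section header and the choice to keep both in one file are assumptions about a typical setup, not something confirmed above; the values are simply the ones reported by the commenters. The server needs a restart to pick up changes made in the option file.

```ini
# Assumed placement: the server section of the MariaDB/MySQL option file
[mysqld]
# Seconds the server waits for activity on a non-interactive connection
# before closing it (value posted by ftrentini)
wait_timeout=86400
# Size of the InnoDB redo log (value posted by JojoDevel)
innodb_log_file_size=512M
```

Whether one or both settings are needed probably depends on the size of the face collection; the thread reports each of them fixing the error on its own.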

jojo221119 commented 1 week ago

I get the same error. I have already increased max_allowed_packet to 5G but still hit it. All timeouts are set much higher as well.

The clustering process dies with the following error after ~2h.

`In Exception.php line 28:

[Doctrine\DBAL\Driver\PDO\Exception (2006)]
SQLSTATE[HY000]: General error: 2006 MySQL server has gone away `

None of the timeouts is anywhere near 2 h; they are all much larger. Also, the process does not fail at exactly 2 h.

Any hints on how to troubleshoot further would be welcome.