This smells like a queue config issue. If ES is down, all that will happen is that it throws a (No alive nodes) exception, and queues should catch these.
Are you using Horizon and Redis? Do you have supervisor in place to start it up again if it goes down?
If you can point to a specific touch point in the package that is causing this as a side effect, I'm happy to continue here; otherwise I'll close this in a few days under the assumption that it falls outside the scope of this package.
I suggest asking the StackOverflow community for a hand using the queue issue as a starting point.
Also, consider upgrading to the 3.8.x version of the package - probably won't help you with this issue, but it's now the maintained version and far more advanced. Good luck!
Yes, we are using Horizon and Redis, and we have a supervisor in place to bring the ES service back up.
When ES is down we get the (No alive nodes) exception, which is the expected behavior. But after we bring the ES service back up, we are still getting the (No alive nodes) exception.
While searching for this issue I came across a topic about guzzlehttp and CURLOPT_FORBID_REUSE, and I think it's probably responsible: the Elasticsearch package seems to be reusing the connection that was established while the ES service was down.
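For reference, the setting in question can be passed to Guzzle like this (a minimal sketch, not the package's actual configuration):

```php
<?php

use GuzzleHttp\Client;

// Sketch only: forbid cURL connection reuse so a request never picks up
// a socket that went dead while Elasticsearch was down.
$client = new Client([
    'base_uri' => 'http://localhost:9200',
    'curl' => [
        CURLOPT_FORBID_REUSE  => true, // close the connection after each request
        CURLOPT_FRESH_CONNECT => true, // never reuse a cached connection
    ],
]);

$response = $client->get('/_cluster/health');
```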
Also, I'm trying to examine this issue on 3.8.x...
I have tested v3.8.x and got the same result; the issue is still valid.
example-laravel-elasticsearch (Laravel 11.x and laravel-elasticsearch 3.11.x)
I have created an example to show the problem. Please run this project using the docker-compose file (the elasticsearch and kibana services are included in it).
After running the app's migrations, you can hit the /
route to test inserting into Elasticsearch.
We have a simple job that takes a name and inserts it into an index in Elasticsearch.
Then stop the Elasticsearch service, hit the /
route to create the job in the queue (the job will fail), and then start Elasticsearch again. You will still get the (no alive nodes available) exception.
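The job in the sample is essentially this (a condensed sketch; Person stands in for the sample's model, assumed to extend the package's Elasticsearch model):

```php
<?php

namespace App\Jobs;

use App\Models\Person;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class InsertToElastic implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(public string $name) {}

    public function handle(): void
    {
        // Person is an Elasticsearch-backed model provided by the package.
        Person::create(['name' => $this->name]);
    }
}
```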
Can you try to simulate this using the ES PHP client directly?
https://github.com/elastic/elasticsearch-php
Just make a job that opens a connection and runs a query manually, then fail the container, etc.
Let me know what you find
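Something along these lines, assuming the official client's v8 API, inside the job's handle():

```php
<?php

use Elastic\Elasticsearch\ClientBuilder;

// Build the client manually (bypassing the package) and run one write.
$client = ClientBuilder::create()
    ->setHosts(['http://localhost:9200'])
    ->build();

$client->index([
    'index' => 'people',
    'body'  => ['name' => 'direct-client-test'],
]);
```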
I have tested the elasticsearch-php package, and it was working fine after the ES service came back up.
Last thing I'd like you to try before I commit time to this: please try simulating the same with native MySQL, and if possible with MongoDB using https://github.com/mongodb/laravel-mongodb
Horizon caches most Laravel settings and doesn't clear them until you restart Horizon. Could that point to something?
I tried MySQL before (stopped the MySQL service and started it again) and it works correctly with queued jobs.
I think the problem is that the reconnection functionality in Illuminate\Database\Connection
is implemented for PDO, and we are not handling reconnection in the package.
Also, I have tried the laravel-mongodb package and it works perfectly in a queued job.
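For context, this is the PDO-side behavior being referred to: Laravel's Illuminate\Database\Connection retries a query once after reconnecting when the failure was caused by a lost connection. A condensed paraphrase of the framework source (not the exact code):

```php
protected function tryAgainIfCausedByLostConnection(QueryException $e, $query, $bindings, Closure $callback)
{
    // The DetectsLostConnections trait matches known "connection lost" error strings.
    if ($this->causedByLostConnection($e->getPrevious())) {
        $this->reconnect();

        return $this->runQueryCallback($query, $bindings, $callback);
    }

    throw $e;
}
```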
I have updated the sample-project
and implemented these three samples in the InsertToElastic job:
Hey @alirezaImani-f4L3e - tried and cannot recreate the issue you're facing. Retrying failed jobs works for me:
Different to your sample, I used predis though:
composer require predis/predis
and
REDIS_CLIENT=predis
CACHE_DRIVER=redis
QUEUE_CONNECTION=redis
SESSION_DRIVER=redis
Try that, maybe it's something, but either way, since we're both getting different results with the same code, this must be an environment issue. Can't be sure until I can see it. Sorry man.
Just a side note: if you're deferring these to a job queue because the writes are slow, keep in mind that you can save/create without refreshing if you won't be working with that record immediately after. Speed is near instant.
See: https://elasticsearch.pdphilip.com/saving-models#fast-saves
But seeing that your ES containers are going down, perhaps queuing is safer.
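Per the linked fast-saves docs, that looks something like this (RequestLog is a placeholder model name):

```php
$log = new RequestLog;
$log->name = 'example';

// Skips waiting for the index refresh; only safe when you don't need
// to read the document back immediately after writing it.
$log->saveWithoutRefresh();
```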
It's not working even with predis.
Which command have you used? (queue:listen or queue:work)
queue:listen works fine because it boots a fresh process for every job, so nothing stale is cached; the problem is with queue:work, which keeps the application (and its connections) in memory between jobs.
Can you share your example?
php artisan horizon
in local and prod. In prod I keep it running via supervisor settings.
If you don't mind, try starting the workers with queue:work.
So that seems to cache the connection and lock it in when it fails. I have no idea why it would do that.
If I rebuild the connection on every request, that solves this [very] specific issue. However, doing so slowed down my test run time by between 5-15%, which is significant. Given that, it's not viable to incur the performance cost to cover this edge case.
I'll leave this open anyway and look at building in a circuit breaker for failed calls. That should fix this issue without compromising performance, but will need a significant upgrade with how the bridge handles errors.
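To sketch the idea (illustrative only, not the implementation that will ship): keep the cached client on the happy path, but drop it after a failed call so the next call rebuilds the connection:

```php
<?php

use Elastic\Elasticsearch\Client;
use Elastic\Elasticsearch\ClientBuilder;
use Elastic\Transport\Exception\NoNodeAvailableException;

class Bridge
{
    private ?Client $client = null;

    public function run(callable $query)
    {
        try {
            return $query($this->client());
        } catch (NoNodeAvailableException $e) {
            // Trip the breaker: discard the cached client so the next
            // call builds a fresh connection instead of reusing a dead one.
            $this->client = null;

            throw $e;
        }
    }

    private function client(): Client
    {
        // Rebuilt only on first use or after a tripped breaker, so the
        // happy path keeps its performance.
        return $this->client ??= ClientBuilder::create()
            ->setHosts(['http://localhost:9200'])
            ->build();
    }
}
```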
In the meantime, I suggest running Horizon directly and configuring a supervisor: https://laravel.com/docs/11.x/horizon#supervisor-configuration - assuming it's in scope of your project, of course.
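From the linked docs, a typical supervisor program for Horizon looks like this (paths and user are examples):

```ini
[program:horizon]
process_name=%(program_name)s
command=php /home/forge/example.com/artisan horizon
autostart=true
autorestart=true
user=forge
redirect_stderr=true
stdout_logfile=/home/forge/example.com/horizon.log
stopwaitsecs=3600
```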
@alirezaImani-f4L3e - Am bundling this in with the next release. Can you update dev branch and try again?
It should work now. Let me know so that I know to close this on release notes. Thanks.
Yes, it's working fine now. Thank you for your attention.
Good job.
Will you apply this fix to 3.8.x in the next release?
Indeed 👍
Hi @alirezaImani-f4L3e - 3.8.1 has been released with a fix for this.
Package version
I'm using v2.8.7 of the package.
Describe the bug
I'm using this package to insert the HTTP request info of every request that hits the application into Elasticsearch, and for that I'm using the package in a queued job. In normal situations everything works fine, but when the Elasticsearch container goes down for a moment while users are using the application and then comes back up, we still get failed connections to Elasticsearch from the queued job. As a temporary solution we have to run
queue:restart
to fix this. I wonder if it's a bug in this package or in my application code.
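The setup is roughly this (class and field names here are illustrative, not the actual application code): a middleware dispatches a queued job that performs the Elasticsearch write.

```php
<?php

namespace App\Http\Middleware;

use App\Jobs\InsertRequestLog;
use Closure;
use Illuminate\Http\Request;
use Symfony\Component\HttpFoundation\Response;

class LogRequestToElastic
{
    public function handle(Request $request, Closure $next): Response
    {
        return $next($request);
    }

    // Runs after the response is sent; the queued job does the ES insert.
    public function terminate(Request $request, Response $response): void
    {
        InsertRequestLog::dispatch([
            'path'   => $request->path(),
            'method' => $request->method(),
            'status' => $response->getStatusCode(),
        ]);
    }
}
```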
To Reproduce
1. Create a queued job and, within that job, try to insert some data into Elasticsearch.
2. Stop the Elasticsearch container.
3. Dispatch the job; you will get an error in the queue when inserting into Elasticsearch.
4. Start the container again.
5. The jobs still fail, despite the Elasticsearch container working fine.
Expected behavior
We expect the application to continue inserting request logs into Elasticsearch once the Elasticsearch container is working correctly again.