Closed fredrike closed 1 year ago
It is "normal" to see connection errors when using the docker-compose file like that.
Due to the slow startup of the elasticsearch container, and the really fast startup of py-image-dedup, the latter tries to connect to the db before it is up and running. To work around this py-image-dedup has a retry machanism, so waiting a couple of seconds usually lets the processing continue as normal.
The real fix would be to have the elasticsearch instance running before launching the py-image-dedup container, or setup some kind of dependency between the two inside of the docker-compose.yml
, which I am not sure it is even possible.
Since you do get logs about the "analyze" phase, I guess the order of the log output is just mixed up?
The real fix would be to have the elasticsearch instance running before launching the py-image-dedup container, or setup some kind of dependency between the two inside of the
docker-compose.yml
, which I am not sure it is even possible.
This is exactly what my docker-compose addition does:
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:9200"]
interval: 30s
timeout: 30s
retries: 10
depends_on:
elasticsearch:
condition: service_healthy
My issue is that after the initial phase elasticsearch becomes unresponsive:
py-image-dedup_1 | Phase 3/6: Analyzing files
py-image-dedup_1 | INFO:py_image_dedup.library.progress_manager:Analyzing files in '/mnt/source': 0% (1/179325)
py-image-dedup_1 | WARNING:elasticsearch:POST http://elasticsearch:9200/images/_doc?refresh=false [status:N/A request:0.021s]
py-image-dedup_1 | Traceback (most recent call last):
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
py-image-dedup_1 | conn = connection.create_connection(
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
py-image-dedup_1 | raise err
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
py-image-dedup_1 | sock.connect(sa)
py-image-dedup_1 | ConnectionRefusedError: [Errno 111] Connection refused
py-image-dedup_1 |
py-image-dedup_1 | During handling of the above exception, another exception occurred:
py-image-dedup_1 |
py-image-dedup_1 | Traceback (most recent call last):
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/http_urllib3.py", line 232, in perform_request
py-image-dedup_1 | response = self.pool.urlopen(
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
py-image-dedup_1 | retries = retries.increment(
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/util/retry.py", line 525, in increment
py-image-dedup_1 | raise six.reraise(type(error), error, _stacktrace)
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/packages/six.py", line 770, in reraise
py-image-dedup_1 | raise value
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
py-image-dedup_1 | httplib_response = self._make_request(
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 398, in _make_request
py-image-dedup_1 | conn.request(method, url, **httplib_request_kw)
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 239, in request
py-image-dedup_1 | super(HTTPConnection, self).request(method, url, body=body, headers=headers)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 1282, in request
py-image-dedup_1 | self._send_request(method, url, body, headers, encode_chunked)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 1328, in _send_request
py-image-dedup_1 | self.endheaders(body, encode_chunked=encode_chunked)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 1277, in endheaders
py-image-dedup_1 | self._send_output(message_body, encode_chunked=encode_chunked)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 1037, in _send_output
py-image-dedup_1 | self.send(msg)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 975, in send
py-image-dedup_1 | self.connect()
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 205, in connect
py-image-dedup_1 | conn = self._new_conn()
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
py-image-dedup_1 | raise NewConnectionError(
py-image-dedup_1 | urllib3.exceptions.NewConnectionError: : Failed to establish a new connection: [Errno 111] Connection refused
I cannot reproduce this. I don't use the docker-compose file in production, I use individual containers managed within a kubernetes cluster. Do you get the same behavior with the docker-compose file in this repo?
Thanks for getting back to me!
I'm currently running image-dedup
on an other folder with less number of images in. It seems that image-dedup did manage to write to elastic search for some of the images in the previous run:
py-image-dedup_1 | INFO:py_image_dedup.library.progress_manager:Cleanup database: 99% (2267/2289)
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00842.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00842.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00839.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00839.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00844.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00844.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00843.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00843.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00845.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00845.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00846.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00846.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00848.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00848.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00849.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00849.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00840.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00840.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-24/DSC00850.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-24/DSC00850.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-30/DSC00851.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-30/DSC00851.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-07-30/72546.poster.jpg
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-07-30/72546.poster.jpg
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-06-30/DSC00852.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-06-30/DSC00852.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-08-04/312505.banner.jpg
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-08-04/312505.banner.jpg
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-09-04/77398.banner.jpg
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-09-04/77398.banner.jpg
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-08-17/DSC00854.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-08-17/DSC00854.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-08-17/DSC00855.JPG
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-08-17/DSC00855.JPG
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0210.jpg
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0210.jpg
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0211.jpg
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0211.jpg
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0213.jpg
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0213.jpg
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0212.jpg
py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0212.jpg py-image-dedup_1 | DEBUG:py_image_dedup.util:Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0218.jpg
py-image-dedup_1 | Removing db entry for missing file: /mnt/source/2005-11-26/IMG_0218.jpg
py-image-dedup_1 | INFO:py_image_dedup.library.progress_manager:Cleanup database: 100% (2289/2289)
py-image-dedup_1 | Phase 2/6: Counting files ...
py-image-dedup_1 | DEBUG:py_image_dedup.util:Phase 2/6: Counting files ...
py-image-dedup_1 | Phase 3/6: Analyzing files
py-image-dedup_1 | INFO:py_image_dedup.library.progress_manager:Counting files: 100% (1/1)
py-image-dedup_1 | DEBUG:py_image_dedup.util:Phase 3/6: Analyzing files
But after some time i get the same issue as before:
py-image-dedup_1 | INFO:py_image_dedup.library.progress_manager:Analyzing files in '/mnt/source': 0% (1/14899)
py-image-dedup_1 | WARNING:elasticsearch:POST http://elasticsearch:9200/images/_doc?refresh=false [status:N/A request:0.009s]
py-image-dedup_1 | Traceback (most recent call last):
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
py-image-dedup_1 | conn = connection.create_connection(
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
py-image-dedup_1 | raise err
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
py-image-dedup_1 | sock.connect(sa)
py-image-dedup_1 | ConnectionRefusedError: [Errno 111] Connection refused
py-image-dedup_1 |
py-image-dedup_1 | During handling of the above exception, another exception occurred:
py-image-dedup_1 |
py-image-dedup_1 | Traceback (most recent call last):
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/http_urllib3.py", line 232, in perform_request
py-image-dedup_1 | response = self.pool.urlopen(
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
py-image-dedup_1 | retries = retries.increment(
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/util/retry.py", line 525, in increment
py-image-dedup_1 | raise six.reraise(type(error), error, _stacktrace)
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/packages/six.py", line 770, in reraise
py-image-dedup_1 | raise value
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
py-image-dedup_1 | httplib_response = self._make_request(
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 398, in _make_request
py-image-dedup_1 | conn.request(method, url, **httplib_request_kw)
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 239, in request
py-image-dedup_1 | super(HTTPConnection, self).request(method, url, body=body, headers=headers)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 1282, in request
py-image-dedup_1 | self._send_request(method, url, body, headers, encode_chunked)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 1328, in _send_request
py-image-dedup_1 | self.endheaders(body, encode_chunked=encode_chunked)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 1277, in endheaders
py-image-dedup_1 | self._send_output(message_body, encode_chunked=encode_chunked)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 1037, in _send_output
py-image-dedup_1 | self.send(msg)
py-image-dedup_1 | File "/usr/local/lib/python3.10/http/client.py", line 975, in send
py-image-dedup_1 | self.connect()
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 205, in connect
py-image-dedup_1 | conn = self._new_conn()
py-image-dedup_1 | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
py-image-dedup_1 | raise NewConnectionError(
py-image-dedup_1 | urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fc34affd810>: Failed to establish a new connection: [Errno 111] Connection refused
It is definitely an Elasticsearch issue:
elasticsearch_1 | {"type": "server", "timestamp": "2022-07-30T06:40:41,067Z", "level": "WARN", "component": "r.suppressed", "cluster.name": "do
cker-cluster", "node.name": "elasticsearch", "message": "path: /images/_search, params: {index=images}", "cluster.uuid": "VrXadE3iRXC2MvquiVfAtQ"
, "node.id": "Rnp6dtg9QsmDG0INGrREvg" ,
elasticsearch_1 | "stacktrace": ["org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed",
elasticsearch_1 | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:551) [elasticsear
ch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:309) [elasticse
arch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:582) [elasticsearch-
7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:393) [elasticsear
ch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.access$100(AbstractSearchAsyncAction.java:68) [elasticsearch-7.
9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:245) [elasticsearch-
7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73) [elasticse
arch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) [elasticsea
rch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.j
ava:403) [elasticsearch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.transport.TransportService$6.handleException(TransportService.java:638) [elasticsearch-7.9.2.jar:7.9.2]
",
elasticsearch_1 | "at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1172) [e
lasticsearch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1281) [elastics
earch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1255) [elasticsearc
h-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:61) [elasticsearch-7.9.2.jar:7.9.
2]",
elasticsearch_1 | "at org.elasticsearch.transport.TransportChannel.sendErrorResponse(TransportChannel.java:56) [elasticsearch-7.9.2.jar:7.9.2]"
,
elasticsearch_1 | "at org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:51) [elasticsearch-7.9.2.jar:
7.9.2]",
...
...
elasticsearch_1 | "at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1281) [elastics
earch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1255) [elasticsearc
h-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:61) [elasticsearch-7.9.2.jar:7.9.
2]",
elasticsearch_1 | "at org.elasticsearch.transport.TransportChannel.sendErrorResponse(TransportChannel.java:56) [elasticsearch-7.9.2.jar:7.9.2]"
,
elasticsearch_1 | "at org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:51) [elasticsearch-7.9.2.jar:
7.9.2]",
elasticsearch_1 | "at org.elasticsearch.search.SearchService.lambda$runAsync$0(SearchService.java:414) [elasticsearch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) [elasticsearch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.9.2.jar:7.9.2]",
elasticsearch_1 | "at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]",
elasticsearch_1 | "at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]",
elasticsearch_1 | "at java.lang.Thread.run(Thread.java:832) [?:?]"] }
py-image-dedup_1 | WARNING:elasticsearch:GET http://elasticsearch:9200/images/_search [status:503 request:0.077s]
elasticsearch_1 | {"type": "server", "timestamp": "2022-07-30T06:40:41,406Z", "level": "INFO", "component": "o.e.c.r.a.AllocationService", "cluster.name": "docker-cluster", "node.name": "elasticsearch", "message": "Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[images][0]]]).", "cluster.uuid": "VrXadE3iRXC2MvquiVfAtQ", "node.id": "Rnp6dtg9QsmDG0INGrREvg" py-image-dedup_elasticsearch_1 exited with code 137
I'll investigate some more.
I will close this issue due to inactivity. Feel free to reopen if needed.
Describe the bug I'm having issue with connections to my Elasticsearch instance.
The following is an example of the behaviour I'm experiences. I'm not sure how to handle it or if it is an issue with Elasticsearch or
py-image-dedup
.issue
To Reproduce Running with the following