Closed: mike391 closed this issue 5 years ago.
Ok, looks like it's not starting the worker bots:
Starting 8 worker bots in background... ERROR starting bot, check redis and ES are running and diskover.cfg settings.
Appears to be caused by this:
python: can't open file './diskover_worker_bot.py': [Errno 2] No such file or directory
Almost like paths inside the container are broken or something similar.
Doing some more digging.
Ok, this is due to changes made in the 1.5.0.3 release of Diskover. A check was added to the diskover-bot-launcher.sh file which is throwing the error seen in my last post:
```bash
# check if bot started
if [ $i -eq 1 ]; then
    sleep 1
    ps -p $! > /dev/null 2>&1
    if [ $? -gt 0 ]; then
        echo "ERROR starting bot, check redis and ES are running and diskover.cfg settings."
        exit 1
    fi
fi
```
Keeping all of the LSIO-side changes, but hardcoding the 1.5.0.2 release of Diskover and building my own image, seems to resolve the issue. I left the Diskover-web release untouched, as it does not contain any changes that seem to have broken this.
My image is tronyx/diskover if you'd like to test it @mike391.
I've confirmed that it is just the above check: I updated my Docker image to use the latest release of Diskover again but commented out the lines above, and after that everything runs as expected. None of the other changes made in the 1.5.0.3 release appear to be a factor in this issue.
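For anyone wanting to reproduce that workaround, here is a rough sketch of building a local image with the diskover release pinned back; the DISKOVER_RELEASE build-arg name is an assumption for illustration and may not match the actual LSIO Dockerfile:
```bash
# Hypothetical sketch: from a checkout of the LSIO image source, build a local
# image with the diskover release pinned to v1.5.0.2. The DISKOVER_RELEASE
# build-arg name is assumed, not confirmed against the real Dockerfile.
docker build --build-arg DISKOVER_RELEASE=v1.5.0.2 -t tronyx/diskover:custom .
```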
Thanks for looking into this! Sorry I didn't include details; I've been gone for a few days and when I found the bug I wanted to mention it ASAP since a new build was released.
Using your new image works great! It's finally indexing again. I'll let you know if any hiccups occur, but so far it looks good.
This error:
python: can't open file './diskover_worker_bot.py': [Errno 2] No such file or directory
is not caused by the bot process check you mentioned above; it comes from a path configuration issue on line 17 of diskover-bot-launcher.sh. In v1.6.1 of diskover-bot-launcher.sh I added a check for whether the bot starts, which outputs this error if it doesn't:
ERROR starting bot, check redis and ES are running and diskover.cfg settings.
Can you start the bots manually with this?
```bash
cd /app/diskover
python ./diskover_worker_bot.py
```
Before you run diskover-bot-launcher.sh you have to change into the diskover directory or update that path on line 17. Not sure how this broke in the latest lsio image v1.5.0.3... maybe from me updating diskover-bot-launcher.sh?
I've updated diskover-bot-launcher.sh to output an error if it is unable to find the .py files; please download the latest and see what error you get: https://github.com/shirosaidev/diskover/blob/master/diskover-bot-launcher.sh
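For reference, a minimal sketch of the kind of existence check described above, reusing the DISKOVERBOT variable that is set near the top of the launcher; the actual check in diskover-bot-launcher.sh may look different:
```bash
# Hypothetical sketch: fail early with a clear message if the worker bot
# script cannot be found at the configured relative path ($DISKOVERBOT).
if [ ! -f "$DISKOVERBOT" ]; then
    echo "ERROR: cannot find $DISKOVERBOT; run this from the diskover directory or fix the path"
    exit 1
fi
```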
I am able to start a worker with the following:
```bash
cd /app/diskover
python ./diskover_worker_bot.py
```
However, when I run the following, I get the error ERROR starting bot, check redis and ES are running and diskover.cfg settings.:
```bash
cd /app/diskover
./diskover-bot-launcher.sh
```
I see the following in the bot log:
```
14:12:41 Registering birth of worker 09d0ae94efaa.1219
14:12:41 RQ worker 'rq:worker:09d0ae94efaa.1219' started, version 0.13.0
14:12:41 *** Listening on diskover, diskover_crawl, diskover_calcdir...
14:12:41 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
14:12:41 Cleaning registries for queue: diskover
14:12:41 Cleaning registries for queue: diskover_crawl
14:12:41 Cleaning registries for queue: diskover_calcdir
14:12:41 *** Listening on diskover,diskover_crawl,diskover_calcdir...
14:12:41 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
```
I then see a bot in the RQ Dashboard. If I run the diskover-bot-launcher.sh script with bash -x, I see the following:
```
root@09d0ae94efaa:/app# bash -x diskover/diskover-bot-launcher.sh
+ PYTHON=python
+ DISKOVERBOT=./diskover_worker_bot.py
+ KILLREDISCONN=./killredisconn.py
+ WORKERBOTS=8
+ BURST=FALSE
+ BOTLOG=/config/bot.log
+ LOGLEVEL=3
+ BOTPIDS=/tmp/diskover_bot_pids
+ VERSION=1.6.1
+ KILLBOTS=FALSE
+ RESTARTBOTS=FALSE
+ REMOVEBOTS=FALSE
+ FORCEREMOVEBOTS=FALSE
+ SHOWBOTS=FALSE
+ getopts ':h?w:bskrRfl:V' opt
+ banner
++ tput setaf 1
++ tput sgr 0
+ '[' FALSE == TRUE ']'
+ '[' FALSE == TRUE ']'
+ '[' FALSE == TRUE ']'
+ '[' FALSE == TRUE ']'
+ startbots
+ echo 'Starting 8 worker bots in background...'
Starting 8 worker bots in background...
+ ARGS=
+ '[' FALSE == TRUE ']'
+ '[' 3 == 0 ']'
+ '[' 3 == 1 ']'
+ '[' 3 == 2 ']'
+ '[' 3 == 3 ']'
+ ARGS+='-l DEBUG'
+ (( i = 1 ))
+ (( i <= 8 ))
+ '[' '!' /config/bot.log ']'
+ '[' 1 -eq 1 ']'
python ./diskover_worker_bot.py -l DEBUG
+ sleep 1
+ ps -p 2039
+ '[' 1 -gt 0 ']'
+ echo 'ERROR starting bot, check redis and ES are running and diskover.cfg settings.'
ERROR starting bot, check redis and ES are running and diskover.cfg settings.
+ exit 1
```
It's killing it after starting one bot. It looks like the if [ $? -gt 0 ]; then portion of the check should read if [ $? -gt "${WORKERBOTS}" ]; then, because with that change it successfully creates 8 bots and I can see 8 PIDs in the PID file, but I do not see the bots in the RQ Dashboard.
Now, if I run the dispatcher.sh script with the above change in place, I see the following:
```
root@09d0ae94efaa:/app# ./dispatcher.sh
killing existing workers...
emptying current redis queues...
0 jobs removed from diskover_crawl queue
0 jobs removed from diskover queue
0 jobs removed from diskover_calcdir queue
0 jobs removed from failed queue
killing dangling workers...
starting workers with following options:
Starting 8 worker bots in background...
09d0ae94efaa.2756 (pid 2756) (botnum 1)
09d0ae94efaa.2763 (pid 2763) (botnum 2)
09d0ae94efaa.2765 (pid 2765) (botnum 3)
09d0ae94efaa.2767 (pid 2767) (botnum 4)
09d0ae94efaa.2769 (pid 2769) (botnum 5)
09d0ae94efaa.2771 (pid 2771) (botnum 6)
09d0ae94efaa.2773 (pid 2773) (botnum 7)
09d0ae94efaa.2775 (pid 2775) (botnum 8)
DONE!
All worker bots have started
Worker bot output is getting logged to /config/bot.log.botnum
Worker pids have been stored in /tmp/diskover_bot_pids, use -k flag to shutdown workers or -r to restart
Exiting, sayonara!
starting crawler with following options: --autotag -d /data -a -i diskover-2019-07-23
```
Now I see 8 bots in the RQ Dashboard, so I believe you need to make the above change within Diskover for the new check you put in place.
Try with the latest diskover-bot-launcher.sh on the diskover GitHub. I've added a new check for the paths to the .py files, which get set at the top of the .sh file. I think the issue is that you are not running diskover-bot-launcher.sh from within the /app/diskover directory.
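If the working directory does turn out to be the problem, one generic way to make the relative paths immune to where the launcher is invoked from is to cd into the script's own directory first; this is only a sketch, not necessarily what diskover-bot-launcher.sh does:
```bash
# Hypothetical sketch: resolve the directory this script lives in and change
# into it, so relative paths like ./diskover_worker_bot.py keep working no
# matter where the launcher is called from.
cd "$(dirname "$(readlink -f "$0")")" || exit 1
```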
Ok, @shirosaidev and I have found the actual culprit:
```
root@5ce6acc0f6c2:/app/diskover# ps -p 801
ps: unrecognized option: p
```
We're working on a solution that will work for the container as well as everything else.
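For context, the check fails because the busybox ps in the container does not support -p, so ps exits non-zero even when the bot is running. One portable way to test whether a backgrounded process is alive without using ps at all is kill -0; this is just a sketch of one option, not the fix that was ultimately applied:
```bash
# Hypothetical sketch: kill -0 sends no signal, it only reports (via its exit
# status) whether the process with PID $! exists, and it behaves the same on
# busybox and GNU userlands.
if ! kill -0 "$!" 2>/dev/null; then
    echo "ERROR starting bot, check redis and ES are running and diskover.cfg settings."
    exit 1
fi
```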
I've pushed a fix for this to v1.6.2 of diskover-bot-launcher.sh, removing the -p from any ps command in the .sh script. Thanks @christronyxyocum. I've rebuilt the v1.5.0.3 diskover release on the diskover GitHub with this update.
@mike391 The new official image has been built with the fixes. I have a PR waiting with fixes for the broken Redis cleanup script as well, so there should be another new image in the near future.
Once in a while I am still unable to crawl. Recreating the containers once or twice sometimes fixes it. It usually happens more frequently once I add DISKOVER_OPTS=-D -A and an autotag rule in my diskover.cfg. My redis and elasticsearch containers don't show any errors in the logs, and I'm able to ping redis and ping elasticsearch from within my diskover container.
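(As an aside, ping only proves ICMP reachability, not that the service ports are answering. Since bash is available in the container, here is a quick sketch for checking the ports directly, assuming the same redis and elasticsearch hostnames used above:)
```bash
# Quick TCP checks using bash's /dev/tcp pseudo-device (no extra tools needed).
# Each connect is opened on fd 3 inside a subshell so it closes immediately;
# a failed connect prints an error and returns non-zero.
(exec 3<>/dev/tcp/redis/6379) && echo "redis:6379 is reachable"
(exec 3<>/dev/tcp/elasticsearch/9200) && echo "elasticsearch:9200 is reachable"
```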
```
killing existing workers...
emptying current redis queues...
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 493, in connect
sock = self._connect()
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 550, in _connect
raise err
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 538, in _connect
sock.connect(socket_address)
OSError: [Errno 99] Address not available
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/redis/client.py", line 754, in execute_command
connection.send_command(*args)
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 619, in send_command
self.send_packed_command(self.pack_command(*args))
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 594, in send_packed_command
self.connect()
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 498, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 99 connecting to None:6379. Address not available.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 493, in connect
sock = self._connect()
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 550, in _connect
raise err
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 538, in _connect
sock.connect(socket_address)
OSError: [Errno 99] Address not available
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/bin/rq", line 11, in <module>
sys.exit(main())
File "/usr/lib/python3.6/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/usr/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/lib/python3.6/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/lib/python3.6/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/rq/cli/cli.py", line 76, in wrapper
return ctx.invoke(func, cli_config, *args[1:], **kwargs)
File "/usr/lib/python3.6/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/rq/cli/cli.py", line 109, in empty
num_jobs = queue.empty()
File "/usr/lib/python3.6/site-packages/rq/queue.py", line 117, in empty
return script(keys=[self.key])
File "/usr/lib/python3.6/site-packages/redis/client.py", line 3498, in __call__
return client.evalsha(self.sha, len(keys), *args)
File "/usr/lib/python3.6/site-packages/redis/client.py", line 2704, in evalsha
return self.execute_command('EVALSHA', sha, numkeys, *keys_and_args)
File "/usr/lib/python3.6/site-packages/redis/client.py", line 760, in execute_command
connection.send_command(*args)
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 619, in send_command
self.send_packed_command(self.pack_command(*args))
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 594, in send_packed_command
self.connect()
File "/usr/lib/python3.6/site-packages/redis/connection.py", line 498, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 99 connecting to None:6379. Address not available.
killing dangling workers...
starting workers with following options:
________ .__ __
\______ \ |__| _____| | _________ __ ___________
| | \| |/ ___/ |/ / _ \ \/ // __ \_ __ \ /)___(\
| ` \ |\___ \| < <_> ) /\ ___/| | \/ (='.'=)
/_______ /__/____ >__|_ \____/ \_/ \___ >__| ("\)_("\)
\/ \/ \/ \/
Worker Bot Launcher v1.6.2
https://github.com/shirosaidev/diskover
"Crawling all your stuff, core melting time"
Starting 8 worker bots in background...
e0349ba02c6d.471 (pid 471) (botnum 1)
e0349ba02c6d.477 (pid 477) (botnum 2)
e0349ba02c6d.479 (pid 479) (botnum 3)
e0349ba02c6d.481 (pid 481) (botnum 4)
e0349ba02c6d.483 (pid 483) (botnum 5)
e0349ba02c6d.485 (pid 485) (botnum 6)
e0349ba02c6d.487 (pid 487) (botnum 7)
e0349ba02c6d.489 (pid 489) (botnum 8)
DONE!
All worker bots have started
Worker pids have been stored in /tmp/diskover_bot_pids, use -k flag to shutdown workers or -r to restart
Exiting, sayonara!
starting crawler with following options: -D -d /data -a -i diskover-2019-07-24
/usr/lib/python3.6/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.25.3) or chardet (3.0.4) doesn't match a supported version!
RequestsDependencyWarning)
___ ___ ___ ___ ___ ___ ___ ___
/\ \ /\ \ /\ \ /\__\ /\ \ /\__\ /\ \ /\ \
/::\ \ _\:\ \ /::\ \ /:/ _/_ /::\ \ /:/ _/_ /::\ \ /::\ \
/:/\:\__\ /\/::\__\ /\:\:\__\ /::-"\__\ /:/\:\__\ |::L/\__\ /::\:\__\ /::\:\__\
\:\/:/ / \::/\/__/ \:\:\/__/ \;:;-",-" \:\/:/ / |::::/ / \:\:\/ / \;:::/ /
\::/ / \:\__\ \::/ / |:| | \::/ / L;;/__/ \:\/ / |:\/__/
\/__/ \/__/ \/__/ \|__| \/__/ \/__/ \|__|
v1.5.0.3
https://shirosaidev.github.io/diskover
Bringing light to the darkness.
Support diskover on Patreon or PayPal :)
2019-07-24 10:39:38,991 [INFO][diskover] Using config file: /app/diskover/diskover.cfg
2019-07-24 10:39:39,005 [INFO][diskover] Found 15 diskover RQ worker bots
2019-07-24 10:39:39,005 [INFO][diskover] Searching diskover-2019-07-24 for duplicate file hashes...
2019-07-24 10:39:39,009 [WARNING][elasticsearch] POST http://elasticsearch:9200/diskover-2019-07-24/_refresh [status:404 request:0.003s]
Traceback (most recent call last):
File "./diskover.py", line 2045, in <module>
dupes_finder(es, q, cliargs, logger)
File "/app/diskover/diskover_dupes.py", line 304, in dupes_finder
es.indices.refresh(index=cliargs['index'])
File "/usr/lib/python3.6/site-packages/elasticsearch5/client/utils.py", line 73, in _wrapped
return func(*args, params=params, **kwargs)
File "/usr/lib/python3.6/site-packages/elasticsearch5/client/indices.py", line 56, in refresh
'_refresh'), params=params)
File "/usr/lib/python3.6/site-packages/elasticsearch5/transport.py", line 312, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/usr/lib/python3.6/site-packages/elasticsearch5/connection/http_urllib3.py", line 129, in perform_request
self._raise_error(response.status, raw_data)
File "/usr/lib/python3.6/site-packages/elasticsearch5/connection/base.py", line 125, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch5.exceptions.NotFoundError: TransportError(404, 'index_not_found_exception', 'no such index')
```
If you want to use auto-tagging, the correct option in your container config would just be:
"DISKOVER_OPTS=--autotag"
The first big error is due to the broken Redis cleanup script, which is fixed in my current PR that's waiting to be merged, but it is pretty much ignorable.
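For illustration, here is a sketch of how the queue cleanup could pass the Redis connection explicitly to the rq CLI so it does not fall back to a host of None; the redis://redis:6379 URL is assumed from the container hostname used above, and this is not necessarily how the pending PR fixes the script:
```bash
# Hypothetical sketch: empty the diskover RQ queues with an explicit Redis URL
# instead of relying on the default connection settings.
REDIS_URL="redis://redis:6379"
for q in diskover diskover_crawl diskover_calcdir failed; do
    rq empty --url "$REDIS_URL" "$q"
done
```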
The second error says that the index was not found in the Elasticsearch cluster; it looks like you are trying to run Diskover with the --finddupes option before a normal crawl has been run to create the index.
Ah, I see. I'll try that for auto-tagging. I also assumed that diskover would run a normal crawl before finding dupes. This all makes sense now, thanks for your patience!
No problem. You can kick off a crawl with the following, replacing diskover with your container name:
```bash
docker exec -it diskover /app/dispatcher.sh
```
After that finishes, which it now should, you can run the following to find any duplicate files:
```bash
docker exec -it diskover /usr/bin/python app/diskover/diskover.py -i diskover-2019-07-23 --finddupes
```
@mike391 Have things been working okay for you?
All changes merged, the new image will be available shortly.
After updating the image to the latest, in the web UI I keep getting the error: "No diskover indices found in Elasticsearch. Please run a crawl and come back."