Closed metehan-arslan closed 4 months ago
To add a bit more context here: A modification from the default Dockerfile was made: arch is arm64 (for OS and go)
Probably one of these library installs breaks celery: https://github.com/laramies/theHarvester/blob/master/requirements/base.txt
Generally:
_Removing the following line fixed the loop:
https://github.com/yogeshojha/rengine/blob/master/web/celery-entrypoint.sh#L81_
It didn't fix this for me. Any other workaround? @Talanor @metehan-arslan
If you have initialized your container once, edit your ./web/celery-entrypoint.sh to keep only the celery workers launch lines at the end, which look something like:
echo "Starting Workers..."
echo "Starting Main Scan Worker with Concurrency: $MAX_CONCURRENCY,$MIN_CONCURRENCY"
watchmedo auto-restart --recursive --pattern="*.py" --directory="/usr/src/app/reNgine/" -- celery -A reNgine.tasks worker --loglevel=info --autoscale=$MAX_CONCURRENCY,$MIN_CONCURRENCY -Q main_scan_queue &
[...]
watchmedo auto-restart --recursive --pattern="*.py" --directory="/usr/src/app/reNgine/" -- celery -A reNgine.tasks worker --pool=gevent --concurrency=10 --loglevel=info -Q theHarvester_queue -n theHarvester_worker
exec "$@"
Then docker compose down and docker compose up the celery container.
If the issue persists, it is due to something else.
If it works, some pip install breaks celery. Add back the lines you deleted a few at a time, and down/up the celery container after each change, until it breaks; that is how you find the culprit.
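One way to set up that bisection is to start from a copy of the entrypoint with every pip install line commented out, then re-enable them one at a time. The helper below is a minimal sketch and not part of reNgine: `disable_pip_installs` is a hypothetical name, and it assumes each install command sits on a single line containing `pip install` or `pip3 install`.

```shell
# disable_pip_installs FILE
# Print FILE with every line that runs "pip install" / "pip3 install" commented out.
# Hypothetical helper; assumes each install command fits on one line.
disable_pip_installs() {
  sed -E '/pip3? install/ s/^/# /' "$1"
}

# Example (run from the reNgine checkout; keeps a backup of the original):
#   cp web/celery-entrypoint.sh web/celery-entrypoint.sh.bak
#   disable_pip_installs web/celery-entrypoint.sh.bak > web/celery-entrypoint.sh
```

After that, uncomment one install line, down/up the celery container, and repeat until the loop comes back.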
I'm working on a container with venvs & pipx, but in the meantime that'll get you running.
https://github.com/yogeshojha/rengine/issues/1248 This issue seems to be related. I was having the same output.
Unlikely, Infoga isn't cloned since it doesn't exist, so it can't make celery fail. More likely, infoga wasn't cloned, hence the error, AND you had a broken install due to this issue.
Hi, this looks very familiar to me. Which branch did you clone, is it master or release/2.1.0?
I had this exact issue when I tried to use ollama with celery and it has known issues.
But on master this is very strange.
This is on master; the Discord is full of people with clean installs hitting that bug.
I have the same behaviour on a fresh install of Ubuntu Server 24.04. I followed your advice and commented out the lines in ./web/celery-entrypoint.sh. I found that the error was generated by theHarvester in this line:
python3 -m pip install -r /usr/src/github/theHarvester/requirements/base.txt
In the file base.txt, the library fastapi==0.111.0 is the culprit.
I hope that this helps.
@Nandolorian did you downgrade or upgrade the fastapi version? When I was testing 2.1.0 I found out that asyncio was the culprit. Not sure why fastapi has issues with celery
@yogeshojha I downgraded to 0.110.3 and the error didn't happen. I ran some scans using the OSINT scan engine and theHarvester ran without any trouble.
@Nandolorian Thank you! I am downgrading fastapi and let me try doing the installation!
@Nandolorian I tried with the downgraded fastapi; sadly it doesn't work. Do you mind sharing all of your requirement versions with me?
You can do this by
docker exec -it rengine-celery-1 bash
and then
pip freeze
Okay, I think httpcore is the culprit here.
https://github.com/python-trio/trio/issues/2848
I had this exact same issue when using ollama-python, because it uses the httpcore library and our celery workers are gevent-based. As per my understanding, httpcore is a coroutine-based networking library that uses blocking I/O, which conflicts with gevent's cooperative multitasking model.
I guess finding which tool uses httpcore and removing it would solve this.
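To find which installed package pulls in httpcore, `pip show` lists the direct reverse dependencies in its `Required-by` field. A small sketch: `required_by` is a hypothetical helper name, and the container name is the one used earlier in this thread.

```shell
# required_by: read `pip show <package>` output on stdin and print the
# value of its "Required-by:" field (the packages that depend on it).
required_by() {
  sed -n 's/^Required-by: *//p'
}

# Example, against the running celery container:
#   docker exec rengine-celery-1 pip show httpcore | required_by
```

`Required-by` only covers direct dependents, so the same check may need repeating on whatever it prints to walk back up to the tool that introduced it.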
Or, installing tools in venvs ;)
@Talanor yeah, venv would be better, but either way, when any of the tools reNgine uses depends on httpcore, it won't be able to work with gevent and celery. We might have to run these tools outside celery or use another event pool. But I am open to hearing how you think venv will help us solve this?
I see you have discovered that httpcore is the problem. Anyway, I am posting the list for your reference in case it is still useful.
Please see my PR #1250, which addresses the issue while staying on the current versions. The concept is that each tool is in its own virtual environment, so you can have multiple httpcore (or whatever else) versions installed without conflicts.
@Talanor your PR looks great, I liked the usage of poetry.
The problem is not conflicting versions of httpcore or having multiple versions in the same environment. As I said above: our celery workers are gevent-based, and httpcore is a coroutine-based networking library that uses blocking I/O, which conflicts with gevent's cooperative multitasking model, as per my understanding.
I must be missing something. I don't see httpcore in their pip freeze list?
@Nandolorian can you confirm newest master works on a fresh install for you?
Upon further investigation: The celery workers from my PR do not install httpcore (as seen in my venv):
talanor@pentest:~/containers/reNgine-CaRE$ docker run --entrypoint /bin/bash -it talanor/rengine-celery:v0.3
rengine@909fc90322cc:~/rengine$ ls
rengine@909fc90322cc:~/rengine$ cd
rengine@909fc90322cc:~$ ls
nuclei-templates poetry.lock pyproject.toml rengine results scan_results tools wordlists
rengine@909fc90322cc:~$ poetry -C . shell
Spawning shell within /home/rengine/.cache/pypoetry/virtualenvs/celery-rengine-HmEJnPQT-py3.10
rengine@909fc90322cc:~$ . /home/rengine/.cache/pypoetry/virtualenvs/celery-rengine-HmEJnPQT-py3.10/bin/activate
(celery-rengine-py3.10) rengine@909fc90322cc:~$ pip list
Package Version
-------------------------------- -----------
aiodns 3.0.0
aiohttp 3.9.5
aiosignal 1.3.1
amqp 5.2.0
appdirs 1.4.4
argh 0.26.2
asgiref 3.8.1
async-timeout 4.0.3
attrs 23.2.0
beautifulsoup4 4.9.3
billiard 4.2.0
Brotli 1.1.0
celery 5.4.0
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
click-didyoumean 0.3.1
click-plugins 1.1.1
click-repl 0.3.0
coreapi 2.3.3
coreschema 0.0.4
cron-descriptor 1.4.3
cssselect2 0.7.0
decorator 5.1.1
Deprecated 1.2.14
discord-webhook 1.3.0
Django 3.2.4
django-ace 1.0.11
django-celery-beat 2.6.0
django-login-required-middleware 0.6.1
django-mathfilters 1.0.0
django-role-permissions 3.2.0
django-timezone-field 6.1.0
djangorestframework 3.12.4
djangorestframework-datatables 0.6.0
dotted-dict 1.1.3
drf-yasg 1.21.3
et-xmlfile 1.1.0
filelock 3.14.0
fonttools 4.51.0
frozenlist 1.4.1
gevent 24.2.1
greenlet 3.0.3
gunicorn 22.0.0
html5lib 1.1
humanize 4.3.0
idna 3.7
inflection 0.5.1
itypes 1.2.0
Jinja2 3.1.4
kombu 5.3.7
lxml 5.2.1
Markdown 3.3.4
MarkupSafe 2.1.5
metafinder 1.2
multidict 6.0.5
netaddr 0.8.0
netlas 0.4.1
openai 0.28.0
openpyxl 3.1.2
orjson 3.9.0
packaging 24.0
pikepdf 8.15.1
pillow 10.3.0
pip 24.0
prettytable 2.1.0
prompt-toolkit 3.0.43
psycopg2 2.9.7
pycares 4.4.0
pycparser 2.22
pycvesearch 1.0
pydyf 0.10.0
Pygments 2.18.0
pyphen 0.15.0
PySocks 1.7.1
python-crontab 3.0.0
python-dateutil 2.9.0.post0
python-docx 1.1.2
python-pptx 0.6.23
pytz 2024.1
PyYAML 6.0.1
redis 5.0.3
requests 2.31.0
requests-file 2.0.0
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
scapy 2.4.3
setuptools 69.5.1
simplejson 3.17.2
six 1.16.0
soupsieve 2.5
sqlparse 0.5.0
tinycss2 1.3.0
tinydb 4.4.0
tldextract 3.5.0
tqdm 4.66.4
typing_extensions 4.11.0
tzdata 2024.1
uritemplate 4.1.1
urllib3 2.2.1
uro 1.0.0
validators 0.18.2
vine 5.1.0
watchdog 4.0.0
wcwidth 0.2.13
weasyprint 53.3
webencodings 0.5.1
whatportis 0.8.2
wrapt 1.16.0
XlsxWriter 3.2.0
xmltodict 0.13.0
yarl 1.9.4
zope.event 5.0
zope.interface 6.3
zopfli 0.2.3
However, it is installed as a fastapi dependency from theHarvester. Installing theHarvester in a venv does not install httpcore in the celery environment, and does not introduce a conflict.
Basically, if you can fix it by hot-removing a package via pip, it's something that can (and should) be solved with venvs.
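As a rough illustration of the venv idea (the PR discussed above uses poetry; this sketch uses plain `python3 -m venv` instead), a tool's requirements can be installed into their own environment so they never touch the interpreter celery runs on. `install_tool_in_venv` and the `/opt/venvs/...` path are hypothetical; the requirements path is the one quoted earlier in the thread.

```shell
# install_tool_in_venv VENV_DIR REQUIREMENTS_FILE
# Create an isolated virtual environment and install the tool's
# requirements into it, leaving the celery environment untouched.
install_tool_in_venv() {
  python3 -m venv "$1" && "$1/bin/pip" install -r "$2"
}

# Example (the venv location is an assumption; any writable path works):
#   install_tool_in_venv /opt/venvs/theHarvester /usr/src/github/theHarvester/requirements/base.txt
```

The tool then has to be launched with the venv's own interpreter (here `/opt/venvs/theHarvester/bin/python3`) rather than the system `python3`, otherwise it still sees the shared packages.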
Is there an existing issue for this?
Current Behavior
reNgine scans are stuck at pending, and whois doesn't work. make logs shows a celery loop.
Thanks to Talanor from Discord we were able to identify the issue: running pip as root breaks existing system dependencies.
Removing the following line fixed the loop: https://github.com/yogeshojha/rengine/blob/master/web/celery-entrypoint.sh#L81
Expected Behavior
Scans to work, loops in celery shouldn't happen.
Steps To Reproduce
git clone https://github.com/yogeshojha/rengine.git
sudo ./install
Environment
Anything else?
see also: https://github.com/yogeshojha/rengine/issues/1234