AICoE / prometheus-anomaly-detector

A newer more updated version of the prometheus anomaly detector (https://github.com/AICoE/prometheus-anomaly-detector-legacy)
GNU General Public License v3.0
597 stars, 151 forks

OSError: [Errno 9] Bad file descriptor #126

Closed dorroddorrod closed 3 years ago

dorroddorrod commented 4 years ago

Hi, this looks like a really nice project, but I'm trying to run it locally on my Mac with no luck. After the first model training I'm getting an exception from the multiprocessing library:


2020-07-24 21:15:33,084:INFO:__main__: Will retrain model every 15 minutes
Process Process-1:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
    self.asyncio_loop.run_forever()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 539, in run_forever
    self._run_once()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 1739, in _run_once
    event_list = self._selector.select(timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/selectors.py", line 558, in select
    kev_list = self._selector.control(None, max_ev, timeout)
OSError: [Errno 9] Bad file descriptor

Python 3.7.2

Please advise.
4n4nd commented 4 years ago

Hmm, this is weird. Do you mind trying a different Python version?

dorroddorrod commented 4 years ago

Which one?

4n4nd commented 4 years ago

3.8 maybe?

dorroddorrod commented 4 years ago

Now I'm getting a different error:


Traceback (most recent call last):
  File "/Users/****l/PycharmProjects/prometheus-anomaly-detector/app.py", line 164, in <module>
    server_process.start()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'BaseAsyncIOLoop.initialize.<locals>.assign_thread_identity'
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/util.py", line 300, in _run_finalizers
    finalizer()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/queues.py", line 195, in _finalize_join
    thread.join()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 1011, in join
    self._wait_for_tstate_lock()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 1027, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
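
A note on this traceback: it passes through popen_spawn_posix.py, so the child process is being started with the spawn method, which has been the macOS default since Python 3.8. Spawn pickles the Process object together with its target, and pickle cannot serialize a function defined inside another function, which is what 'BaseAsyncIOLoop.initialize.<locals>.assign_thread_identity' is. The sketch below only demonstrates that pickle limitation using the standard library. The names make_local_callback, module_level_target, and can_pickle are hypothetical and are not taken from this project or from tornado.

```python
import pickle


def make_local_callback():
    """Return a function defined inside another function (a 'local object')."""
    def assign_identity():
        return "child"
    return assign_identity


def module_level_target():
    """A top-level function; pickle stores it by its importable name."""
    return "child"


def can_pickle(obj):
    """Report whether pickle can serialize obj, as the spawn start method requires."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        # Local functions fail here because they cannot be looked up by name.
        return False


if __name__ == "__main__":
    print("local callback picklable:", can_pickle(make_local_callback()))
    print("module-level target picklable:", can_pickle(module_level_target))
```

If this reading is right, anything handed to multiprocessing.Process under spawn needs to be reachable by name at module level, which an already-initialized loop object holding a local callback is not.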
4n4nd commented 4 years ago

Could you please enable debug mode by setting FLT_DEBUG_MODE=True and show me the log output?
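
For readers unsure what such a flag does, here is a minimal sketch of how an environment variable like FLT_DEBUG_MODE can switch a Python program to debug logging. It is an illustration, not the project's actual configuration code: the accepted values ("true", "1") and the function names are assumptions, and the format string is inferred from the log lines pasted in this thread.

```python
import logging
import os


def debug_enabled(environ=os.environ):
    """Interpret FLT_DEBUG_MODE; treating 'true' and '1' as on is an assumption."""
    value = str(environ.get("FLT_DEBUG_MODE", "False"))
    return value.strip().lower() in ("true", "1")


def configure_logging(environ=os.environ):
    """Pick DEBUG or INFO from the environment and configure the root logger."""
    level = logging.DEBUG if debug_enabled(environ) else logging.INFO
    logging.basicConfig(
        format="%(asctime)s:%(levelname)s:%(name)s: %(message)s",
        level=level,
    )
    return level


if __name__ == "__main__":
    configure_logging()
    logging.getLogger(__name__).debug("debug logging is on")
```

With a setup like this, the variable has to be set in the environment of the process that starts the app, before the app launches.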

dorroddorrod commented 4 years ago

Same error:


2020-07-25 09:43:21,758:DEBUG:asyncio: Using selector: KqueueSelector
Traceback (most recent call last):
  File "/Users/dormull/PycharmProjects/prometheus-anomaly-detector/app.py", line 166, in <module>
    server_process.start()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'BaseAsyncIOLoop.initialize.<locals>.assign_thread_identity'
dorroddorrod commented 4 years ago

Any updates?

4n4nd commented 4 years ago

sorry I haven't gotten to it yet :(

4n4nd commented 4 years ago

@dorroddorrod are you still seeing the same issue?

sachincse commented 4 years ago

@4n4nd I am facing the same issue. Please advise on how to resolve it.

goern commented 4 years ago

@fridex is that something we have seen before?

/kind bug

fridex commented 4 years ago

Haven't seen this in our projects.

4n4nd commented 4 years ago

@sachincse I recently updated some dependencies, can you please try again with the latest version? I think this was an issue with one of the deps which should be fixed in a newer version.

fridex commented 4 years ago

> @sachincse I recently updated some dependencies, can you please try again with the latest version? I think this was an issue with one of the deps which should be fixed in a newer version.

If you could provide a listing of the deps that caused issues, that would be great. We can plug it into Thoth's recommendation engine so users do not encounter these.

4n4nd commented 4 years ago

@fridex this was the pip.lock Thanks!

Carmezim commented 4 years ago

Running into the same issue after working around the pickle one:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/multiprocess/process.py", line 315, in _bootstrap
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/multiprocess/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tornado/platform/asyncio.py", line 149, in start
    self.asyncio_loop.run_forever()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 570, in run_forever
    self._run_once()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 1823, in _run_once
    event_list = self._selector.select(timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/selectors.py", line 558, in select
    kev_list = self._selector.control(None, max_ev, timeout)
OSError: [Errno 9] Bad file descriptor
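
The thread never confirms a root cause, so what follows is one plausible reading, offered as an assumption. In both "Bad file descriptor" tracebacks the process target is the loop's own start method (tornado/platform/asyncio.py, in start), which suggests the event loop was created in the parent process and then run in the child. On macOS the loop uses a kqueue-based selector (the debug log above shows KqueueSelector), and a kqueue descriptor is not inherited by a forked child, so selecting on it in the child can fail with Errno 9. A minimal sketch of the alternative, creating the loop inside the child from a module-level function, using only asyncio and multiprocessing from the standard library; all function names here are hypothetical and this is not the project's code.

```python
import asyncio
import multiprocessing


async def _serve_briefly():
    """Stand-in for real server work (hypothetical): wait briefly, then return."""
    await asyncio.sleep(0.01)
    return "served"


def run_loop_in_child():
    """Process target: build the event loop inside the process that runs it."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(_serve_briefly())
    finally:
        loop.close()


def start_server_process():
    """Start a child whose target is a picklable module-level function."""
    ctx = multiprocessing.get_context("spawn")
    process = ctx.Process(target=run_loop_in_child)
    process.start()
    return process
```

On platforms that use spawn, start_server_process() should be called from under an `if __name__ == "__main__":` guard, and the returned process joined when done. Because the target is a plain module-level function, this shape also avoids pickling a loop object. Whether it resolves this particular issue is not established in the thread.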
4n4nd commented 4 years ago

@Carmezim can you run pip freeze in your environment and paste the output here? I suspect this is because some of your dependencies are not up to date.

Carmezim commented 4 years ago

@4n4nd sure, thanks for taking a look :)

alembic==1.4.1
appdirs==1.4.4
argcomplete==1.12.1
arrow==0.17.0
attrs==20.2.0
azure-core==1.8.2
azure-storage-blob==12.5.0
bcrypt==3.2.0
binaryornot==0.4.4
black==20.8b1
bump2version==1.0.1
bumpversion==0.6.0
cached-property==1.5.2
cachetools==4.1.1
certifi==2020.6.20
cffi==1.14.3
chardet==3.0.4
click==7.1.2
clickclick==20.10.2
cloudpickle==1.6.0
cmdstanpy==0.9.5
connexion==2.7.0
convertdate==2.2.2
cookiecutter==1.6.0
cryptography==2.8
cycler==0.10.0
Cython==0.29.21
databricks-cli==0.13.0
dateparser==0.7.6
decorator==4.4.2
dill==0.3.2
distlib==0.3.1
distro==1.5.0
docker==4.3.1
docker-compose==1.27.4
dockerpty==0.4.1
docopt==0.6.2
drone==0.3.0
entrypoints==0.3
ephem==3.7.7.1
Faker==4.14.0
fbprophet==0.7.1
filelock==3.0.12
Flask==1.1.2
freezegun==1.0.0
future==0.18.2
gitdb==4.0.5
GitPython==3.1.11
google-auth==1.22.1
gorilla==0.3.0
gunicorn==20.0.4
h5py==2.10.0
holidays==0.10.3
httpie==2.2.0
idna==2.10
inflection==0.5.1
iniconfig==1.1.1
install==1.3.4
isodate==0.6.0
itsdangerous==1.1.0
Jinja2==2.11.2
jinja2-time==0.2.0
joblib==0.17.0
jsonpath-rw==1.4.0
jsonschema==3.2.0
Keras==2.4.3
kiwisolver==1.2.0
korean-lunar-calendar==0.2.1
kubernetes==12.0.0
LunarCalendar==0.0.9
Mako==1.1.3
MarkupSafe==1.1.1
matplotlib==3.3.2
mlflow==1.11.0
more-itertools==5.0.0
msrest==0.6.19
multiprocess==0.70.10
mypy-extensions==0.4.3
numpy==1.19.2
oauthlib==3.1.0
openapi-spec-validator==0.2.9
packaging==20.4
pandas==1.1.3
paramiko==2.7.2
pathos==0.2.6
pathspec==0.8.0
pg8000==1.16.6
Pillow==8.0.1
pipenv==2020.8.13
pluggy==0.13.1
ply==3.11
pox==0.2.8
poyo==0.5.0
ppft==1.6.6.2
prettytable==1.0.1
prometheus-api-client==0.4.1
prometheus-client==0.8.0
prometheus-flask-exporter==0.18.1
prompt-toolkit==1.0.15
protobuf==3.13.0
py==1.9.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
Pygments==2.7.1
PyMeeus==0.3.7
PyNaCl==1.4.0
pyparsing==2.4.7
pyrsistent==0.17.3
pystan==2.19.1.1
pytest==6.1.1
pytest-pgsql==1.1.2
python-dateutil==2.8.1
python-dotenv==0.14.0
python-editor==1.0.4
pytz==2019.1
PyYAML==5.1
querystring-parser==1.2.4
regex==2020.10.15
requests==2.24.0
requests-mock==1.8.0
requests-oauthlib==1.3.0
rsa==4.6
schedule==0.6.0
scikit-learn==0.23.2
scipy==1.5.3
scramp==1.2.0
setuptools-git==1.2
six==1.15.0
smmap==3.0.4
SQLAlchemy==1.3.13
sqlparse==0.4.1
sseclient-py==1.7
st2client==3.3.0
tabulate==0.8.7
testing.common.database==2.0.3
testing.postgresql==1.3.0
text-unidecode==1.3
texttable==1.6.3
threadpoolctl==2.1.0
toml==0.10.1
tornado==6.0.4
tqdm==4.50.2
typed-ast==1.4.1
typing-extensions==3.7.4.3
tzlocal==2.1
urllib3==1.25.11
virtualenv==20.0.35
virtualenv-clone==0.5.4
wcwidth==0.2.5
websocket-client==0.57.0
Werkzeug==1.0.1
whichcraft==0.6.1
zipp==1.0.0
4n4nd commented 4 years ago

@Carmezim I couldn't replicate the issue :( Do you mind trying a different fresh environment? I would recommend using miniconda for environments.

Carmezim commented 4 years ago

@4n4nd no worries, thanks for trying anyway. I will try some different envs.

nagarakesh4 commented 4 years ago

I had the same issue; neither #92 nor the solutions given here worked. Instead of running app.py locally, I deployed the app as a Docker image, which avoids the local version dependencies. Also, if you are using the Docker image and cannot access the /metrics endpoint, the cause is probably the host network mode flag: remove the --network host flag and the metrics should start showing up at /metrics.

dks0408070 commented 3 years ago

I encountered the same issue. I started a new venv, ran pip install -r requirements.txt, and launched the app, only to get Bad file descriptor. The Docker container doesn't return the error. A previous comment suggests out-of-date dependencies; is it possible to get a list of dependency versions, as requirements.txt is versionless?

4n4nd commented 3 years ago

@dks0408070 the Pipfile.lock should have all the required versions.

fridex commented 3 years ago

Sorry, I'm a little bit out of context here. Do you have a specific package that is causing these issues? Or any combination of packages? It might be a good observation for Thoth to give recommendations on this.

dks0408070 commented 3 years ago

@fridex I have yet to determine root cause, although I believe that may be the case. I am now verifying dependency versions in my local venv vs. pipfile.lock. I've only had success running the dockerized application thus far.
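
The comparison described above can be scripted. This is a minimal sketch, assuming a Pipfile.lock in the current directory with the usual layout (a top-level "default" section mapping each package name to an entry whose "version" field looks like "==1.2.3"). The function names are hypothetical, and it uses importlib.metadata, which requires Python 3.8 or newer.

```python
import json
from importlib import metadata


def normalize(name):
    """Compare package names case-insensitively, treating '_' and '.' as '-'."""
    return name.lower().replace("_", "-").replace(".", "-")


def locked_versions(lock_data):
    """Extract {package: version} from the 'default' section of Pipfile.lock data."""
    pins = {}
    for name, entry in lock_data.get("default", {}).items():
        version = entry.get("version", "")
        if version.startswith("=="):  # skip entries without an exact pin (e.g. VCS)
            pins[normalize(name)] = version[2:]
    return pins


def installed_versions():
    """Collect {package: version} for everything installed in this environment."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            found[normalize(name)] = dist.version
    return found


def find_mismatches(locked, installed):
    """Return {package: (locked, installed_or_None)} wherever the two differ."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in locked.items()
        if installed.get(name) != wanted
    }


def main(lock_path="Pipfile.lock"):
    with open(lock_path) as handle:
        locked = locked_versions(json.load(handle))
    mismatches = find_mismatches(locked, installed_versions())
    for name, (wanted, found) in sorted(mismatches.items()):
        print(f"{name}: locked {wanted}, installed {found or 'missing'}")
```

Calling main() from inside the activated venv, in the repository root, prints one line per package whose installed version differs from the locked one.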

fridex commented 3 years ago

Hm... the first backtrace shows an issue in tornado, but its version did not change in the linked lock files. Subsequent backtraces show a different issue.

> @fridex I have yet to determine root cause, although I believe that may be the case. I am now verifying dependency versions in my local venv vs. pipfile.lock. I've only had success running the dockerized application thus far.

Cool! Please let us know if you spot the troublemaker.

sesheta commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

sesheta commented 3 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

sesheta commented 3 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

sesheta commented 3 years ago

@sesheta: Closing this issue.

In response to [this](https://github.com/AICoE/prometheus-anomaly-detector/issues/126#issuecomment-967884961):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.