bluesky / ophyd-async

Hardware abstraction for bluesky written using asyncio
https://blueskyproject.io/ophyd-async
BSD 3-Clause "New" or "Revised" License

Conda install of ophyd incompatible with ophyd-async #89

Open olliesilvester opened 7 months ago

olliesilvester commented 7 months ago

Upon starting Hyperion using the latest ophyd-async release, we encountered the following error:

*** Error in `python': corrupted size vs. prev_size: 0x000055c258f67dd0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7f474)[0x7facdcb3e474]
/lib64/libc.so.6(+0x8156b)[0x7facdcb4056b]
/dls_sw/i03/software/bluesky/hyperion_v8.2.0/hyperion/.venv/lib/python3.10/site-packages/epicscorelibs/lib/./libCom.so.7.0.7.99.0(_ZN9fdManagerD1Ev+0xd3)[0x7facc8ceb6b3]
/lib64/libc.so.6(+0x39ce9)[0x7facdcaf8ce9]
/lib64/libc.so.6(+0x39d37)[0x7facdcaf8d37]
/lib64/libc.so.6(__libc_start_main+0xfc)[0x7facdcae155c]
python(+0x1cb421)[0x55c256a7d421]

followed by a long memory map.

Additional info: we are only importing ophyd-async, not actually using it in the code, but this still breaks when using regular ophyd devices.

coretl commented 7 months ago

Can you do a pip freeze to see versions of dependencies? Is this on process exit, or does it crash when you are in the middle of something?

olliesilvester commented 7 months ago

I'm struggling to find the point at which this occurs; it doesn't seem to be in the same place each time. It occurs sometime between Hyperion launching and connecting to regular ophyd devices. The pip freeze output is:

accessible-pygments==0.0.4
aioca==1.7
aiohttp==3.9.1
aiosignal==1.3.1
alabaster==0.7.13
aniso8601==9.0.1
anyio==4.0.0
appdirs==1.4.4
asciitree==0.3.3
asttokens==2.4.0
async-timeout==4.0.3
attrs==23.1.0
Babel==2.13.0
backcall==0.2.0
beautifulsoup4==4.12.2
bidict==0.22.1
black==23.10.1
blinker==1.6.3
blueapi==0.3.15
bluesky==1.12.0.post1+g9b0d729
bluesky-kafka==0.10.0
bluesky-live==0.0.8
boltons==23.0.0
build==1.0.3
cachetools==5.3.1
caproto==1.1.0
certifi==2023.7.22
cfgv==3.4.0
chardet==5.2.0
charset-normalizer==3.3.1
click==8.1.3
cloudpickle==3.0.0
colorama==0.4.6
comm==0.1.4
confluent-kafka==2.2.0
contourpy==1.1.1
coverage==7.3.2
cycler==0.12.1
dask==2023.10.0
databroker==1.2.5
dataclasses-json==0.6.1
decorator==5.1.1
Deprecated==1.2.14
distlib==0.3.7
dls-bluesky-core==0.0.1
-e git+ssh://git@github.com/DiamondLightSource/dodal.git@af16a0451d05640375686687651843e822dc9de8#egg=dls_dodal
dnspython==2.4.2
docopt==0.6.2
doct==1.1.0
docutils==0.20.1
email-validator==2.1.0.post1
entrypoints==0.4
epicscorelibs==7.0.7.99.0.2
event-model==1.19.8
exceptiongroup==1.1.3
executing==2.0.0
fastapi==0.98.0
fasteners==0.19
filelock==3.12.4
Flask==3.0.0
Flask-RESTful==0.3.10
fonttools==4.43.1
freephil==0.2.1
frozenlist==1.4.0
fsspec==2023.10.0
gitdb==4.0.11
GitPython==3.1.40
googleapis-common-protos==1.56.1
graypy==2.1.0
greenlet==3.0.0
grpcio==1.59.0
h11==0.14.0
h5py==3.10.0
hdf5plugin==4.2.0
HeapDict==1.0.1
historydict==1.2.6
httpcore==0.18.0
httptools==0.6.1
httpx==0.25.0
humanize==4.8.0
-e git+ssh://git@github.com/DiamondLightSource/hyperion.git@d15b3ace89d698caa322a0ce8498402ab5b2eae2#egg=hyperion
identify==2.5.30
idna==3.4
imageio==2.31.6
imagesize==1.4.1
importlib-metadata==6.8.0
importlib-resources==6.1.0
iniconfig==2.0.0
intake==0.6.4
ipython==8.16.1
ipywidgets==8.1.1
ispyb==8.0.1
itsdangerous==2.1.2
jedi==0.19.1
Jinja2==3.1.2
jsonschema==4.19.1
jsonschema-specifications==2023.7.1
jupyterlab-widgets==3.0.9
kiwisolver==1.4.5
livereload==2.6.3
locket==1.0.0
MarkupSafe==2.1.3
marshmallow==3.20.1
matplotlib==3.8.0
matplotlib-inline==0.1.6
mockito==1.4.0
mongoquery==1.4.2
msgpack==1.0.7
msgpack-numpy==0.4.8
multidict==6.0.4
mypy==1.6.1
mypy-extensions==1.0.0
mysql-connector-python==8.1.0
networkx==3.2
nexgen @ git+https://github.com/dials/nexgen.git@db4858f6d91a3d07c6c0f815ef752849c0bf79d4
nodeenv==1.8.0
nose2==0.14.0
nslsii==0.9.1
numcodecs==0.12.1
numpy==1.26.1
opencv-python-headless==4.8.1.78
opentelemetry-api==1.20.0
opentelemetry-distro==0.41b0
opentelemetry-exporter-jaeger==1.20.0
opentelemetry-exporter-jaeger-proto-grpc==1.20.0
opentelemetry-exporter-jaeger-thrift==1.20.0
opentelemetry-instrumentation==0.41b0
opentelemetry-sdk==1.20.0
opentelemetry-semantic-conventions==0.41b0
ophyd==1.9.0
ophyd-async==0.2.0
orjson==3.9.9
p4p==4.1.10
packaging==23.2
pandas==2.1.1
parso==0.8.3
partd==1.4.1
pathlib2==2.3.7.post1
pathspec==0.11.2
pexpect==4.8.0
pickleshare==0.7.5
pika==1.3.2
Pillow==10.0.1
PIMS==0.6.1
Pint==0.22
pipdeptree==2.13.0
platformdirs==3.11.0
pluggy==1.3.0
ply==3.11
pre-commit==3.5.0
prettytable==3.9.0
prompt-toolkit==3.0.39
protobuf==4.21.12
psutil==5.9.6
ptyprocess==0.7.0
pure-eval==0.2.2
pvxslibs==1.2.3
py==1.11.0
pydantic==1.10.13
pydata-sphinx-theme==0.14.1
pyepics==3.5.2
Pygments==2.16.1
pymongo==4.5.0
pyOlog==4.5.0
pyparsing==3.1.1
pyproject-api==1.6.1
pyproject_hooks==1.0.0
pytest==7.4.2
pytest-asyncio==0.21.1
pytest-cov==4.1.0
pytest-random-order==1.1.0
python-dateutil==2.8.2
python-dotenv==1.0.0
python-multipart==0.0.6
pytz==2023.3.post1
PyYAML==6.0.1
redis==5.0.1
referencing==0.30.2
requests==2.31.0
rpds-py==0.10.6
ruff==0.1.1
scanspec==0.6.3
scipy==1.11.3
semver==3.0.2
setuptools-dso==2.10a1
six==1.16.0
slicerator==1.1.0
smmap==5.0.1
sniffio==1.3.0
snowballstemmer==2.2.0
soupsieve==2.5
Sphinx==7.2.6
sphinx-autobuild==2021.3.14
sphinx-copybutton==0.5.2
sphinx_design==0.5.0
sphinxcontrib-applehelp==1.0.7
sphinxcontrib-devhelp==1.0.5
sphinxcontrib-htmlhelp==2.0.4
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.6
sphinxcontrib-serializinghtml==1.1.9
SQLAlchemy==1.4.49
stack-data==0.6.3
starlette==0.27.0
stomp.py==8.1.0
suitcase-mongo==0.4.0
suitcase-msgpack==0.3.0
suitcase-utils==0.5.4
super-state-machine==2.0.2
tabulate==0.9.0
thrift==0.16.0
tifffile==2023.9.26
tomli==2.0.1
toolz==0.12.0
tornado==6.3.3
tox==3.28.0
tox-direct==0.4
tqdm==4.66.1
traitlets==5.11.2
types-mock==5.1.0.2
types-PyYAML==6.0.12.12
types-requests==2.31.0.10
typing-inspect==0.9.0
typing_extensions==4.5.0
tzdata==2023.3
tzlocal==5.2
ujson==5.8.0
urllib3==2.0.7
uvicorn==0.23.2
uvloop==0.19.0
virtualenv==20.24.6
watchfiles==0.21.0
wcwidth==0.2.8
websocket-client==1.6.4
websockets==12.0
Werkzeug==3.0.0
widgetsnbextension==4.0.9
workflows==2.26
wrapt==1.15.0
xarray==2023.10.1
yarl==1.9.3
zarr==2.16.1
zict==2.2.0
zipp==3.17.0
zocalo==0.30.2

coretl commented 7 months ago

Please could you import aioca._catools._Context and print these variables sometime after you've connected a traditional ophyd device?

https://github.com/dls-controls/aioca/blob/216bcc3e91d7215cd7ccbfa6bbe942abe196fd5b/aioca/_catools.py#L1026-L1029

coretl commented 7 months ago

It would also be good to know if this line is ever run: https://github.com/bluesky/ophyd-async/blob/54deda30cae717472d1020bd5a67489380960ad8/src/ophyd_async/epics/_backend/_aioca.py#L194

It shouldn't be, as you say you are never using ophyd-async devices, so you will never call the connect function on an ophyd-async epics signal

olliesilvester commented 7 months ago

I'm not sure how helpful this will be, but it seems like we get this error whenever some other exception is raised in Hyperion. For example, if we have

if __name__ == "__main__":
    raise Exception("blah")

Then the logs show the same error. If I switch back to ophyd-async 0.1.0 then I don't see this. This is without importing ophyd-async, but it's in the virtual environment.

Also:

ca available: False
_ca_context: None
_channel_caches: {}

coretl commented 7 months ago

What changed between 0.1 and 0.2 is that a temporary import of epicscorelibs was put in when importing ophyd_async.core. My guess is that somehow pyepics is using a different libca to that import.

Can you try running this line and see what it prints? https://github.com/pyepics/pyepics/blob/c770f4cb9647c023e6e5eaede87c76f0524ab0ac/epics/ca.py#L332

olliesilvester commented 7 months ago

The return value of find_libca was: /dls_sw/apps/python/miniforge/4.10.0-0/envs/python3.10/epics/lib/linux-x86_64/libca.so

coretl commented 7 months ago

Ok, that is the problem, two versions of libca loaded. I will see if I can work out how that happened...
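
One way to confirm this from inside the running process, as a minimal sketch assuming Linux: list the mapped files in /proc/self/maps whose file name contains a given string. The helper name `loaded_libs` is hypothetical and not part of any library mentioned in this thread.

```python
from pathlib import Path


def loaded_libs(needle, maps_text=None):
    """Return the sorted, de-duplicated paths of mapped files whose name contains `needle`."""
    if maps_text is None:
        # Linux only: each line describes one memory mapping of this process.
        maps_text = Path("/proc/self/maps").read_text()
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # The sixth field, when present, is the path of the backing file.
        if len(fields) == 6 and needle in Path(fields[5]).name:
            paths.add(fields[5].strip())
    return sorted(paths)
```

Calling `loaded_libs("libca.so")` after connecting a regular ophyd device should return more than one path if two copies of the library are mapped. Searching for "libca.so" rather than "libca" avoids matching unrelated libraries such as libcap.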

coretl commented 7 months ago

What is os.environ['PYEPICS_LIBCA']?
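
Indexing os.environ directly raises KeyError when the variable is missing, so a small check along these lines may be easier to paste into a session. This is only a sketch; `describe_libca_env` is a hypothetical helper name.

```python
import os


def describe_libca_env(environ=os.environ):
    """Report whether PYEPICS_LIBCA is set, without raising if it is absent."""
    value = environ.get("PYEPICS_LIBCA")
    if value is None:
        return "PYEPICS_LIBCA is not set"
    return f"PYEPICS_LIBCA={value}"
```

Running `print(describe_libca_env())` both before and after `import ophyd` would show whether the import changes the variable.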

olliesilvester commented 7 months ago

There doesn't seem to be anything in os.environ with this key

coretl commented 7 months ago

I suspect that these lines are not being run: https://github.com/bluesky/ophyd/blob/9a0bf3f5fd8c98ee23e5d51c35db695c86a6a921/ophyd/__init__.py#L38-L42

But I can't see how that might be the case. Can you insert print statements in to see when these lines are being run, and whether it gives an import error?

olliesilvester commented 7 months ago

Yes, it looks like the import epicscorelibs.path.pyepics line is never being run. Print statements around this line work, but the import itself doesn't happen, nor does it raise an import error.

Edit: Actually it does import it - after running this line, os.environ['PYEPICS_LIBCA'] is /dls_sw/apps/python/miniforge/4.10.0-0/envs/python3.10/epics/lib/linux-x86_64/libca.so

olliesilvester commented 7 months ago

Pylance also says the module can't be accessed and the line is greyed out

coretl commented 7 months ago

What happens if you import epicscorelibs.path.pyepics in your code?

olliesilvester commented 7 months ago

It imports it, and the environment is the same as the above

coretl commented 7 months ago

Ok, looks like epicscorelibs.path.pyepics won't set the environment variable if it is already set. Looks like you are using the conda python in hyperion, so it is already set (possibly by https://github.com/conda-forge/pyepics-feedstock/blob/7674e88710d481bc231627d37070db3dd6380b43/recipe/build.sh#L5).

Quick fix: unset PYEPICS_LIBCA before running hyperion.

Longer term: maybe we should do the unsetting here: https://github.com/bluesky/ophyd/blob/master/ophyd/__init__.py#L39
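
If unsetting the variable in the shell is awkward, the same quick fix could be applied from Python at the very top of the entry point. This is a minimal sketch, assuming the variable is only read when the library is first located; `clear_pyepics_libca` is a hypothetical helper, not an ophyd or pyepics function.

```python
import os


def clear_pyepics_libca(environ=os.environ):
    """Remove PYEPICS_LIBCA so a later import can choose the library path itself.

    Returns the value that was removed, or None if the variable was not set.
    """
    return environ.pop("PYEPICS_LIBCA", None)


# Assumption: this runs before the first `import ophyd` (or anything else
# that imports pyepics), otherwise the conda-provided path may already be in use.
previous = clear_pyepics_libca()
```

Keeping the removed value in `previous` makes it easy to log what the conda environment had set.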

@tacaswell @danielballan @mrakitin any thoughts?

coretl commented 7 months ago

Alternatively we could get epicscorelibs to be more forceful and override the environment variable even if it is already set... https://github.com/mdavidsaver/epicscorelibs/blob/cfdc449eb0e4e4efd6beac3eb323051c9a2128aa/src/python/epicscorelibs/path/pyepics.py#L21C38-L21C38
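
To illustrate the difference between the two behaviours, here is a sketch using plain dictionaries in place of os.environ. It is not the actual epicscorelibs code; the function names and the bundled path are made up for the example.

```python
CONDA_LIBCA = "/dls_sw/apps/python/miniforge/4.10.0-0/envs/python3.10/epics/lib/linux-x86_64/libca.so"
BUNDLED_LIBCA = "/hypothetical/site-packages/epicscorelibs/lib/libca.so"  # placeholder path


def set_if_unset(environ, value):
    """Behaviour described above: an existing value wins over the new one."""
    environ.setdefault("PYEPICS_LIBCA", value)
    return environ["PYEPICS_LIBCA"]


def force_set(environ, value):
    """More forceful alternative: always overwrite whatever was there."""
    environ["PYEPICS_LIBCA"] = value
    return environ["PYEPICS_LIBCA"]


conda_env = {"PYEPICS_LIBCA": CONDA_LIBCA}
kept = set_if_unset(dict(conda_env), BUNDLED_LIBCA)    # conda path survives
overridden = force_set(dict(conda_env), BUNDLED_LIBCA)  # bundled path wins
```

With the first behaviour pyepics keeps loading the conda libca while anything importing epicscorelibs directly loads the bundled one, which matches the two-library situation found above.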

coretl commented 1 month ago

bump

tacaswell commented 1 month ago

I think we should go the other way and rely on the conda-packaged epics libs when in conda-land.

coretl commented 1 month ago

Ok how about this as a strategy:

The only thing I'm not sure about is whether to make epicscorelibs a dependency of aioca in pip, or whether to make people type pip install aioca[epicscorelibs] to get the bundled libs...

mrakitin commented 1 month ago

Sounds good to me. I think pip install aioca[epicscorelibs] is a bit more flexible to support libs injection from conda and override from pip if needed.
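
With an extras-based install, code that wants the bundled libraries would need to cope with epicscorelibs being absent. A minimal sketch of such a check, assuming nothing about how aioca would actually implement it; `bundled_libs_available` is a hypothetical name.

```python
import importlib.util


def bundled_libs_available(module_name="epicscorelibs"):
    """Return True if the named top-level package can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None
```

In a conda environment without the pip extra this would return False, and the caller could then fall back to the conda-packaged EPICS libraries.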

coretl commented 1 month ago

Can you make pip install aioca install epicscorelibs and conda install aioca not?

mrakitin commented 1 month ago

Yes, conda dependencies are managed independently, but we want epicscorelibs there from conda, right? We can have different variants of the package too, such as we did for bluesky: https://github.com/conda-forge/bluesky-feedstock/blob/main/recipe/meta.yaml#L26 (and similar to tiled).