Open bgruening opened 3 years ago
@bgruening Thanks for your interest!
First of all, the activity on our GitHub repositories is low due to the fact that the main development in Onedata happens through our self-hosted Jira+Bamboo suite, where we can run more comprehensive CI tests than on public CI platforms, and we simply push all release and develop branches daily to GitHub...
With respect to the fs-onedatafs package specifically, this is just a Python wrapper over our C++ client, so this particular repository doesn't change too often, and the PyPI package was pushed just once for the sake of publishing docs, as the fs-onedatafs package on its own is not usable: you need to have the C++ client and libraries preinstalled.
Currently, the preferred way to install is through Conda; however, I'm struggling to make it work on Python 3.8+, as the conda dependency resolver fails on the dependencies before even starting a build. As for protobuf and libtbb, they are pinned in the oneclient repository: https://github.com/onedata/oneclient/blob/release/20.02.7/conda/onedatafs/meta.yaml#L57-L64, at least since version 20.02.6. However, if you have any suggestions on how to improve our Conda packages, that would be most welcome.
Another way is installation directly from distro packages, but this is only supported at the moment for Ubuntu Bionic, Ubuntu Xenial and CentOS 7 (using Software Collections environment), however please bear in mind that this installs quite a few dependencies, as we support by default several storage systems for which we need the client libraries (e.g. Ceph, S3, XRootd, etc...):
pip3 install fs
wget http://packages.onedata.org/oneclient-2002.sh
./oneclient-2002.sh python3-onedatafs
Also, if you would like to simply test fs-onedatafs, you can start our Oneclient Docker image, where it is preinstalled:
❯ docker run --entrypoint /bin/bash -it onedata/oneclient:20.02.7
root@961e06e826b2:/tmp# python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from fs.onedatafs import OnedataFS
>>>
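The same import check can be scripted, which is handy when comparing environments (Docker image, Conda, distro packages). This is only a minimal sketch: the helper name check_module is hypothetical and not part of fs-onedatafs, and a successful import only shows that the Python wrapper and its native libraries load, not that a Onedata provider is reachable.

```python
import importlib


def check_module(name):
    """Try to import `name` and return (ok, detail).

    On success, detail is the path the module was loaded from (or "").
    On failure, detail is the ImportError message, which also covers
    native libraries that fail to load behind a Python wrapper.
    """
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return False, str(exc)
    return True, getattr(module, "__file__", None) or ""


if __name__ == "__main__":
    ok, detail = check_module("fs.onedatafs")
    print("fs.onedatafs importable:", ok)
    print("detail:", detail)
```

Run inside the container it should report a successful import; in an environment without the C++ client it reports the import error instead of raising.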
Finally, we are currently in the process of rewriting the public documentation at onedata.org, so hopefully in a few weeks the docs will be more user-friendly and up to date...
Please let us know if you have any more questions or comments.
@bgruening Thanks for your interest!
First of all, the activity on our GitHub repositories is low due to the fact that the main development in Onedata happens through our self-hosted Jira+Bamboo suite, where we can run more comprehensive CI tests than on public CI platforms, and we simply push all release and develop branches daily to GitHub...
Ah I see. Maybe that can be added to the readme?
With respect to the fs-onedatafs package specifically, this is just a Python wrapper over our C++ client, so this particular repository doesn't change too often, and the PyPI package was pushed just once for the sake of publishing docs, as the fs-onedatafs package on its own is not usable: you need to have the C++ client and libraries preinstalled.
You could create a python wheel that contains your C++ client. This way other projects could depend on your Python bindings. Those wheels can be built on public CI and pushed to PyPI with every GitHub release.
I think it is confusing for users who see your package on PyPI, which is old and does not work out of the box. Maybe it is better not to offer a package then? However, I will try to convince you that a nice PyPI package is useful :)
Currently, the preferred way to install is through Conda; however, I'm struggling to make it work on Python 3.8+, as the conda dependency resolver fails on the dependencies before even starting a build. As for protobuf and libtbb, they are pinned in the oneclient repository: https://github.com/onedata/oneclient/blob/release/20.02.7/conda/onedatafs/meta.yaml#L57-L64, at least since version 20.02.6. However, if you have any suggestions on how to improve our Conda packages, that would be most welcome.
Do you think we can migrate those packages (including the client) to conda-forge? This way we make sure that everything is consistent in the Python/conda ecosystem. For example, conda-forge pins the entire stack against a particular version of protobuf.
Another way is installation directly from distro packages, but this is only supported at the moment for Ubuntu Bionic, Ubuntu Xenial and CentOS 7
That is not really useful for us at the moment I think. For our project (galaxyproject.org) we would need PyPI packages. We could build them on our own (maybe), but we of course prefer upstream packages.
(using Software Collections environment), however, please bear in mind that this installs quite a few dependencies, as we support by default several storage systems for which we need the client libraries (e.g. Ceph, S3, XRootd, etc...):
Yeah, that's why we would like to use it and get EGI more tightly integrated into Galaxy.
pip3 install fs
wget http://packages.onedata.org/oneclient-2002.sh
./oneclient-2002.sh python3-onedatafs
Oh, that seems nice, we will try that. Any chance you can use this to create Python wheels?
Also, if you would like to simply test fs-onedatafs, you can start our Oneclient Docker image, where it is preinstalled:
❯ docker run --entrypoint /bin/bash -it onedata/oneclient:20.02.7
root@961e06e826b2:/tmp# python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from fs.onedatafs import OnedataFS
>>>
Finally, we are currently in the process of rewriting the public documentation at onedata.org, so hopefully in a few weeks the docs will be more user-friendly and up to date...
Top! Looking forward to it.
Please let us know if you have any more questions or comments.
My only questions are whether you are OK with us creating conda-forge packages for it, and whether the project is willing to support Python manylinux wheels on PyPI :)
Thanks a lot for your answer! Bjoern
@bgruening Thanks for the tips - over the next few weeks we will check whether it's possible to create a wheel package for OnedataFS and also try to port the conda recipes to conda-forge, and of course any support here would be welcome...
Awesome! Please ping me if you need any help!
@bkryza is there anything we can help with here? It would be nice to give Galaxy users read-only access to EGI OneData :)
@bgruening The main problem in our case is the substantial amount of dependencies not available in conda or conda-forge: https://github.com/onedata/oneclient/blob/develop/conda/onedatafs/meta.yaml#L19-L65
I've actually made a few attempts to fix linking with Py 3.8 on Anaconda, and also to add these dependencies as Git submodules to the project and build them that way, but so far it didn't work...
One more thing I will try this week is to use the CMake FetchContent mechanism to enable an alternative compilation of Oneclient and OnedataFS, which will download and build these dependencies (mainly Facebook and AWS libraries) as static libraries. If it succeeds, I will try to submit a conda-forge PR...
@bkryza do you have a list of those packages missing from conda-forge? I can try to get them in.
@bgruening The most critical are:
v2017.10.02.00 - it's crucial that they are exactly in this version.
Optionally, we would need storage driver libraries which allow Oneclient/OnedataFS direct access to storage when possible:
So the bottom line is: if we had the FB libraries available, or were able to build them in place while building our packages, we could provide a first version with direct access only to selected storages covered by the available libraries, and then add new libraries if requested by users...
The question is - do you think it's viable to add the FB libraries in such old versions to conda-forge? - if not I will try to enable building them statically during compilation of our packages...
The question is - do you think it's viable to add the FB libraries in such old versions to conda-forge? - if not I will try to enable building them statically during compilation of our packages...
Old libraries might indeed be a problem :( Not sure what can of worms this opens up.
@bkryza as a general question, is it possible to deactivate certain features at compile time? We could leave the Facebook stuff out of the first version and emit a run-time warning if people try to use it. Not ideal, but it brings us forward.
@bgruening We can disable support for different storages using CMake flags. Unfortunately, the FB C++ libraries are critical, as our core async code is built around them; they change their API quite often, and we don't have time to update the code every time...
I will try to work around it this week by adding a CMake flag which will fetch and build them during compilation...
Upps, I see. Thanks a lot @bkryza!
@bkryza any update here? We will ship the next Galaxy release unfortunately again without OneData support :(
@bgruening (CC: @luman75)
Actually, there has been some progress: we've managed to get the latest stable version - 20.02.15 - to build and install (but only for Python 3.9). Most of the dependencies are now from conda-forge, but a few are still needed from our onedata channel. I've just tested it on a fresh Miniconda install to be sure, and it works, assuming your .condarc looks like this:
channels:
- conda-forge
- onedata
- defaults
then you can install fs.onedatafs and oneclient:
conda install fs.onedatafs=20.02.15
conda install oneclient=20.02.15
You can see the deps for oneclient here:
and for fs.onedatafs here:
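Since the install depends on the channel order in .condarc, a quick check like the following can confirm a file matches the layout shown above before running conda install. It is a minimal sketch, not part of Onedata or conda: read_channels is a hypothetical helper that only understands the flat list layout from this comment, and a real tool would use a proper YAML parser.

```python
from typing import List

# Channel order from the .condarc shown in this thread.
EXPECTED_CHANNELS = ["conda-forge", "onedata", "defaults"]


def read_channels(condarc_text: str) -> List[str]:
    """Return the entries of the top-level `channels:` list, in order.

    Deliberately minimal: handles only a flat block list, not general YAML.
    """
    channels = []
    in_channels = False
    for raw in condarc_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "channels:":
            in_channels = True
        elif in_channels and line.startswith("- "):
            channels.append(line[2:].strip())
        elif in_channels:
            break  # any other key ends the channels block
    return channels


SAMPLE = """\
channels:
- conda-forge
- onedata
- defaults
"""

if __name__ == "__main__":
    found = read_channels(SAMPLE)
    print("channels:", found)
    print("matches expected order:", found == EXPECTED_CHANNELS)
```

To check a real configuration, the text of ~/.condarc can be passed in place of SAMPLE.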
Hi,
we would like to use and push the usage of onedata in our community. For that, we wanted to integrate fs.onedatafs into our system. However, it is not clear what the preferred way is to install this library.
Installing it via PyPI leads to the following error on python3.8 but also on python3.7.
Is python>3.7 supported? The PyPI release is really outdated and the GitHub release is much newer. Is there a reason? We tried the latest github release but we got the same error.
Next, we tried the anaconda release, but this does not work because you don't pin protobuf or libtbb, so this release is also not usable out of the box. I'm part of the conda-forge and bioconda community, so we could help with the correct setup of conda builds if you like.
My general concern is that this project does not seem to be very active :( no issues, no PRs, the master branch is not updated, and there is no CI infrastructure that runs tests. Is there any other preferred way to access OneData via Python that I missed?
Thanks for OneData, we like it a lot and would like to integrate with it more tightly, but we would need a Python library for that. Bjoern