Open fabcor-maxiv opened 8 months ago
Hey @fabcor-maxiv, is this still draft ?
@marcus-oscarsson Yes, this led me to uncover some more inconsistencies that I need to find a solution for. And as you can see the pytest workflow fails (for not entirely clear reasons). In short, having both conda and Poetry is problematic, at least in my opinion. Obviously it does not seem to be blocking anyone from their work, so I want to take my time with this.
@fabcor-maxiv the issue I think is this Cannot install python-ldap.
and I think its because the system libraries for ldap is missing. Its basically these libldap2-dev libsasl2-dev slapd ldap-utils
.
@fabcor-maxiv the issue I think is this
Cannot install python-ldap.
and I think its because the system libraries for ldap is missing. Its basically theselibldap2-dev libsasl2-dev slapd ldap-utils
.
@marcus-oscarsson
The way I understand this is as follows:
python-ldap
(and as far as I can tell, it is the only dependency in this case, there are wheels for everything else we depend on). So python-ldap
needs to be built locally, which requires a compilation step, which requires a compiler, system libraries, and headers.python-ldap
is installed, we can let Poetry do the rest. But...python-ldap
installed by conda:
python-ldap
in conda-environment-dev.yml
and in poetry.lock
, which I fixed in https://github.com/mxcube/mxcubecore/pull/851python-ldap
it is very likely an issue with a lot of conda packages.Which means that pre-installing python-ldap
with conda is probably useless, and we could even get rid of conda entirely since I am not sure why we need conda for if not for python-ldap
.
It seems like you've understood the situation very well.
We used to use apt-get
or similar to install the libraries required by ldap
, to make things a bit easier to install we then added these into the conda environment which have been working well with pip
and I have not noticed any issue with poetry
but thats maybe due to some other fortunate circumstance.
It would of course be very nice if we could remove the conda environment altogether. Looking at the error message from the test run it looks like the issue still is the installation of python-ldap
.
Why do you think its right that poetry
ignores the python-ldap
installed by conda
I guess we could make python-ldap
optional and the sites use LDAP have to make sure that the right libraries are in place before installing.
I may have misunderstood things, but my impression was that when using Conda, Poetry, too, installed its code in the Conda environment. If Conda is removed, where will Poetry then install? Hopefully not in the main OS - it would be a real mess to have new installations override previous ones, aprticularly when you are testing different branches on the same machine.
I may have misunderstood things, but my impression was that when using Conda, Poetry, too, installed its code in the Conda environment. If Conda is removed, where will Poetry then install? Hopefully not in the main OS - it would be a real mess to have new installations override previous ones, aprticularly when you are testing different branches on the same machine.
By default Poetry always uses a virtual environment. If Poetry can not find an active environment (be it a conda, venv, virtualenv, or anything else), then it will create one. When you create the conda environment, activate this conda enviroment, then run a Poetry command such as poetry install
, then Poetry does install the packages in the conda environment.
The logic of managing environments is different with Poetry than with conda. With Poetry alone, I am not sure how I would handle one virtual environment per git branch. It is probably possible but not straightforward. It is not something I do or want to do so I have never investigated it.
Why do you think its right that
poetry
ignores thepython-ldap
installed by conda
Documented in these tickets:
We used to use
apt-get
or similar to install the libraries required byldap
, to make things a bit easier to install we then added these into the conda environment which have been working well withpip
and I have not noticed any issue withpoetry
but thats maybe due to some other fortunate circumstance.
It is likely that you still have the (apt-get
) system dependencies installed on your system(s), or you have the built wheels of python-ldap
in some cache somewhere.
I guess we could make
python-ldap
optional and the sites use LDAP have to make sure that the right libraries are in place before installing.
Oh, right, LDAP itself might not even be used by all facilities. But that is a different topic, kind of orthogonal, and we find ourselves at the intersection of both. Even if we were to get rid of python-ldap
entirely, I believe it would still be better to sort out the conda vs. Poetry topic.
It would of course be very nice if we could remove the conda environment altogether.
Everyone should be free to use conda if they want to (we will likely keep using conda for our deployments at MAXIV), but I do not think it is strictly necessary to have our (official, documented) installation procedures mention conda. We should at least make sure that everything works fine without conda and our documentation should reflect that.
But...
We could also say clearly that: "yes we want to keep going with conda for reasons X and Y". I would be fine with that, and I have some ideas on how to make it work with Poetry, but for sure it would add a bit of complexity, and so far I do not know of any such X or Y reason.
For now my understanding is that we only get the impression of things going well but they do not actually go well as can be seen in this failing job where Poetry tries to install python-ldap
and fails to do so, even though we had already installed python-ldap
in the conda environment.
Seems like you did a fair amount of digging :), very nice.
So the issue is the python-ldap
package and seemingly some routine within the conda build and not that its installed by conda per say. I misinterpreted what you posted as that poetry rightfully ignores installed conda packages, which I found a bit strange.
For the rest I completely agree, making python-ldap
optional is not a solution to the issue and I also rather see it solved. Removing the conda environment yml would make the install procedure even simpler which is always good. Then, anyone is (as today) free to use whatever virtual environment solution they prefer. We are using conda and it mostly works very well (but can sometimes give you some real trouble).
My opinion however, is that we should as a community recommend or at least suggest a virtual environment solution that works. Most developers and sites will need one and I do think its good if its documented somewhere, why not in the install section ? (or perhaps at least somewhat in a related section)
A couple more things to take into account...
Poetry is a development tool. So doing deployment in production with poetry install
is not really something one should do (it's like installing gcc and your IDE on the production machine). I am a bit uncomfortable with our (mxcubeweb) documentation suggesting this.
Also Poetry should never be installed in the same environment it is meant to be installing packages in. But currently, if one uses our conda-environment-dev.yml
then this is exactly what happens. Poetry gets installed in the conda environment and then Poetry installs mxcubecore
in that same conda environment. We end up having mxcubecore installed side by side with Poetry. This is not how it is supposed to be. In the environment there should be only mxcubecore and its runtime dependencies (including Python).
we should as a community recommend or at least suggest a virtual environment solution that works
Yes, that is what I want to do as well. It will take me some time to come up with a good solution. And also it is not my current priority. But as far as I know it is not a blocker for anyone, so it should be fine. Maybe before the next code/doc camp would be nice, though.
I don't think there is any urgency to this it seems to work quite alright as it is even if its not ideal. Anytime you have to spend on this is very much appreciated :)
Do we mention something about how to install/deploy in a production environment ?. We talked about it but, the documentation we have is more for development and testing not so much for deployment its so far left up to each institute to deal with. It would be great if we could come up with something though.
But just out of curiosity and its not a discussion for this PR, but how do you deploy or install a python package without pip
or poetry
.
Do we mention something about how to install/deploy in a production environment ?. We talked about it but, the documentation we have is more for development and testing not so much for deployment its so far left up to each institute to deal with. It would be great if we could come up with something though.
True. I guess what I wanted to say is that since the only documentation regarding installation that we have is for development, people might assume that it is applicable for production as well. I should try to write something about production deployment when I get to it (maybe at the code camp).
But just out of curiosity and its not a discussion for this PR, but how do you deploy or install a python package without
pip
orpoetry
.
I do not think I said pip should not be used. I would recommend pip or conda. Also, interesting fact, nowadays, pip is able to install in any environment, not necessarily the one pip itself is installed in. So we should be able to pip-install mxcube in an environment that is completely empty (not even pip), which would probably be the ideal case for production.
I see, I somehow thought that one could use poetry to install from pypi as one does with pip but it does indeed not seem to be the case, so for that matter I but an equal between the two. We have so far been using pip for our deployment so I did not even try with poetry for that purpose.
Is this one sill in progress ?
Do we mention something about how to install/deploy in a production environment ?. We talked about it but, the documentation we have is more for development and testing not so much for deployment its so far left up to each institute to deal with. It would be great if we could come up with something though.
True. I guess what I wanted to say is that since the only documentation regarding installation that we have is for development, people might assume that it is applicable for production as well. I should try to write something about production deployment when I get to it (maybe at the code camp).
Done:
Is this one sill in progress ?
I still plan to work on this. I need to find the time.
Ok, sounds good, time is a precious resource these days :)
Use conda environments for better consistency