Open KOLANICH opened 1 year ago
No dependencies should be vendored, inlined and submoduled. The code must be compatible to latest versions of each of its dependency. The mandatory features of each deps needed should be documented.
I agree with this. I've previously tried to make a conda package for symforce but wasn't successful due to various reasons. For example, I don't think we are able to use the latest fmt version (version 10) since it deprecated the ostream operator.
@chao-qu-skydio I am also interested in a conda package for symforce. Do you have a feedstock repo that you were working on that I may look at?
There is this closed PR from my previous attempt. https://github.com/conda-forge/staged-recipes/pull/21361
Generally I agree with a lot of this, but want to be very specific about what is desired here.
We currently distribute SymForce through PyPI, and not through any system package managers. We would love to also distribute SymForce through other means, and welcome contributions towards this, including Conda (#212 / https://github.com/conda-forge/staged-recipes/pull/21361), or package managers associated with various Linux distros. However, we aren't going to stop distributing through PyPI.
The standard for distribution of python wheels on Linux is PEP 600, as far as I'm aware. The PEP explains all of this and the reasoning better than I would regurgitate it here. Specifically this applies to your point (6) and handling of shared libraries.
No dependencies should be vendored
We currently vendor* exactly two dependencies that are available elsewhere, symengine
and symenginepy
(eigen_lcm
and skymarshal
are other Skydio projects that we've open-sourced with SymForce and only live in symforce/third_party
). symengine
and symenginepy
are somewhat heavily modified, SymForce does not work with stock distributed SymEngine. If we could use them unmodified, we wouldn't vendor them.
*I define vendor
as "include the source in the SymForce repo"
No dependencies should be inlined and submoduled
I'm not sure exactly what you mean here. I assume you're referring to use of FetchContent to pull in libraries that are not already installed on the system, but this is not mandatory for any of our dependencies, you're welcome to compile SymForce against installed versions of all of those things. I think maybe the one exception is METIS, but we could make that work as well. Generally, I'm not sure what the negative consequences are of this specifically.
The code must be compatible to latest versions of each of its dependency
I don't think anyone would disagree that this is desirable. Feel free to contribute compatibility with newer releases of any dependency you'd like, or call out specific released versions of dependencies that are important / blocking, otherwise we'll do this as we can / as it's needed for other reasons. The main one we're aware of is fmt
/ spdlog
, which already has a PR #326 , discussion should happen there or on a separate issue. The other one is probably SymEngine, which we'd like to upgrade, but is complicated for the reasons mentioned above and below, and isn't a compatibility issue in the same way since we are only compatible with our modified version of SymEngine, which is built and shipped with SymForce.
Primarily, I think one thing we could do here is distribute our copy of SymEngine on its own, and not build our copy of SymEngine as part of the SymForce build - it's by far the most bug-prone and convoluted part of our build. Plus, the majority of people building SymForce from source are not modifying SymEngine. The tradeoff is that it makes the build process more annoying for people who are modifying SymEngine, and requires we track and depend on the specific version of our copy of SymEngine associated with the current version of SymForce, but that's probably worth it.
A SymEngine maintainer here. What modifications of SymEngine are you referring to here? Are they upstreamable?
They are! (at least most of them) We would love to upstream them.
The "biggest" one in terms of code changes is making the type of Add::dict_
configurable, so you can use a std::map
or other container that's deterministic across compilers/toolchains/stdlib implementations, from talking with Ondrej it sounded like yall are aware of this nondeterminism but I haven't looked at upstream symengine in a while to know if that's already configurable there. That one is cbb26185b2a6d2e993bdbe145926bfa8e86e8095 in symforce.
1e5d381d5a279fc3707a4447715be881fb548f96 has other misc derivatives and functions, which are mostly upstreamable assuming yall haven't already added them. There are some minor assumptions there that we'd want to refine (e.g. for reals vs complex).
And then there are some other assorted changes, the full commit log for git log third_party/symengine*
(symengine and symenginepy) is here:
log.txt
The "biggest" one in terms of code changes is making the type of Add::dict_ configurable, so you can use a std::map or other container that's deterministic across compilers/toolchains/stdlib implementation
We do know that they are non-deterministic. However, that should be an implementation detail and the user should not be affected by that. For example, when we print the expression we sort them, so that the user is not affected by this implementation detail. If a user is affected, I would consider that a bug. (How to fix any bug without sacrificing performance might be hard, but probably not impossible)
It's not configurable at the moment, but I don't think anyone would mind making it configurable. However, you'll have to ask distro managers to use that configure option to make symforce work with symengine in distro packages.
Other changes look good from cursory look.
We do know that they are non-deterministic. However, that should be an implementation detail and the user should not be affected by that. For example, when we print the expression we sort them, so that the user is not affected by this implementation detail. If a user is affected, I would consider that a bug. (How to fix any bug without sacrificing performance might be hard, but probably not impossible)
We were definitely seeing effects, our generated functions were different. We could try and put together a minimal example that shows this if yall are interested in digging into that? It's possible we could add sorts in the appropriate places, we didn't attempt this. It'd definitely already be a big step in the right direction for us to be able to just build upstream symengine, or even symengine with some much smaller patches, with some compile flags, even if it doesn't take us all the way to being able to use an already-distributed copy.
We won't necessarily have time soon to actually pull things out and make PRs to upstream symengine, if someone on the symengine team or anyone else is interested in doing that that would be awesome, and we're happy to assist - otherwise we'll get around to it eventually
We could try and put together a minimal example that shows this if yall are interested in digging into that?
Yes, please.
Is your feature request related to a problem? Please describe. Given the grave bugs related to shared libs building and installation (#329, #330, #332, #333 at least) and taking into account other negative consequences of bundling libraries:
it is proposed to change the way this lib is built and installed.
Describe the solution you'd like
lib*
name containing the shared lib itself and only it, andlib*-dev
(Debian convention) andlib*-devel
(RPM-based distros convention) SDK package.setup.py
builds only the cext for Python and links to the shared libs installed into the system. It DOESN'T bundle any other libs.Describe alternatives you've considered There is no alternatives other than keeping the current mess.