Closed jlopezpena closed 8 months ago
(CI lint failures are unrelated to this PR)
@Jasha10, this is a breaking change (that would also require a parallel fix to Hydra.
In the mean time users with specific antlr version needs can build their own version of OmegaConf.
How should users (like myself) do that, in a way that it works with various dependencies declaring omegaconf
in their metadata?
(To keep the comment on-topic: from the peanut gallery, looks like this is effort well spent on a short term solution while folks address the root cause)
@omry what is the root issue that needs to be fixed, and is there any way I can help with that?
I submitted the corresponding PR to hydra
(in draft state as it requires a new version of omegaconf
).
How should users (like myself) do that, in a way that it works with various dependencies declaring
omegaconf
in their metadata?
@astrojuanlu You can do that by having your own dependency repository (kind of a mirrored version of PyPi) where you make specific builds of packages catered to your needs. Then when you declare dependencies you specify your mirror as the primary place to find dependencies and PyPi as a fallback, or explicitly declare dependencies to be pulled from your mirror.
Eg with poetry
you can add something like this in your pyproject.toml
:
[[tool.poetry.source]]
name = "PyPI"
priority = "default"
[[tool.poetry.source]]
name = "your_source"
priority = "supplemental"
url = URL_TO_YOUR_MIRROR
and when you declare a dependency instead of doing
omegaconf = 2.3.0
you do something like
omegaconf = {version = "2.3.0", source = "your_source"}
The problem with this approach is if you are working on a library that needs to be used by other people in their project, because then you are forcing THEM to also do the same thing (or you to vendor the modified package source for omegaconf
and include it in the code you distribute)
The problem with this approach is if you are working on a library that needs to be used by other people in their project, because then you are forcing THEM to also do the same thing
Yeah that's what I had in mind, ideally we should be able to do this in a way that transitive dependencies don't have to cooperate. Otherwise I don't see how users can cope with the current situation effectively.
Hello, just wanted to clarify what the status of this PR is. Does the comment by @omry mean that this PR will be blocked and we will need to find another way to make omegaconf
compatible with different antlr4-python3-runtime
versions (without upgrading to antlr4-python3-runtime==4.11.*
)?
If this is the case, I would be happy to help with finding the root cause and working on a PR to address it.
I think it would be good to at least discuss @omry's points before merging.
@Jasha10, this is a breaking change
Yes, we are working on that: https://github.com/facebookresearch/hydra/pull/2733
- News fragment.
Done! (commit https://github.com/omry/omegaconf/pull/1114/commits/bc4949b9a0995495a578c7712d24df764c4618f3)
- I think it's really best if the effort is focused on fixing the root issue instead of bumping the version like this.
Any ideas for fixing the root issue? The antlr version is tightly coupled to the minor version of antlr4-python3-runtime
.
In the mean time users with specific antlr version needs can build their own version of OmegaConf.
Here is a recipe for users who need to build their own version of OmegaConf+Hydra: https://github.com/facebookresearch/hydra/issues/2491#issuecomment-1344306280
My opinion is that we should go forward with this PR.
I haven't looked at this in details, but would it be possible by any chance to make Hydra / OmegaConf compatible with multiple versions of antlr? (compiling the grammar during pip install
instead of shipping the pre-compiled files)
would it be possible by any chance to make Hydra / OmegaConf compatible with multiple versions of antlr? (compiling the grammar during
pip install
instead of shipping the pre-compiled files)
The current workflow for compiling the grammar invokes the following command:
java -jar build_helpers/bin/antlr-4.9.3-complete.jar -Dlanguage=Python3 -o omegaconf/grammar/gen -Xexact-output-dir -visitor omegaconf/grammar/grammar
This requires the antlr .jar
to be present when the grammar is compiled.
If we compile the grammar during pip install
then I think we would have to either (1) ship multiple antlr .jar files and select one of them at install time, or (2) expose an option for the user to provide their own antlr dependency.
Option (2) has been suggested before (here). I'm not sure specifically how we would expose such an option at pip install
time -- maybe via an extras option?
It just so happens that I've run across a dependency problem involving this exact issue as well. Seems like a tricky issue to solve.
I think the best solution would be to "vendor" OmegaConf with a specific version of antlr, and have Hydra depend on it. That would require some rewriting some imports in the generated code and would essentially eliminate the external dependency on antlr.
I think I outlined it at some point as one of the options for a permanent fix for this issue.
Could an option be to replace the anttrl4
java binary by antlr-tools?
Could an option be to replace the
anttrl4
java binary by antlr-tools?
I don't see how that would do anything. The antlr java binary is just a matter of convenience during the build of OmegaConf. The problem is that the antlr Python runtime only works with generated code that is of the exact same version used to generate the parser.
I see, so the binary dependency is not runtime, but build-time only. I can think of two potential solutions to the bigger issue then.
One would be to build the grammars at install time. The downside of that is that the antlr binary would need to be downloaded at install time (probably through antlr-tools to avoid needing a system wide install of java). This would add size and time to install, as well as require an active internet collection.
The second approach would be to build multiple versions of the grammars at build time (through multiple versions of the antlr binary), and select the grammar at runtime based on the installed version of antlr4-python3-runtime
. This would effectively decouple the build dependency from run time, at the cost of having additional grammar files distributed with the wheel and needing the (slightly) more complex logic for dispatching the correct version.
I believe the second option would be nicer from the user's point of view, with the tradeoff of having a more complex build process (which might or might not be acceptable for omegacong
/hydra
maintainers)
I think @omry's suggestion of shipping a specific version of antlr4-python3-runtime
within OmegaConf may be the most robust approach.
I see, so the binary dependency is not runtime, but build-time only.
Not quite. The antlr runtime is checking that it's working with code generated by a generator from the exact same version. This is a problematic design choice, as in reality antlr is very stable and they could be much laxer with this check - but I digress.
Other low-level libraries, like pip itself, solves the problem of dependency conflicts by optionally including their dependencies in under a different module. They call this vendoring (pip page about it). The idea is that by including your own version of the dependency, you isolate your dependencies from the dependencies of userland apps and eliminating these kinds of issues.
The complication is that Hydra should probably follow whatever OmegaConf is doing here.
I understand the benefit of vendoring, the downside of that is that by vendoring a third party pakage you are left with the burden of incorporating security updates and whatnot yourself - which might or might not be a big ordeal depending on how often there are changes that need to be incorporated, and how complex updating the vendoring is (which might be as simple as running a git submodule update
or a much more complex one). That's why in my own projects I tend to shy away from vendoring and normally opt for something similar to the second option I mentioned about (this is also the approach taken in many conda recipes that require binary linking against a specific version of numpy, resulting in a different "precompiled" binary for each python/numpy/architecture combination).
As mentioned, if there is anything I can do to help solve the issue let me know, the final decision on what the right solution is of couse falls onto the core developers, as they are the ones that will be left with the maintenance burden.
Is there a final decision on what the way forward should be? We are discovering more incompatibilities with other dependencies.
Is there a final decision on what the way forward should be? We are discovering more incompatibilities with other dependencies.
I think shipping our own version will be easier overall. We just need someone to actually implement it :)
I think shipping our own version will be easier overall.
I guess you mean @omry's vendoring idea from https://github.com/omry/omegaconf/pull/1114#issuecomment-1699656976
We just need someone to actually implement it :)
👍🏽 Thank you!
I can change the PR to vendor antlr4-python3-runtime
if that is the agreed approach, but I will not commit to maintain a vendored dependency.
For whatever is worth, I don't think vendoring is the right solution. It is an anti-pattern and introduces a significant burden to maintenance and a lot of friction for development. IMO it is only justified for packages like pip
that need to be bootstrapped.
For whatever is worth, I don't think vendoring is the right solution. It is an anti-pattern
@jlopezpena, thank you for sharing your opinion about this. It seems you are advocating for the second approach from your comment above: building "multiple versions of the grammars at build time (through multiple versions of the antlr binary), and select the grammar at runtime based on the installed version of antlr4-python3-runtime." I am concerned that this approach would make omegaconf's build and release process more difficult.
Currently, omegaconf's build & release process is very simple (here are the full instructions from the docstring of setup.py
):
OmegaConf setup
Instructions:
# Build:
rm -rf dist/ omegaconf.egg-info/
python -m build
# Upload:
twine upload dist/*
I worry that the proposal to build multiple versions of the grammar would significantly complicate this process. Meanwhile, the vendoring appraoch could be as "simple" as copying the antlr4-python3-runtime
source into omegaconf's source tree, allowing us to keep the same build procedure. Does this make sense to you?
by vendoring a third party pakage you are left with the burden of incorporating security updates and whatnot yourself - which might or might not be a big ordeal depending on how often there are changes that need to be incorporated.
Note that, with our current approach, we have not been updating our pin on the antlr4 runtime very frequently. Likewise, I do not anticipate that we would update a vendored version of the antlr4 runtime frequently.
Meanwhile, the vendoring appraoch could be as "simple" as copying the antlr4-python3-runtime source into omegaconf's source tree, allowing us to keep the same build procedure. Does this make sense to you?
I don't think it will be that simple. It will also require editing the source of all the vendored antlr4-python3-runtime
files to fix the import locations, otherwise if a user has an existing antlr4-python3-runtime
installed (which is but a certainty given the reason many people are asking to update this is due to a dependency clash with other packages that use antlr) any internal imports from the vendored package might end up pulling (some) code from the user's antlr4 instead of the vendored one.
If done "manually" then any potential update will require doing the same process all over; the "right" way to do this would be to set up a script that would download (a specific version of) the runtime, copy it over the desired target folder (vendor/antlr4
or whatever) and then patch all the vendored files replacing all the imports, eg every from antlr4.FOO import BAR
by from omegaconf.vendor.antlr4.FOO import BAR
, every import antlr4
to import omegaconf.vendor.antlr4 as antlr4
, etc etc, and ideally making sure the antlr4
tests still run after the patching, and potentially adding some additional integration tests that verify that nothing goes wrong when there is a different runtime version in user space that doesn't agree with the vendored one. Usually I'd expect having to go over the results of this process and do some additional tweaking to make things work, depending on how convoluted the code is.
Nothing in this process is particularly challenging or ground-breaking, but it normally is a tedious and error prone process
I created a helper script get_vendored
that will do the following:
omegaconf/vendor
folder (apart from init file, requirements file, and readme)omegaconf/vendor/vendor.txt
inside the omegaconf/vendor
folderbin
folder)import bar -> import foo.vendor.bar
from bar -> from foo.vendor.bar
Running this script from top level (eg python build_helpers/get_vendored.py
will get the source for whatever antlr4 version is specified in vendor.txt
, and apply the transformations mentioned above.
Still to be done:
omegaconf
to use the vendored antlr4
omegaconf
still worksantlr4
installedantlr4
source files to the repo or get them downloaded and modified through the script at build time@Jasha10 what is your opinion on the last one? I normally prefer not committing automatically generated code, but running the script at build time will require pip
as well as network access, which could make it hard to run in sandboxed environments, so I think we could push the sources as well (with a big warning in the README.txt` file that those files get automatically created and modified by a script and should not be hand-edited)
Test suite passes for me locally with vendored antlr4
and the patch to the parser generation class:
To get CI to run properly we need to decide whether we do the vendoring as part of the build or as a manual step that gets committed.
Did a tentative commit pushing the vendored files to be able to test CI. There are many checks failing, mostly due to import errors that seem to come from a mix of absolute and relative imports within the codebase of omegaconf
, which doesn't play nicely with the vendoring. I have tried some things with mixed results but didn't find a proper solution. Won't be able to work further on this for a day or two, in case someone wants to take over in the meantime
Thanks @jlopezpena, unfortunately no time for me to work on OmegaConf this week.
To get CI to run properly we need to decide whether we do the vendoring as part of the build or as a manual step that gets committed.
I would just do a manual commit, seems simpler that way.
Decide if we want to commit the antlr4 source files to the repo or get them downloaded and modified through the script at build time @Jasha10 what is your opinion on the last one?
Thanks again for all of the discussion and for working on the lead with implementation, @jlopezpena. I agree with @odelalleau about committing the antlr4 source files to the repo.
I think this is pretty much done. There are a bunch of flake8
errors in tests, but those seem to have nothing to do with this PR, and should be fixed somewhere else. @odelalleau , this seems related to https://github.com/omry/omegaconf/pull/1125
To minimise conflicts, I made all the imports in the vendored library relative (with some changes to the regexes in the get_vendored
script to make that work).
I also excluded all the vendored files from CI checks like coverage, linting, mypy, etc, as those checks are not enforced in antlr4
and fixing the sources in here would make them diverge from the original for no real benefit.
@Jasha10 a new review would be a good idea, as this is now too different from the originally approved PR
There are a bunch of flake8 errors in tests
Could you please rebase this PR against a recent version of the master
branch?
I believe those flake8 errors are fixed by https://github.com/omry/omegaconf/pull/1109, which was merged into master
after you opened this PR.
All green after rebasing!
For reviewing purposes I recommend ignoring everything under omegaconf/vendor/antlr4
, and check the rest of the files. Deleting the antlr4
folder and running the build_helpers/get_vendored.py
script should recreate everything on that folder with no changes from what is on this PR
Is there a way to tell CodeQL to ignore the omegaconf/vendor
folder?
Not sure -- I don't know a lot about how to configure CodeQL. Anyway, I've gone through and manually dismissed the CodeQL alerts related to the vendor folder.
I feel there isn't much else to do with this PR apart from merging, but there are some items that should be discussed with the view of moving this forward and unblock the dependency lockdown (DVC now includes hydra, so I am getting hit by this even in projects where I don't explicitly use hydra myself now).
The vendoring will take care of ensuring that the correct runtime will be used to load the grammars, but there are still some steps that would be either nice to have or required to move forward:
antlr4
binary jar file is added to the repo manually. One needs to make sure that the binary file present agrees with the runtime version listed in vendor.txt
in this PR. Ideally, the get_vendored
script would also pull the jar binary file to ensure the binary and runtime versions cannot get out of sync. Some inspiration for that can be taken from antlr4-tools (probably not worth pulling the entire dependency)omegaconf
and hydra
some further work needs to be done in both projects. hydra
also needs to run the antlr4 binary at build time to create its parsers, and the runtime used needs to agree with the version used there AND the version used in omegaconf
. A potential solution here is to include the antlr4 jar file in the omegaconf
wheel (maybe not exposing it directly, but providing a script entrypoint for calling it), and let hydra use that script/binary instead of its own antlr4 binary, and then leverage omegaconf's vendored runtime instead of adding the dependency again.For the second point, the proposed solution probably entails making omegaconf
a hard-versioned dependency of hydra
. Going forward this will mean that each time omegaconf
gets an update that touches antlr4
, a new version of hydra
pinning the new release as a minimum version will need to be done as well. Because changes in antlr4 binary/runtime will be API breaking (-ish) for hydra
, to ensure good semantic versioning omegaconf
might want to adopt a policy requiring that any changes to the antlr binary/runtime must be released with a major version bump (i.e. the current PR should be released as omegaconf
3.0, not 2.4).
I believe hydra and omegaconf versions are heavily interlocked already, so hopefully this will not bring much of an additional maintenance burden (albeit whenever vendoring some pain is unavoidable).
PS: Adding the antlr binary to the omegaconf wheel will increase wheel size by about 10MB, and the binary will be unsuable unless a proper Java runtime is present. The increased wheel size might be an inconvenience for third-party projects that depend on hydra/omegaconf so perhaps it would be a good idea to make it into an optional dependency that third party libraries can opt-out of if desired (hydra would still need to pull the full wheel at build time)
Because changes in antlr4 binary/runtime will be API breaking (-ish) for
hydra
, to ensure good semantic versioningomegaconf
might want to adopt a policy requiring that any changes to the antlr binary/runtime must be released with a major version bump (i.e. the current PR should be released asomegaconf
3.0, not 2.4).I believe hydra and omegaconf versions are heavily interlocked already, so hopefully this will not bring much of an additional maintenance burden (albeit whenever vendoring some pain is unavoidable).
Yeah that makes sense, and it should be fine. Note however that:
what is the plan to address this in Hydra?
In Hydra we can change import antlr4
to from omegaconf.vendor import antlr4
.
what is the plan to address this in Hydra?
In Hydra we can change
import antlr4
tofrom omegaconf.vendor import antlr4
.
There is still a potential issue in that we need to make sure the binary antlr jar in hydra needs to be the same version as the vendored runtime in omegaconf (but AFAIK that is already the case anyway)
what is the plan to address this in Hydra?
In Hydra we can change
import antlr4
tofrom omegaconf.vendor import antlr4
.There is still a potential issue in that we need to make sure the binary antlr jar in hydra needs to be the same version as the vendored runtime in omegaconf (but AFAIK that is already the case anyway)
The build of Hydra would need to change similarly to the build of OmegaConf to rewrite the generated code as well.
The build of Hydra would need to change similarly to the build of OmegaConf to rewrite the generated code as well.
I can either port the regexes that to the replacements to the grammars (the fix_import
method I added) to hydra, or refactor the one in omegaconf
and leverage that from hydra
; since all the replacements are done programatically, implementation will be easy either way. I will wait to update the hydra
PR until this one settles, to avoid having to work any potential changes across both repos.
What I don't have a good solution for is making sure the antlr4 java binary (which is used at build time, but not distributed in the wheel) stays the same for both packages. The only way I can think of enforcing that would be to download the jar
file at build time, which comes with its whole set of complications and downsides.
But since this issue is already present anyway, and this PR is not making it any worse than it already is, I'd rather leave that for future improvements
Sounds good. For simplicity, I think you can copy the code. No need to include the build code related code with OmegaConf for that purpose. The way to control OmegaConf dependency in Hydra (and by extension, it's dependencies) is by pretty strict version dependency of OmegaConf from Hydra.
https://github.com/facebookresearch/hydra/blob/main/requirements/requirements.txt#L1
Hey folks! Thanks for the upgrade and vendoring! I'm running into a similar issue to hyrda on a seperate project with antlr dependencies.
I've been mostly successfully using this branch today, however I've hit a few snags trying to install it.
Mainly, if I use anything other than an editable installation, I get this import error:
ModuleNotFoundError: No module named 'omegaconf.vendor.antlr4.error'
I've tried:
python setup.py build
+ pip install .
python setup.py sdist
+ pip install dist/omegaconf-2.4.0.dev1.tar.gz
python setup.py antlr
+ pip install -e .
(with only this option succeeding)I'm not sure I've successfully emulated what will happen during an actual CI build, but I have checked and confirms that the subdirectories of the omegaconf/vendor
we not present in the install.
I didn't see them in the checks, but is there a package I could manually install from CI while still on this branch?
Thanks!
I don't know if it's related but note that direct setup.py
invokation on the CLI is deprecated, see https://packaging.python.org/en/latest/discussions/setup-py-deprecated/#what-commands-should-be-used-instead
~Ah, I think the setup.py
might be using explicit package listing instead of auto-discovery. I will add the vendored folder to the package list, see if that helps~ Scratch that, I had already added "omegaconf.vendor"
anf "omegaconf.vendor.antlr4"
to the package list in setup.py
, so they should be discoverable
I can reproduce the problem. The files in vendor/antlr4
are getting copied over, but folders inside that are not. Will try to work on a fix later today or tomorrow. Thanks for flagging it up @mawildoer !
@mawildoer the non-editable installs should work now, could you give it a check?
@Jasha10, @omry it looks like CI is only testing editable installs, perhaps a non-editable scenario should be added as well (in a separate PR)
I don't know if it's related but note that direct
setup.py
invokation on the CLI is deprecated, see https://packaging.python.org/en/latest/discussions/setup-py-deprecated/#what-commands-should-be-used-instead
TIL; those are the old-school alias for the -e
we know and love! I've only ever used "speciality" setup.py
commands once before, for something more analogous to the antlr
case there (eg. doing something special). I just saw them this time in the setup.py
and figured it was worth giving a whirl, since other things weren't working yet. Thanks for the comment!
@mawildoer the non-editable installs should work now, could you give it a check?
Yessir! Tested both pip install -e .
and pip install .
. Both working 🎉 w/ full IDE/pyright/pylint etc... finding them too.
Thanks for the quick response! I'll have to package and distribute your branch if we can't get this back to mainline soon 😅
Thanks for all the work!
@jlopezpena I've just uploaded omegaconf dev release 2.4.0.dev1
to pypi based on this PR branch. This should be useful so there's something to pin against when working on the Hydra PR.
https://pypi.org/project/omegaconf/2.4.0.dev1/
pip install omegaconf==2.4.0.dev1
@jlopezpena I've just uploaded omegaconf dev release
2.4.0.dev1
to pypi based on this PR branch. This should be useful so there's something to pin against when working on the Hydra PR.
@Jasha10 excellent, thank you very much! I made a commit to the hydra PR including the regex import fixing for the generated parsers, will change it to use the 2.4.0.dev1
version of omegaconf
and start testing on Monday.
Bumping
antlr
binary andantlr4-python3-runtime
in order to unlock doing the same thing inhydra
, following this discussion