hedronvision / bazel-compile-commands-extractor

Goal: Enable awesome tooling for Bazel users of the C language family.

ModuleNotFoundError: No module named 'orjson.orjson' #165

Open mvukov opened 7 months ago

mvukov commented 7 months ago

https://github.com/mvukov/optimus/pull/81:

This is what I get on my local machine w/ Ubuntu 22.04 (I also got the same error on another machine w/ the same OS).

bazel run //:refresh_compile_commands
INFO: Analyzed target //:refresh_compile_commands (77 packages loaded, 619 targets configured).
INFO: Found 1 target...
Target //:refresh_compile_commands up-to-date:
  bazel-bin/_refresh_compile_commands
  bazel-bin/refresh_compile_commands.py
INFO: Elapsed time: 0.698s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/refresh_compile_commands
Traceback (most recent call last):
  File "/home/user/.cache/bazel/_bazel_user/bc887fb8be44eda01eb72d2d7b5a5d13/execroot/_main/bazel-out/k8-fastbuild-ST-647a15afa184/bin/refresh_compile_commands.runfiles/_main/refresh_compile_commands.py", line 20, in <module>
    import orjson # orjson is much faster than the standard library's json module (1.9 seconds vs 6.6 seconds for a ~140 MB file). See https://github.com/hedronvision/bazel-compile-commands-extractor/pull/118
  File "/home/user/.cache/bazel/_bazel_user/bc887fb8be44eda01eb72d2d7b5a5d13/execroot/_main/bazel-out/k8-fastbuild-ST-647a15afa184/bin/refresh_compile_commands.runfiles/hedron_compile_commands_pip_orjson/site-packages/orjson/__init__.py", line 3, in <module>
    from .orjson import *
ModuleNotFoundError: No module named 'orjson.orjson'
btalb commented 7 months ago

I'm also seeing this on 22.04

btalb commented 7 months ago

Hack to use json instead to get around this until fixed:

bazel_dep(name = "hedron_compile_commands", dev_dependency = True)
git_override(
  module_name = "hedron_compile_commands",
  remote = "https://github.com/hedronvision/bazel-compile-commands-extractor",
  commit = "09fee2fd2082b17baa9582ca2b2d63fe19caa294",
  patch_strip = 1,
  patches = ["hedron.patch"],
)

where the patch file is:

diff --git a/refresh.template.py b/refresh.template.py
index 359c9bf..5d02ffa 100644
--- a/refresh.template.py
+++ b/refresh.template.py
@@ -17,7 +17,7 @@ import functools
 import itertools
 import json
 import locale
-import orjson # orjson is much faster than the standard library's json module (1.9 seconds vs 6.6 seconds for a ~140 MB file). See https://github.com/hedronvision/bazel-compile-commands-extractor/pull/118
+import json as orjson # orjson is much faster than the standard library's json module (1.9 seconds vs 6.6 seconds for a ~140 MB file). See https://github.com/hedronvision/bazel-compile-commands-extractor/pull/118
 import os
 import pathlib
 import re
@@ -572,8 +572,8 @@ def _get_headers(compile_action, source_path: str):
         with open(cache_file_path, 'wb') as cache_file:
             cache_file.write(orjson.dumps(
                 (compile_action.actionKey, list(headers)),
-                option=orjson.OPT_INDENT_2,
-            ))
+                indent=2,
+            ).encode())
     elif not headers and cached_headers: # If we failed to get headers, we'll fall back on a stale cache.
         headers = set(cached_headers)

@@ -1392,5 +1392,5 @@ if __name__ == '__main__':
     with open('compile_commands.json', 'wb') as output_file:
         output_file.write(orjson.dumps(
             compile_command_entries,
-            option=orjson.OPT_INDENT_2,
-        ))
+            indent=2,
+        ).encode())
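The reason the patch adds `.encode()` is that `orjson.dumps` returns `bytes` while the standard library's `json.dumps` returns `str`, and both files are opened in `'wb'` mode. A minimal sketch of the same idea as a runtime fallback rather than a patch; the helper name `dumps_indented` is made up for illustration and is not part of the tool:

```python
import json

try:
    import orjson  # third-party; may fail to import, as in the traceback above
except ImportError:  # ModuleNotFoundError is a subclass of ImportError
    orjson = None


def dumps_indented(obj) -> bytes:
    """Serialize obj to 2-space-indented JSON bytes (hypothetical helper)."""
    if orjson is not None:
        return orjson.dumps(obj, option=orjson.OPT_INDENT_2)
    # json.dumps returns str, so encode it to match orjson's bytes output.
    return json.dumps(obj, indent=2).encode()


if __name__ == "__main__":
    print(dumps_indented({"file": "main.cpp"}).decode())
```

The fallback path is slower on large files, per the timing comment in the patched import line, but it keeps the script usable when the orjson import is broken.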
cpsauer commented 7 months ago

Phew, any idea what's going wrong under the hood, guys? (I'm not succeeding in reproducing on macOS, which is what I have in front of me at the moment.)

Does orjson otherwise work on your machine? E.g.

pip install orjson 
python
import orjson

I'm trying to narrow down if it's orjson or Bazel's rules_python.
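The same check can be scripted so the output is easy to paste into a report. This is only a sketch: the function name `describe_import` is made up for illustration, and it reports on both the `orjson` package and its compiled submodule `orjson.orjson`, which is the name the traceback complains about.

```python
import importlib
import importlib.util
import sys


def describe_import(name: str) -> str:
    """Return a one-line report on whether `name` can be imported."""
    try:
        # For dotted names, find_spec imports the parent package first,
        # so a broken parent surfaces here as an ImportError.
        spec = importlib.util.find_spec(name)
    except ImportError as exc:
        return f"{name}: lookup failed ({exc})"
    if spec is None:
        return f"{name}: not found"
    try:
        importlib.import_module(name)
    except ImportError as exc:
        return f"{name}: found at {spec.origin}, but import failed ({exc})"
    return f"{name}: OK ({spec.origin})"


if __name__ == "__main__":
    print(sys.version)
    print(sys.executable)
    for module_name in ("orjson", "orjson.orjson"):
        print(describe_import(module_name))
```

Running it once with the system interpreter and once from inside a `py_binary` would show whether the two environments disagree.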

I assume we're all on the latest version of this tool, and if via workspace, could I ask you to check that you've added the additional transitive import lines (all 4 🙄) and that it's at the top, just so some other tool isn't bringing in an old version of rules_python?

cpsauer commented 7 months ago

I'm so sorry this has caused such a headache; we tried to switch to Bazel's new python infrastructure to not rely on system python and to be able to use packages, but they've had quite a few bugs.

To narrow things down a bit more, here are a couple of commits to try. First, one with rules_python downgraded: https://github.com/hedronvision/bazel-compile-commands-extractor/commit/4d5671472a7272ea19dd61debf1e64d8aed27b41. If that works, great, at least temporarily. If not, let's zip you back to https://github.com/hedronvision/bazel-compile-commands-extractor/commit/6d58fa6bf39f612304e55566fa628fd160b38177, just to make sure all was working before moving to rules_python.

btalb commented 7 months ago

Thanks for having a look @cpsauer .

Extra information from my end (not sure how much is useful or not):

I've been away from Bazel for a while, and just jumping back in now, but isn't the whole hermetic Bazel promise that things like this (i.e. it worked before but doesn't now) can't happen? Are Bazel modules / the tie in to package managers weakening this?

Let me know if there's any more investigation I can do to help from my end.

ciarand commented 7 months ago

I assume we're all on the latest version of this tool, and if via workspace, could I ask you to check that you've added the additional transitive import lines (all 4 🙄) and that it's at the top, just so some other tool isn't bringing in an old version of rules_python?

We're using Bazel 7 + the bzlmod installation instructions + @btalb's patch on top of commit 09fee2fd2082b17baa9582ca2b2d63fe19caa294, so no transitive import lines. We haven't switched our rules_python stuff to using bzlmod yet, so this tool is the only one requesting the BCR version of rules_python.

$ bazel query 'deps(//:refresh_compile_commands)'
...
@@rules_python~0.29.0//python/config_settings:_is_python_3.11
@@rules_python~0.29.0//python/config_settings:_is_python_3.11_2
@@rules_python~0.29.0//python/config_settings:_is_python_3.11_3
@@rules_python~0.29.0//python/config_settings:_is_python_3.11_4
@@rules_python~0.29.0//python/config_settings:_is_python_3.11_5
@@rules_python~0.29.0//python/config_settings:_python_version_flag_equals_3.11
@@rules_python~0.29.0//python/config_settings:is_python_3.11
@@rules_python~0.29.0//python/config_settings:is_python_3.11.1
@@rules_python~0.29.0//python/config_settings:is_python_3.11.3
@@rules_python~0.29.0//python/config_settings:is_python_3.11.4
@@rules_python~0.29.0//python/config_settings:is_python_3.11.5
@@rules_python~0.29.0//python/config_settings:is_python_3.11.6
@@rules_python~0.29.0//python/config_settings:python_version
@@rules_python~0.29.0~pip~hedron_compile_commands_pip//orjson:pkg
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:_pkg
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:pkg
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/__init__.py
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/orjson-3.9.12.dist-info/INSTALLER
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/orjson-3.9.12.dist-info/METADATA
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/orjson-3.9.12.dist-info/WHEEL
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/orjson-3.9.12.dist-info/license_files/LICENSE-APACHE
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/orjson-3.9.12.dist-info/license_files/LICENSE-MIT
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/orjson/__init__.py
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/orjson/__init__.pyi
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/orjson/orjson.cpython-311-x86_64-linux-gnu.so
@@rules_python~0.29.0~pip~hedron_compile_commands_pip_311_orjson//:site-packages/orjson/py.typed
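The only compiled file in that listing is tagged `cpython-311`, so it would only be importable by a CPython 3.11 interpreter on x86_64 Linux. Whether a different interpreter is actually running the script is not established in this thread; that is an assumption worth ruling out. A minimal sketch of the filename check, with the made-up helper name `loadable_as`:

```python
from importlib.machinery import EXTENSION_SUFFIXES


def loadable_as(filename: str, module: str, suffixes=EXTENSION_SUFFIXES) -> bool:
    """True if `filename` is a name the interpreter accepts for extension `module`.

    The import system looks for the module name followed by one of the
    interpreter's extension suffixes, so the match must be exact.
    """
    return any(filename == module + suffix for suffix in suffixes)


if __name__ == "__main__":
    # Filename taken from the bazel query output above.
    name = "orjson.cpython-311-x86_64-linux-gnu.so"
    print(EXTENSION_SUFFIXES)
    print(loadable_as(name, "orjson"))
```

If this prints `False` when run by the interpreter that produced the traceback, the import system would skip the `.so` and report `No module named 'orjson.orjson'`.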
cpsauer commented 7 months ago

Re hermetic preventing this: yes, it should, but rules_python is definitely causing a lot of issues. I think the move is to report these issues to rules_python, revert our usage of rules_python, and then return to rules_python when they've gotten things more usable. Sorry for all the headache here.

Could I ask for one more round of your help here @btalb? Would you be down to report to rules_python, linking this issue and tagging me? They might ask to verify that your env was working with Python 3.11, so maybe worth a quick double check there, too.

The revert, I'll work on...

cpsauer commented 7 months ago

Reverted rules_python in https://github.com/hedronvision/bazel-compile-commands-extractor/commit/0b821b7e4286aec887757461366f6eaaa0972cb9. Tracking restoration in https://github.com/hedronvision/bazel-compile-commands-extractor/issues/168.

That avoids this for now, but seriously, it's worth our working with them to get it fixed; this'll be important in the future.

(Just holler if you want me to reopen.)

Thanks again for helping leave things better than you found them!

btalb commented 7 months ago

Not sure if this is the result you were looking for @cpsauer ; but I just tried to do a repro for posting to rules_python and it seemed to work:

ben@ben-meshify:~/tmp/rules_python$ bazel run //:hello_orjson
Starting local Bazel server and connecting to it...
INFO: Analyzed target //:hello_orjson (78 packages loaded, 2966 targets configured).
INFO: Found 1 target...
Target //:hello_orjson up-to-date:
  bazel-bin/hello_orjson
INFO: Elapsed time: 5.651s, Critical Path: 0.21s
INFO: 5 processes: 5 internal.
INFO: Build completed successfully, 5 total actions
INFO: Running command line: bazel-bin/hello_orjson
/home/ben/.cache/bazel/_bazel_ben/4ebdc7afb60c738c86a803f747cca1bd/execroot/_main/bazel-out/k8-fastbuild/bin/hello_orjson.runfiles/rules_python~0.29.0~pip~orjson_not_found_pip_311_orjson/site-packages/orjson/__init__.py

All the files:

ben@ben-meshify:~/tmp/rules_python$ ls -al
total 228
...
-rw-rw-r--  1 ben ben    172 Feb  5 06:12 BUILD.bazel
-rw-rw-r--  1 ben ben     38 Feb  5 06:10 hello_orjson.py
-rw-rw-r--  1 ben ben    486 Feb  5 06:05 MODULE.bazel
-rw-rw-r--  1 ben ben 189327 Feb  5 06:12 MODULE.bazel.lock
-rw-rw-r--  1 ben ben     15 Feb  5 06:05 requirements.txt
ben@ben-meshify:~/tmp/rules_python$ cat MODULE.bazel
module(name = "orjson_not_found")

bazel_dep(name = "rules_python", version = "0.29.0")
python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    python_version = "3.11",
)
use_repo(python, "python_3_11")
pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
    hub_name = "orjson_not_found_pip",
    python_version = "3.11",
    requirements_lock = "//:requirements.txt",
)
use_repo(pip, "orjson_not_found_pip")
ben@ben-meshify:~/tmp/rules_python$ cat BUILD.bazel
load("@orjson_not_found_pip//:requirements.bzl", "requirement")

py_binary(
    name = "hello_orjson",
    srcs = ["hello_orjson.py"],
    deps = [requirement("orjson")]
)
ben@ben-meshify:~/tmp/rules_python$ cat requirements.txt 
orjson==3.9.12
ben@ben-meshify:~/tmp/rules_python$ cat hello_orjson.py 
import orjson

print(orjson.__file__)
cpsauer commented 6 months ago

Super appreciate your investigating--that definitely narrows things down considerably. Any idea what might be different between the cases? Perhaps the failure is only if rules_python is a transitive dependency? Would you be down to point another repo at this one (w/ e.g. local_path_override) and see if the issue reproduces--and otherwise try to see if you can figure out what's up?

btalb commented 6 months ago

I'll have a go at this sometime this week hopefully

cpsauer commented 6 months ago

Thanks so much @btalb. Quite valuable for helping Bazel Python work out its quirks more generally--but also so this tool (and others) can move forward for y'all in the future.

btalb commented 6 months ago

I've had another dig at this and haven't found much unfortunately:

Observations:

Unless there's something very targeted to try, I'm going to have to pause on this sorry.

cpsauer commented 6 months ago

Baffling! I really appreciate your giving it a go. Super weird because that should be a parallel invocation, right? We weren't doing anything fancy; just a raw py_binary with a single source and dependency...doesn't get much simpler of a use case than that. Edit: unless, shoot, is it that they fail with generated sources? Would you be down to quickly genrule one and see if you can trigger it that way? That is, something like the following in your BUILD:

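genrule(
    name = "generate_python",
    outs = ["hello_orjson.py"],  # Output filenames must be quoted strings.
    # Write a one-line Python source file at build time.
    cmd = """
cat > $(OUTS) <<EOF
import orjson
EOF
""",
)

py_binary(
    name = "hello_orjson",
    srcs = [":generate_python"],  # Generated source instead of a checked-in file.
    deps = [requirement("orjson")],
)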

If that does not reproduce, and you do find yourself willing to experiment a little more, I think the next step would be to grab, e.g., https://github.com/hedronvision/bazel-compile-commands-extractor/commit/c4918fabf92f9b8176e6475e5abc611f75e63d70 and cut it down until the issue goes away. That is: repro, delete the WORKSPACE, repro, delete almost all the Python except the import, repro, etc.