facebook / pyre-check

Performant type-checking for python.
https://pyre-check.org/
MIT License
6.86k stars 437 forks source link

"Worker interrupted with signal: sigill" errors when running check. #233

Open pablo-meier opened 4 years ago

pablo-meier commented 4 years ago

Hi friends!

I'm trying out pyre-check in a medium sized Python codebase (37kloc source, 34kloc tests) but the process is crashing rather opaquely on pyre --preserve-pythonpath check.

The configuration is pretty basic:

{
  "binary": "<path_to_virtualenv>/bin/pyre.bin",
  "source_directories": [
    "."
  ],
  "taint_models_path": "<path_to_virtualenv>/lib/pyre_check/taint/",
  "typeshed": "<path_to_virtualenv>/lib/pyre_check/typeshed/"
}

And running it with --debug is producing the following log lines:

2020-02-04 14:23:48,499 DEBUG Registering process with pid 78480 in pid file `<code directory>/.pyre/pid_files/check-78480.pid`
2020-02-04 14:23:49,125 PERFORMANCE Module tracker built: 0.546048s
2020-02-04 14:23:49,228 INFO Parsing 6636 stubs and sources...
2020-02-04 14:23:49,739 ERROR Could not parse file at 654:28-654:28
          l6 = lambda x, y, *, k=20: x+y+k
                             ^
2020-02-04 14:23:50,085 ERROR Could not parse file at 2:0-2:0
  € = 2
  ^
2020-02-04 14:23:50,607 ERROR Could not parse file at 7:12-7:12
              ä = 1
             ^
2020-02-04 14:23:51,133 ERROR Could not parse file at 814:32-814:32
              self.assertEqual(p.têšt(42), 42)
                                 ^
2020-02-04 14:23:51,866 ERROR Could not parse file at 206:20-206:20
              print << sys.stderr
                     ^
2020-02-04 14:23:53,282 ERROR Could not parse file at 61:14-61:14
          x = 0l
               ^
2020-02-04 14:23:53,481 ERROR Could not parse file at 156:17-156:17
          lambda *, k1=unittest: None
                  ^
2020-02-04 14:23:54,118 ERROR Could not parse file at 371:28-371:28
          l6 = lambda x, y, *, k=20: x+y+k
                             ^
2020-02-04 14:23:57,901 WARNING Could not parse 8 files due to syntax errors!
2020-02-04 14:23:57,902 PERFORMANCE Sources parsed: 8.777836s
2020-02-04 14:23:58,003 INFO Building type environment...
2020-02-04 14:23:59,117 INFO Updating is from empty stub result Environment
2020-02-04 14:23:59,118 INFO Updating Alias Environment
2020-02-04 14:23:59,221 INFO Updating Edges Environment
2020-02-04 14:24:03,159 INFO Updating Undecorated functions Environment
2020-02-04 14:24:03,160 INFO Updating Class metadata Environment
2020-02-04 14:24:03,160 INFO Updating parse annotation Environment
2020-02-04 14:24:03,160 INFO Updating attributes Environment
2020-02-04 14:24:03,160 INFO Updating Global Environment
2020-02-04 14:24:03,261 PERFORMANCE Full environment built: 5.259011s
2020-02-04 14:24:03,583 MEMORY Shared memory size (size: 193)
2020-02-04 14:24:04,539 INFO Checking 4722 functions...
2020-02-04 14:24:06,639 INFO Processed 296 of 4722 functions
2020-02-04 14:24:06,798 INFO Processed 592 of 4722 functions
2020-02-04 14:24:06,798 INFO Processed 888 of 4722 functions
2020-02-04 14:24:06,875 INFO Processed 1184 of 4722 functions
2020-02-04 14:24:06,875 INFO Processed 1480 of 4722 functions
2020-02-04 14:24:06,875 INFO Processed 1776 of 4722 functions
2020-02-04 14:24:06,967 INFO Processed 2072 of 4722 functions
2020-02-04 14:24:07,070 INFO Processed 2368 of 4722 functions
2020-02-04 14:24:08,404 DEBUG Removing pid file: `/Users/pablo/ycard/suma-core/.pyre/pid_files/check-78480.pid`
2020-02-04 14:24:08,897 ERROR Client exited with error code 1:
Worker interrupted with signal: sigill
Worker interrupted with signal: sigpipe
Worker interrupted with signal: sigpipe
Worker interrupted with signal: sigill
Worker interrupted with signal: sigpipe
Worker interrupted with signal: sigpipe
Worker interrupted with signal: sigpipe
Worker interrupted with signal: sigpipe

The source parsing errors come from --preserve-pythonpath dependencies: we're using a fair number of them (~70?) but the worker kills persists even with a pyre check on our source directories.

I'm not in a position to distribute or share this codebase, so I unfortunately can't share a reproducible build 😭So I guess I'm asking:

Thanks, this is an exciting project 😄

gbleaney commented 4 years ago

@pablo-meier what version are you running?

Some things you can start by doing:

  1. Can you try running pyre-check without --preserve-pythonpath? Does it still fail?

    1. If still failing: Is your virtual env inside your repo, or linked to from in your repo? If so, try excluding it from the analysis like this (or just unlink temporarily) to see if we can isolate the problem to one of your dependencies.
    2. If still failing: Can you try minimizing your .pyre_configuration to just { "source_directories": [ "." ] } and try running pyre check
  2. Look at top or some other resource monitor when running. Are you maxing out your system's memory? I've seen pyre-check be mysteriously killed by the OS when using too much memory

pablo-meier commented 4 years ago

what version are you running?

Looks like 0.0.41? I believe I got it from pip3 install pyre-check, though I can go to a previous version if this is unstable (I see 0.0.38 is tagged as "Latest Release", and a 0.0.42).

 pyre --version
No watchman binary found.
To enable pyre incremental, you can install watchman: https://facebook.github.io/watchman/docs/install.html
Defaulting to non-incremental check.
Binary version: No version set
Client version: 0.0.41

Can you try running pyre-check without --preserve-pythonpath? Does it still fail?

It does 😢

    • If still failing: Is your virtual env inside your repo, or linked to from in your repo? If so, try excluding it from the analysis like this (or just unlink temporarily) to see if we can isolate the problem to one of your dependencies.

This is a great suggestion, I'll spend some time on it. That said…

    • If still failing: Can you try minimizing your .pyre_configuration to just { "source_directories": [ "." ] } and try running pyre check

I ran this immediately, and it still fails even with {"source_directories": ["."]}, so I suspect it's not specifically a dependency.

Look at top or some other resource monitor when running. Are you maxing out your system's memory? I've seen pyre-check be mysteriously killed by the OS when using too much memory

I looked at Activity Monitor and htop (on a Mac), didn't look like the system was running out of memory, there were gigabytes available and each pyre worker was taking about ~140MB, at most.

gbleaney commented 4 years ago

it still fails even with {"source_directories": ["."]}, so I suspect it's not specifically a dependency.

Pyre will recurse through the current directory, so even minimizing .pyre_configuration to {"source_directories": ["."]} might not protect you from your venv if it's linked to from your repo. Let me know how excluding/unlinking it turns out.

If it does turn out to be something in your venv, could you try identifying the dependency/giving us a list of dependencies so we can test ourselves?

I can go to a previous version if this is unstable (I see 0.0.38 is tagged as "Latest Release", and a 0.0.42).

No need to change versions yet. We made a change a while back to output more error info in some scenarios, but that should be in 0.41.

Few more thoughts:

  1. Some additional flags you can toss in, though I'm honestly not sure if they'll give any more debug info: --verbose and --noninteractive.
  2. If it turns out excluding your venv doesn't help, you could try using the exclude syntax to cut out chunks of your code from the analysis, to eventually get to the problematic file
  3. What version of python are you running?
grievejia commented 4 years ago

The --preserver-pythonpath option is in a broken and unmaintained state for a while: It blindly pulls in every source file in the Python distribution (including test cases that are intentionally written as invalid Python), which is probably what causes the parse errors there. Removing this option is on my todo-list.

That said, the worker crash seems to be unrelated to that option. Unfortunately, --verbose or --noninteractive doesn't add much to the log compared with --debug. But I'd follow @gbleaney's other suggestion of using exclude (ignore_all_errors should also work in this case) to pinpoint the problematic file.

pablo-meier commented 4 years ago

Pyre will recurse through the current directory, so even minimizing .pyre_configuration to {"source_directories": ["."]} might not protect you from your venv if it's linked to from your repo. Let me know how excluding/unlinking it turns out.

Will do! I had my venv in a different directory. Will still run my sources and dependencies with exclusions. If I find any that break it (or of my source), I'll post it here.

Thanks for the prompt responses! 🙌

pablo-meier commented 4 years ago

Okay! I've found some interesting cases…

I ran through my own sources (no dependencies, clean environment) and winnowed things down with exclude until I found two source files. Within those source files, I commented out blocks/methods until I found culpable ones causing the crash. While I've narrowed it down tremendously, I'm not 1000% sure exactly where the bug is or what's triggering it: when I isolated these sources into their own projects, the crash failed to reproduce. I'll keep trying, but in the meantime, I'll communicate my progress, and maybe it'll ring a bell?

At a high-level, this codebase is a Flask app, and has a number of custom Flask extensions that follow a format like the example "custom extension." The two crashers are custom extensions for wrapping third-party services.

Both subclass a base extension object, which doesn't look super relevant, but including it for completeness:

from flask import current_app

class BaseExtension:
    app = None

    def __init__(self, app=None):
        self.app = app
        if app is not None:
            self.init_app(app)

    def init_app(self, app):
        raise NotImplementedError

    def _get_app(self):
        if not current_app and self.app:
            return self.app
        return current_app

The smaller custom extension has a form like this:

import hooli
from .base import BaseExtension

class HooliExtension(BaseExtension):
    def init_app(self, app):
        app.extensions["hooli"] = self
        self.hooli = hooli.HooliAPIClient(app.config["HOOLI_API_KEY"])

    def __getattr__(self, name):
        return getattr(self.hooli, name)

Where it tends to blow up is __getattr__: if I comment this method out, pyre check completes, if I don't, it crashes. The other module (not included) also allows pyre check to terminate normally when you comment out __getattr__. Oddly, other custom extensions have __getattr__ methods that don't crash.

Given that:

I'll look into how the extension code is being used by the surrounding code, and uncomment __getattr__ + "exclude" the modules that used it. If pyre terminates normally, the crash might be related to the combination of dynamic getattr and use patterns rather than the source in isolation. It might be related to the weird introspective nature of __getattr__.

Thanks for the pointers, I'll hopefully get this into an actionable bug report soon 😄

gbleaney commented 4 years ago

Hey @pablo-meier, any luck narrowing it down to a repro? I'd love to see if we can get this resolved.

pablo-meier commented 4 years ago

@gbleaney Hi!

So I worked more with the codebase in question, and it's… tangled! There's a lot going on and the normal "binary search/try to rebuild" didn't look like it was going to yield fruit very quickly. There was a bit of delay to update this because I also went on vacation this weekend, and I had to do some work on the project outside of getting it typechecked, which is more of a personal initiative 😛 . Regarding finding the crasher, I think it'd be more productive for me to build pyre-check locally with printlines and see where/how it's crashing specifically.

Unfortunately, I'm having some trouble getting it to build: when I create a new opam switch and install dependencies with other opam commands, I'm still getting errors like this:

File "timer.mli", line 10, characters 17-48:
10 | val calibrator : Time_stamp_counter.Calibrator.t
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: Unbound module Time_stamp_counter
File "pyrePath.ml", line 95, characters 53-74:
95 | let is_directory path = absolute path |> fun path -> Sys.is_directory path = `Yes
                                                          ^^^^^^^^^^^^^^^^^^^^^
Error: This expression has type [ `No | `Unknown | `Yes ]
       but an expression was expected of type int
File "file.ml", lines 58-66, characters 23-3:
58 | .......................struct
59 |   type nonrec t = t
60 |
61 |   let compare = compare
62 |
63 |   let sexp_of_t = sexp_of_t
64 |
65 |   let t_of_sexp = t_of_sexp
66 | end.
Error: Signature mismatch:
       ...
       Values do not match:
         val compare : int -> int -> int
       is not included in
         val compare : t -> t -> Base.int
       File "src/set_intf.ml", line 29, characters 2-35: Expected declaration
       File "file.ml", line 61, characters 6-13: Actual declaration
File "log.ml", line 156, characters 6-13:
156 |       st_kind = Unix.S_LNK || st_kind = Unix.S_REG
            ^^^^^^^
Error: This expression has type Unix.file_kind
       but an expression was expected of type int
File "parser/generator.mly", line 1644, characters 18-22:
Error: This expression has type 'a option
       but an expression was expected of type int

These look like I've got the wrong version of Core in the switch (the type errors seem to be around Core's functions and types), and I looked around Core's docs for previous versions with different type signatures, but to no avail. I'm still figuring things out with OCaml and would love to get better at it (the language features are some of my favorite in any language in use) and would appreciate any help you might throw my way, if you can spare it 😄

Here's the result of opam list -i --columns=name,version:

# Packages matching: installed
# Name                  # Version
base                    v0.13.0
base-bigarray           base
base-bytes              base
base-threads            base
base-unix               base
base64                  3.3.0
base_bigstring          v0.13.0
base_quickcheck         v0.13.0
bin_prot                v0.13.0
biniou                  1.2.1
conf-m4                 1
core                    v0.13.0
core_kernel             v0.13.0
cppo                    1.6.6
dune                    2.1.3
dune-configurator       2.1.3
dune-private-libs       2.1.3
easy-format             1.3.2
fieldslib               v0.13.0
jane-street-headers     v0.13.0
jst-config              v0.13.0
menhir                  20200123
menhirLib               20200123
menhirSdk               20200123
num                     1.3
ocaml                   4.09.0
ocaml-base-compiler     4.09.0
ocaml-compiler-libs     v0.12.1
ocaml-config            1
ocaml-migrate-parsetree 1.5.0
ocamlfind               1.8.1
octavius                1.2.2
ounit                   2.2.2
ounit2                  2.2.2
parsexp                 v0.13.0
ppx_assert              v0.13.0
ppx_base                v0.13.0
ppx_bench               v0.13.0
ppx_bin_prot            v0.13.0
ppx_cold                v0.13.0
ppx_compare             v0.13.0
ppx_custom_printf       v0.13.0
ppx_derivers            1.2.1
ppx_deriving            4.4
ppx_deriving_yojson     3.5.1
ppx_enumerate           v0.13.0
ppx_expect              v0.13.0
ppx_fail                v0.13.0
ppx_fields_conv         v0.13.0
ppx_hash                v0.13.0
ppx_here                v0.13.0
ppx_inline_test         v0.13.0
ppx_jane                v0.13.0
ppx_js_style            v0.13.0
ppx_let                 v0.13.0
ppx_module_timer        v0.13.0
ppx_optcomp             v0.13.0
ppx_optional            v0.13.0
ppx_pipebang            v0.13.0
ppx_sexp_conv           v0.13.0
ppx_sexp_message        v0.13.0
ppx_sexp_value          v0.13.0
ppx_stable              v0.13.0
ppx_tools               5.3+4.08.0
ppx_typerep_conv        v0.13.0
ppx_variants_conv       v0.13.0
ppxfind                 1.3
ppxlib                  0.12.0
re                      1.9.0
result                  1.4
seq                     base
sexplib                 v0.13.0
sexplib0                v0.13.0
spawn                   v0.13.0
splittable_random       v0.13.0
stdio                   v0.13.0
stdlib-shims            0.1.0
time_now                v0.13.0
typerep                 v0.13.0
variantslib             v0.13.0
yojson                  1.7.0

Happy Monday!

grievejia commented 4 years ago

I think it'd be more productive for me to build pyre-check locally with printlines and see where/how it's crashing specifically.

Oh if you know OCaml this would indeed be the fastest way to get the answer :)

These look like I've got the wrong version of Core in the switch

Yes I think that's a correct observation.

Our onboarding process has been janky for historical reasons. Sorry for the bad experience!

Instead of going through opam lock for package pinning, we rely on a shell script to do that: if you take a look at scripts/setup.sh, it has an explicit dependency list with set versions. You can try to install the versions listed there manually, or even better, just running that script on a clean switch for all the setups.

pablo-meier commented 4 years ago

Just FYI:

Thanks for the guidance and patience.

gbleaney commented 4 years ago

@pablo-meier did you ever manage to get to the bottom of this? I really want to make sure we get this fixed.

pablo-meier commented 4 years ago

I wasn't able to pursue it to its end, unfortunately. I'll give it another crack on the current versions of both pyre and the codebase that produced this, thanks for the reminder 😄