Closed Strum355 closed 3 years ago
Are you able to debug the problem and localize the issue to a particular statement that results in the internal assertion? One approach is to binary search the file, adding or removing code to see what makes the problem appear or disappear.
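As a rough illustration of the bisection idea, here is a minimal sketch in Python. The `still_fails` predicate is a placeholder: in practice it would write the candidate lines to a file, run the tool, and report whether the assertion still fires. The name `reduce_lines` and the sample input are made up for this example.

```python
from typing import Callable, List


def reduce_lines(lines: List[str], still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop chunks of lines as long as the failure still reproduces."""
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and still_fails(candidate):
                lines = candidate  # the removed chunk was irrelevant to the failure
            else:
                i += chunk  # the chunk is needed to reproduce; move past it
        chunk //= 2  # retry with smaller chunks; the loop ends when this reaches 0
    return lines


# Stand-in predicate: pretend the failure needs a line containing a lambda.
sample = ["import os", "x = 1", "cfg = {'k': lambda s: s.split()}", "y = 2"]
minimal = reduce_lines(sample, lambda ls: any("lambda" in line for line in ls))
print(minimal)
```

The reducer ignores syntax, so a real predicate should treat "file no longer parses" as "does not fail" to avoid chasing unrelated errors.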
I've been able to narrow down the issue to a very tiny snippet:
ini_config_types = {
'strict_optional_whitelist': lambda s: s.split(),
}
Just this snippet in a file works fine with the VS Code extension, giving the expected `Unknown` types and type compositions.
Consistent with what you said above, I'm not able to repro the problem using pyright in its normal usage.
This will require more debugging. Have you figured out how to launch the CLI with a debugger attached? Let me know if you need help getting that to work.
Do you have a stack trace for the failed assertion? Are you seeing any other logged output that might provide additional clues? For example, I'm wondering if it's unable to find the stdlib typeshed files when you instantiate the type evaluator. Pyright is highly dependent on the core typeshed definitions (e.g. builtins.pyi, typing.pyi) and will likely crash if it cannot find them.
The stack trace is at the bottom of the original issue in a collapsible section, using sourcemaps, so it shows the original TypeScript lines :slightly_smiling_face: I'm not getting additional logs, but I may have them suppressed. Stdlib typeshed files resolve fine; I have the typeshed folder set to the one bundled with the pyright repo and have observed it resolving types from there correctly before.
I have attached a debugger before, but not for this particular bug. A VS Code launch config for it is included in my fork. I can try to see what I can glean, but I'm unsure how much I can do with my limited knowledge of the codebase.
Ah, I missed the stack crawl. Thanks for pointing that out.
One thing that jumps out here is that you've created a tree walker that is calling back in to the hover provider, which is a form of reentrancy that these components were not designed for. You could try commenting out the code in your tree walker and see if that eliminates the crash.
Perhaps you'd be willing to explain in more detail what you're trying to do here. I might be able to suggest a better approach. Is your goal to output signatures and/or docs for classes, functions and methods in these files?
> Is your goal to output signatures and/or docs for classes, functions and methods in these files?
That's essentially it, yes, as well as outputting reference and definition ranges, specifically in the LSIF format. This probably seems like a very hack-ish solution (and it probably is! It was the first thing I got working with the internal APIs), so any pointers towards a better or more performant solution would be much appreciated.
I have been able to reproduce the issue without the tree walker; attaching the stack trace here. Link to entrypoint: https://github.com/Strum355/pyright/blob/01f15ce31f8f6cd9a054e21fc48cb762923ae25d/packages/lsif-pyright/src/indexer.ts#L62
I dug into the problem and discovered the root cause. The problem is that in "indexing" mode, the binder is being executed with the `indexGenerationMode` flag that tells it to bind only module-level symbols, not symbols within classes, functions or lambdas. This is useful if you're indexing for the purpose of generating auto-import suggestions. If you're indexing for deeper analysis (as I think you are for your use case), you should specify `false` for this parameter.
Because the binder didn't perform a complete binding step, it didn't generate some data that the type evaluator was later assuming would be present. That's what triggered the assertion that you were seeing.
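To make the difference between the two binding modes concrete, here is a toy sketch that uses Python's `ast` module rather than pyright's binder. The `bind` function and its `module_level_only` flag are hypothetical stand-ins for the behavior described above; they only show which names each mode would see for the snippet from this issue.

```python
import ast
from typing import Set

SOURCE = """
ini_config_types = {
    'strict_optional_whitelist': lambda s: s.split(),
}
"""


def bind(source: str, module_level_only: bool) -> Set[str]:
    """Collect bound names, either at module level only or in every scope."""
    tree = ast.parse(source)
    # Module-level mode looks only at top-level statements; full mode walks everything.
    nodes = tree.body if module_level_only else ast.walk(tree)
    names: Set[str] = set()
    for node in nodes:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, ast.arg):
            names.add(node.arg)  # parameters of functions and lambdas
    return names


print(bind(SOURCE, module_level_only=True))   # only the top-level assignment
print(bind(SOURCE, module_level_only=False))  # also the lambda parameter
```

In the module-level mode the lambda parameter `s` is never recorded, so any later step that expects information about `s` has nothing to look up.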
I've removed the assertion and made the type evaluator more robust so it can handle this case.
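The general shape of that kind of change can be sketched as follows. This is a toy model, not the actual pyright patch: the `FlowExpressions` mapping, both function names, and the `"declared"`/`"narrowed"` results are invented for illustration.

```python
from typing import Dict, Set

# Hypothetical stand-in for per-scope data that a complete binding pass records.
FlowExpressions = Dict[str, Set[str]]


def narrowed_type_strict(flow: FlowExpressions, scope: str, expr: str) -> str:
    """Fail hard when the scope was never bound (the old behavior, in spirit)."""
    expressions = flow.get(scope)
    if expressions is None:
        raise AssertionError("binder did not record this scope")
    return "narrowed" if expr in expressions else "declared"


def narrowed_type_tolerant(flow: FlowExpressions, scope: str, expr: str) -> str:
    """Fall back to the declared type when flow data for the scope is missing."""
    expressions = flow.get(scope)
    if expressions is None:
        return "declared"  # no flow data: skip narrowing instead of failing
    return "narrowed" if expr in expressions else "declared"


# Only the module scope was bound, as in the module-level-only mode.
flow = {"module": {"x"}}
print(narrowed_type_tolerant(flow, "lambda", "s"))
```

The tolerant version trades a loud failure for a less precise answer, which is why specifying `false` for `indexGenerationMode` is still the right choice when deeper analysis is needed.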
@heejaechang, this bug could also impact the indexing code paths in pylance, so I'm glad we found this before we enabled the indexer by default.
This is included in pyright 1.1.145, which I just published.
`indexGenerationMode` is only on for third-party libraries, where we only use trees/symbol tables for indexing and never reuse them for something else. For user files (files under the workspace that match the "include" pattern), we don't enable it, and the tree and decl table might be re-used for other features. In other words, they should have all the code flow information intact.
So this issue shouldn't happen for pylance.
...
That being said, it is awesome that @Strum355 is working on creating an LSIF index for Python. I worked with the team who created the LSIF format, providing them with a prototype implementation of LSIF over C# so they could get feedback. Since LSIF includes all LSP features' data (e.g. info over all bindable names) in its output, it requires non-top-level symbols. Index generation mode will create only top-level symbols (it parses only top-level nodes and creates only a top-level decl table), so you should never use `indexGenerationMode` for LSIF.
...
Also, looking at the call stack shared at the beginning, it looks like you are calling the hover provider inside the callback for `indexWorkspace`. I believe that can cause more parsing/binding to happen in the middle of indexing, because type evaluation of imported symbols (by the hover provider) itself triggers parsing and binding (such as parsing `import *` files). My gut feeling is it should be safe, but we never tested it for such use cases, so I think it would be safer if you just collect all the decls from the indexer and post-process them.
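The "collect first, post-process later" pattern can be sketched like this. `ToyProgram` and its methods are hypothetical stand-ins written for this example; they are not pyright's `Program` API and only mimic the callback shape described above.

```python
from typing import Callable, Dict, List, Tuple


class ToyProgram:
    """Hypothetical stand-in for a program object; not pyright's API."""

    def __init__(self, files: Dict[str, List[str]]) -> None:
        self._files = files

    def index_workspace(self, callback: Callable[[str, List[str]], None]) -> None:
        # Invoke the callback once per file with that file's declarations.
        for path, symbols in self._files.items():
            callback(path, symbols)

    def hover(self, path: str, symbol: str) -> str:
        return f"{path}:{symbol}"


def index_then_hover(program: ToyProgram) -> List[str]:
    collected: List[Tuple[str, str]] = []
    # Phase 1: the callback only records declarations; it never calls back in.
    program.index_workspace(
        lambda path, symbols: collected.extend((path, s) for s in symbols)
    )
    # Phase 2: indexing has finished, so hover queries cannot interleave with it.
    return [program.hover(path, symbol) for path, symbol in collected]


print(index_then_hover(ToyProgram({"a.py": ["f", "g"], "b.py": ["h"]})))
```

Keeping the callback free of calls back into the program means the indexing pass never has to cope with extra parsing or binding triggered halfway through.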
...
Also, if you want to make sure that you don't share the parse tree/decl tables created by the indexer but just use the indices, you can drop all existing tree/decl tables (e.g. https://github.com/Strum355/pyright/blob/main/packages/pyright-internal/src/analyzer/program.ts#L507) and use the program to re-do all those things automatically for you (such as `program.getHoverInformation` with filename and position), assuming you already have a program since you called `indexWorkspace` off the program.
This can also be used for caching, so you don't re-index the whole world every time (which was one of the issues with the initial LSIF proposal).
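One possible way to avoid re-indexing unchanged files is to key cached results by a hash of the file contents. This is only a sketch of that idea under my own assumptions: `IndexCache` and the `index_file` callable are hypothetical and say nothing about how pyright or an LSIF indexer would actually store indices.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


class IndexCache:
    """Re-run the (hypothetical) per-file indexer only when the contents change."""

    def __init__(self, index_file: Callable[[str, str], List[str]]) -> None:
        self._index_file = index_file
        self._entries: Dict[str, Tuple[str, List[str]]] = {}

    def get(self, path: str, source: str) -> List[str]:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._entries.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]  # contents unchanged: reuse the stored index
        result = self._index_file(path, source)
        self._entries[path] = (digest, result)
        return result


calls: List[str] = []


def fake_indexer(path: str, source: str) -> List[str]:
    calls.append(path)  # record how often real indexing work happens
    return source.split()


cache = IndexCache(fake_indexer)
cache.get("a.py", "x = 1")
cache.get("a.py", "x = 1")  # served from the cache
cache.get("a.py", "x = 2")  # contents changed, so it is indexed again
print(len(calls))
```

A per-file hash ignores changes in imported files, so a real cache would also need some way to invalidate dependents.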
> My gut feeling is it should be safe, but we never tested it for such usage cases, so I think it would be safer if you just collect all the decls from the indexer and post-process them.
I had tried this approach but hit the same issue as originally filed. Disabling `indexGenerationMode` (and rebasing from main) solved the issue, so there's a step forward, thanks! :tada:
What remains most notably is the issue of performance. I'm chalking this up to my naive approach (given how fast invoking the pyright CLI is compared to the LSIF indexer I've written so far), but I will open a Discussions thread where any interested parties can chip in :slightly_smiling_face:
Describe the bug
In a fork of pyright (to make use of internal APIs), I use the `Program.indexWorkspace` method to index an entire workspace in a batch-esque style: https://github.com/Strum355/pyright/blob/a1c9990dbdbdd10aae482fcbc26e124a558fd032/packages/lsif-pyright/src/indexer.ts#L27. This causes the exception in the title to be thrown. The stack trace can be found below.
While I understand pyright does not have a public API yet for a reason, we are evaluating pyright as a backend for semantic indexing of Python projects, and there may be a few road bumps along the way that we hope to get some assistance with :slightly_smiling_face:
To Reproduce
npm run webpack && node dist/lsif-pyright.js
Expected behavior
Using `Program.indexWorkspace` would work just as well as, if not better than, going via the language server.
Screenshots or Code
https://github.com/python/mypy/blob/master/mypy/config_parser.py#L107
VS Code extension or command-line
Custom CLI at https://github.com/Strum355/pyright/. Unfortunately, the exception is not thrown with version 1.1.144 of the VS Code pyright extension on the same mypy file.
Additional context
Output from printing expression node
Console.log invoked [here](https://github.com/Strum355/pyright/blob/main/packages/pyright-internal/src/analyzer/typeEvaluator.ts#L16321)
```js
{
  start: 3594,
  length: 5,
  nodeType: 38,
  id: 16597,
  token: { start: 3594, length: 5, type: 7, value: 'split', comments: undefined },
  value: 'split',
  parent: {
    start: 3592,
    length: 7,
    nodeType: 35,
    id: 16598,
    leftExpression: { start: 3592, length: 1, nodeType: 38, id: 16596, token: [Object], value: 's', parent: [Circular *1] },
    memberName: [Circular *2],
    parent: { start: 3592, length: 9, nodeType: 9, id: 16599, leftExpression: [Circular *1], arguments: [], parent: [Object] }
  }
}
```
Stacktrace
```
/home/noah/Sourcegraph/lsif-pyright/packages/pyright-internal/src/analyzer/typeEvaluator.ts:16322
        assert(codeFlowExpressions !== undefined);
        ^
Error: Debug Failure. False expression.
    at getFlowTypeOfReference (/home/noah/Sourcegraph/lsif-pyright/packages/pyright-internal/src/analyzer/typeEvaluator.ts:16322:9)
    at getTypeFromMemberAccess (/home/noah/Sourcegraph/lsif-pyright/packages/pyright-internal/src/analyzer/typeEvaluator.ts:4252:40)
    at getTypeOfExpressionInternal (/home/noah/Sourcegraph/lsif-pyright/packages/pyright-internal/src/analyzer/typeEvaluator.ts:1008:30)
    at getTypeFromCall (/home/noah/Sourcegraph/lsif-pyright/packages/pyright-internal/src/analyzer/typeEvaluator.ts:6079:32)
    at getTypeOfExpressionInternal (/home/noah/Sourcegraph/lsif-pyright/packages/pyright-internal/src/analyzer/typeEvaluator.ts:1035:34)
    at getTypeFromLambda (/home/noah/Sourcegraph/lsif-pyright/packages/pyright-internal/src/analyzer/typeEvaluator.ts:11016:47)
    at getTypeOfExpressionInternal (/home/noah/Sourcegraph/lsif-pyright/packages/pyright-internal/src/analyzer/typeEvaluator.ts:1201:30)
    at /home/noah/Sourcegraph/lsif-pyright/packages/pyright-internal/src/analyzer/typeEvaluator.ts:10602:39
    at Array.forEach (