davidhalter / jedi

Awesome autocompletion, static analysis and refactoring library for python
http://jedi.readthedocs.io

Script.infer() unexpectedly returns empty list #1998

Closed mamqek closed 1 month ago

mamqek commented 1 month ago

Hello,

I am working on a refactoring tool and want to use Jedi to extract information about types. Right now I am checking whether import statements import a module or a class.

        script = jedi.Script(ast.unparse(node), path=current_file_path)
        result = script.infer(1, node.names[0].col_offset)
        jedi.set_debug_function()

This is the code I am using to check it. "node" in this block is an import statement starting with "from", like

  from ..Animal import Animal

The method works for this import statement. However, in the file Dog.py I am importing the CowRenamed class like

 from .CowRenamed import CowRenamed

and it returns an empty list. (The debug output and a screenshot of the file structure were attached.)

It also works for the import in Cat.py, which has the same structure as the import that doesn't work.

 from .Feline import Feline

The issue repeats in other files, and I can't identify what is wrong with it. Thank you for reading; hopefully I am missing something and can still use Jedi for this purpose.

davidhalter commented 1 month ago

It's hard to say what the issue is. I would need a good reproduction zip, if you want me to help.

There is a small chance that relative namespace imports are bugged within Jedi (because not many people are using it), but I would doubt it, there are a lot of tests and real users that haven't reported anything. My guess would be that something is wrong with the paths you are providing to Jedi, but I'm not sure. There's another chance that something small is wrong with relative imports and Windows. But I'm not sure at all. You would help me a lot if you wrote a much smaller test that would reproduce this behavior and put it within a zip file, or even better: You create a pull request and the tests will catch the issue if it is really bugged - there's a Windows test run there as well.

mamqek commented 1 month ago

I am not familiar with how to open a pull request and then test it. I could do it if you give me some steps. Anyway, here is the zip with the reproduction: jedi_reproduction.zip. Start analyzer.py and CTRL+F the terminal output for "No result".

Also I am not so much familiar with general Python practices. Is it not common to use relative paths for importing as they are in this example?

davidhalter commented 1 month ago

> I could do it if you give me some steps.

Let's try then :)

There is however a step I would recommend you to do first:

Your reproduction case is still way too big. You are still using path walking, ast.NodeVisitor and other stuff. Your reproduction should look like this:

print(jedi.Script(path=<some path>).infer(<some-line>, <some-column>))

Otherwise the issue could still be your code, which is what we want to try to avoid here. This is something I would recommend you to do anyway if you want bugs fixed. It's really important that maintainers understand where the bug is. Once you are at this point: Is there still a bug?

There is a small chance of course that Jedi's caching affects the results (very low chance), so in that case you would have to use two calls like the above in a row.
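A minimal sketch of what such a reproduction could look like, assuming a throwaway package layout. The `animals` directory and the helper names `build_tiny_package` and `reproduce` are made up for illustration; only `Dog.py` and `CowRenamed` mirror the report. It runs the same call twice in a row, so a caching effect would show up as two different results. Jedi is imported inside `reproduce` so that the file-building part can be checked without it.

```python
import tempfile
from pathlib import Path


def build_tiny_package(root=None) -> Path:
    """Create a hypothetical two-module package and return the path of Dog.py."""
    root = Path(root) if root is not None else Path(tempfile.mkdtemp())
    pkg = root / "animals"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "CowRenamed.py").write_text("class CowRenamed:\n    pass\n")
    dog = pkg / "Dog.py"
    dog.write_text("from .CowRenamed import CowRenamed\n")
    return dog


def reproduce(dog: Path):
    """Run the same infer() call twice to rule out a caching effect."""
    import jedi  # third-party; imported lazily

    line = 1                                  # Jedi lines are 1-based
    column = len("from .CowRenamed import ")  # Jedi columns are 0-based
    first = jedi.Script(path=str(dog)).infer(line, column)
    second = jedi.Script(path=str(dog)).infer(line, column)
    return first, second
```

With Jedi installed, `print(reproduce(build_tiny_package()))` would then be the entire reproduction; if both results are non-empty here, the problem is more likely in the calling code than in Jedi.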

> Also I am not so much familiar with general Python practices. Is it not common to use relative paths for importing as they are in this example?

PEP 8 recommends using absolute imports, but relative imports should obviously still work: https://peps.python.org/pep-0008/#imports
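For reference, a small sketch of how the two relative imports from this thread map to absolute module names, using the standard library's `importlib.util.resolve_name`. The package name `org.animals` is an assumption based on the `after\org\animals\Dog.py` path; the real project layout may differ.

```python
from importlib.util import resolve_name

PACKAGE = "org.animals"  # assumed package that contains Dog.py

# One leading dot: a sibling module in the same package.
sibling = resolve_name(".CowRenamed", PACKAGE)

# Two leading dots: a module in the parent package.
parent = resolve_name("..Animal", PACKAGE)

print(sibling)  # org.animals.CowRenamed
print(parent)   # org.Animal
```

The absolute form PEP 8 prefers would then be, under the same assumption, `from org.animals.CowRenamed import CowRenamed`.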

mamqek commented 1 month ago

Indeed if I have just this code for the first occurrence of empty result, it does work.

import jedi

current_file_path = 'after\\org\\animals\\Dog.py'

script = jedi.Script(path=current_file_path)
result = script.infer(2, 24)
print(result)

I experimented a little and I don't know why but if I don't pass the exact line to Script() then it works.

    def visit_ImportFrom(self, node):
        print(f"Importing from '{node.module}' at level {node.level}")
        global current_file_path
        script = jedi.Script(path=current_file_path)
        result = script.infer(node.names[0].lineno, node.names[0].col_offset)
        jedi.set_debug_function()
        if not result:
            print("No result")
        elif result[0].type == 'module':
            print("Module")
        else:
            print("Not module")

I wonder, though: does Jedi load the whole file then? I am looking for the lowest-cost solution. If it does, could I somehow load all the files in the directory before execution, so that I could query multiple times in different functions without excessive loading?

davidhalter commented 1 month ago

FYI: You can write current_file_path = 'after\\org\\animals\\Dog.py' as current_file_path = r'after\org\animals\Dog.py'. These are called "raw string literals".
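A quick check that the two spellings are the same string, plus an optional alternative using `pathlib` that avoids backslashes in the source altogether (the `pathlib` variant is a suggestion added here, not something from the thread):

```python
from pathlib import PureWindowsPath

escaped = 'after\\org\\animals\\Dog.py'  # every backslash doubled
raw = r'after\org\animals\Dog.py'        # raw string literal, no doubling

# Alternative: build the Windows-style path from its parts.
built = str(PureWindowsPath("after", "org", "animals", "Dog.py"))

print(escaped == raw == built)  # True
```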

> Indeed if I have just this code for the first occurrence of empty result, it does work.

This probably means that your code is buggy and that you pass the wrong params somewhere. If you want to check this, just print a few Script().infer() invocations with the params you are currently using and copy them into a Python script, run them and check where the issue lies.
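A small sketch of that debugging step: a hypothetical helper, `as_repro_line`, that turns the parameters of one call into a statement that can be pasted into a separate script. It only covers calls that pass `path=`; a call that also passes source code would need the code printed as well.

```python
def as_repro_line(path, line, column):
    """Return a copy-pasteable statement for one infer() call."""
    return f"print(jedi.Script(path={path!r}).infer({line}, {column}))"


# Example: log the call right before making it inside the visitor.
print(as_repro_line("Dog.py", 2, 24))
# print(jedi.Script(path='Dog.py').infer(2, 24))
```

Collecting these lines for the cases that print "No result" and running them on their own (after `import jedi`) shows whether the parameters or the surrounding code are at fault.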

> I experimented a little and I don't know why but if I don't pass the exact line to Script() then it works.

I'm not sure what the difference would be.

mamqek commented 1 month ago

> I'm not sure what the difference would be.

Same, but it works now.

Do you have any comments about what would be the most efficient way to use infer on the same file multiple times? Would it benefit me If I would save script at the start of each file analysis instead of creating it each time I need to use infer() or something else?

davidhalter commented 1 month ago

> Would it benefit me If I would save script at the start of each file analysis instead of creating it each time I need to use infer() or something else?

Yes it will change the caching and might be faster.
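A minimal sketch of keeping one Script per file, assuming the files do not change during the analysis. `ScriptCache` is a hypothetical helper, not part of Jedi; the `factory` parameter exists only so the caching logic can be exercised without Jedi installed. Whether this is actually faster would have to be measured.

```python
class ScriptCache:
    """Keep one Script-like object per file path."""

    def __init__(self, factory=None):
        # factory(path) -> script; defaults to building a jedi.Script.
        self._factory = factory or self._jedi_script
        self._scripts = {}

    @staticmethod
    def _jedi_script(path):
        import jedi  # third-party; imported lazily

        return jedi.Script(path=path)

    def get(self, path):
        # Build the script on first use, then reuse it for later queries.
        if path not in self._scripts:
            self._scripts[path] = self._factory(path)
        return self._scripts[path]
```

In the visitor this would replace `jedi.Script(path=current_file_path)` with something like `cache.get(current_file_path).infer(line, column)`. As far as I know, a Script reads the file when it is created, so a cached one would not see later edits to that file.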

I'm closing this issue, because I doubt there's a bug in Jedi knowing what you sent me. However, feel free to keep this discussion going. I will read and reply.

mamqek commented 1 month ago

Hello David,

Thank you for your willingness to help and for spending your time on these questions.

I want to make sure I am using the most efficient approach. When I have a simple initialization of a variable like

numVar = 2

If I want to know the type of the variable (in this case int, which is stored in the name attribute), would it be faster to infer on the variable itself or on the value being assigned to it (the left or the right side)? I would rather do so on the left side, but again, efficiency comes first. Moreover, do those differences even matter for efficiency? Or is the difference negligible most of the time, since more or less the same operations happen behind the scenes?
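One way to find out is to measure it. The sketch below times `infer()` on both sides of the assignment with `timeit`; the helper `time_infer` and the repeat count are arbitrary choices, and no results are claimed here since they will depend on the machine and the Jedi version.

```python
import timeit

CODE = "numVar = 2\n"
LEFT = (1, CODE.index("numVar"))  # position of the variable name
RIGHT = (1, CODE.index("2"))      # position of the assigned literal


def time_infer(position, repeat=20):
    """Return the average seconds per infer() call at the given position."""
    import jedi  # third-party; imported lazily

    line, column = position

    def run():
        return jedi.Script(CODE).infer(line, column)

    return timeit.timeit(run, number=repeat) / repeat
```

Comparing `time_infer(LEFT)` with `time_infer(RIGHT)` on a realistic file, instead of this one-liner, should show whether the difference is worth caring about.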

mamqek commented 1 month ago

Also, when I have an operation on a dictionary like

   dict['value'] = 2

Can I infer on the whole element on the left, dict['value']? If I infer on dict, the returned type is dict; if I infer on "value", the returned type is str. But I want to infer on the element inside the dictionary, dict['value'], which is clearly int, as it is being assigned one.
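A possible workaround, sketched with the standard library only: instead of asking Jedi about the subscript itself, locate the right-hand side of the assignment with `ast` and infer at that position. The helper name `subscript_assignment_values` is hypothetical, and the variable is called `data` here to avoid shadowing the built-in `dict`. Whether Jedi can infer the subscript element directly is not something this sketch answers.

```python
import ast


def subscript_assignment_values(source):
    """Return (target_text, line, column) for each `x[...] = value` assignment.

    The line and column point at the assigned value, in the same
    1-based line / 0-based column convention that infer() expects.
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Subscript):
                    found.append(
                        (ast.unparse(target), node.value.lineno, node.value.col_offset)
                    )
    return found


print(subscript_assignment_values("data = {}\ndata['value'] = 2\n"))
# [("data['value']", 2, 16)]
```

Each returned position could then be passed to `script.infer(line, column)` to get the type of the value being stored under that key.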

davidhalter commented 1 month ago

I think this is where I have to leave you on your own. This is where you have to experiment and use trial and error. I'm not sure myself how Jedi behaves in all of these cases.