idapython / src

IDAPython project for Hex-Ray's IDA Pro
http://www.hex-rays.com/
Other
1.42k stars 287 forks source link

Don't replace __file__ with an empty string on built-in modules #23

Closed marc-etienne closed 5 years ago

marc-etienne commented 5 years ago

Python (and the inspect module in particular) expects __file__ to be None (or absent) when a module is built-in and has no source or bytecode file.

See eset/ipyida#25

aundro commented 5 years ago

@marc-etienne before I accept this PR (which, in itself, looks fine), I'd like to be able to reproduce this issue, so that it gets tested.

Currently, I have had no luck doing so, and what bothers me the most is that I have no idea why.

Can you:

Thanks!

marc-etienne commented 5 years ago

Absolutely. Meanwhile, can you tell me what Python version you are testing on? I'm using 3.7.3 here.

marc-etienne commented 5 years ago

This is from the IDA console itself with no files open:

#### __file__ for /Users/marc-etienne/.idapro/idapythonrc.py: *** MISSING *** (g['__name__']='*** MISSING ***')
#### g['__file__'] restored to: ''
#### __file__ for /Users/marc-etienne/.idapro/plugins/ipyida.py: *** MISSING *** (g['__name__']='*** MISSING ***')
#### g['__file__'] restored to: ''
Python>import inspect;inspect.stack()
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 1513, in stack
    return getouterframes(sys._getframe(1), context)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 1490, in getouterframes
    frameinfo = (frame,) + getframeinfo(frame, context)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 1460, in getframeinfo
    filename = getsourcefile(frame) or getfile(frame)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 696, in getsourcefile
    if getattr(getmodule(object, filename), '__loader__', None) is not None:
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 739, in getmodule
    f = getabsfile(module)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 708, in getabsfile
    _filename = getsourcefile(object) or getfile(object)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 684, in getsourcefile
    filename = getfile(object)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 647, in getfile
    raise TypeError('{!r} is a built-in module'.format(object))
TypeError: <module '__main__' (built-in)> is a built-in module
Python>import inspect;inspect.stack()
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 1513, in stack
    return getouterframes(sys._getframe(1), context)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 1490, in getouterframes
    frameinfo = (frame,) + getframeinfo(frame, context)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 1460, in getframeinfo
    filename = getsourcefile(frame) or getfile(frame)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 696, in getsourcefile
    if getattr(getmodule(object, filename), '__loader__', None) is not None:
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 739, in getmodule
    f = getabsfile(module)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 708, in getabsfile
    _filename = getsourcefile(object) or getfile(object)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 684, in getsourcefile
    filename = getfile(object)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/inspect.py", line 647, in getfile
    raise TypeError('{!r} is a built-in module'.format(object))
TypeError: <module '__plugins__ipyida' from ''> is a built-in module
Python>import inspect;inspect.stack()
Python>import inspect;print(inspect.stack())
[FrameInfo(frame=<frame at 0x11646da08, file '<string>', line 1, code <module>>, filename='<string>', lineno=1, function='<module>', code_context=None, index=None)]
aundro commented 5 years ago

Welp, it's just something that changed between 3.5.3 and 3.7.4: I just built 3.7.4 for my linux box, idapyswitch'd to it, and I can reproduce the issue.

One last thing: you wrote: «Python (and the inspect module in particular) expects __file__ to be None». Did you guess that, or is this information available somewhere?

aundro commented 5 years ago

I found this:

The loader must set several attributes on the module. __name__ is to be set to the name of the module. __file__ is to be the “path” to the file unless the module is built-in (and thus listed in sys.builtin_module_names) in which case the attribute is not set

…but it has the following quirks:

aundro commented 5 years ago

About this:

'__main__' is not in sys.builtin_module_names:

aundro@flatiron:~$ python3
Python 3.5.3 (default, Sep 27 2018, 17:25:39) 
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> "__main__" in sys.builtin_module_names
False
>>> __name__
'__main__'
>>> sys.modules[__name__]
<module '__main__' (built-in)>

I mean…

marc-etienne commented 5 years ago

Isn't the __main__ module/namespace artificially created by get_module_globals()? (S_MAIN == "__main__")?

marc-etienne commented 5 years ago

it's for Python 2.7

If found this for Python 3.7: https://docs.python.org/3.7/reference/import.html?highlight=__file__#__file__

'main' is not in sys.builtin_module_names (how come?)

According to this, it seems it for modules built into Python. (Which is confusing because it seems there's two definitions of "built-in")

if this is still valid for Python3, shouldn't we delattr() instead of setattr(, , oldfile) (even when oldfile is None)?

I agree, using delattr() would leave even less traces of the temporary change to __file__.

aundro commented 5 years ago

@marc-etienne what do you think about something like this?

@@ -453,8 +453,11 @@
     sys.argv = [ script ]

     # Adjust the __file__ path in the globals we pass to the script
-    old__file__ = g['__file__'] if '__file__' in g else ''
-    g['__file__'] = script
+    FILE_ATTR = "__file__"
+    has__file__ = FILE_ATTR in g
+    if has__file__:
+        old__file__ = g[FILE_ATTR]
+    g[FILE_ATTR] = script

     try:
         with open(script) as fin:
@@ -467,7 +470,10 @@
             print(PY_COMPILE_ERR)
     finally:
         # Restore state
-        g['__file__'] = old__file__
+        if has__file__:
+            g[FILE_ATTR] = old__file__
+        else:
+            del g[FILE_ATTR]
         sys.argv = argv

     return PY_COMPILE_ERR
marc-etienne commented 5 years ago

I guess that should work.

marc-etienne commented 5 years ago

Because g is a dict (I was under the impression it was a module), your patch is actually better than mine.

aundro commented 5 years ago

Submitted (e3d07f20186241a9781cd2ad0b9c1d0aa8b02490)

aundro commented 5 years ago

@marc-etienne Thank you very much for investigating this!