hauntsaninja / pyp

Easily run Python at the shell! Magical, but never mysterious.
MIT License

Submodule Import Problems #34

Open tinhb opened 1 year ago

tinhb commented 1 year ago

Suppose I have a string of HTML content and would like to extract certain information from it:

pyp "xml.etree.ElementTree.fromstring('<html><head><title>Title</title></head></html>').find('head/title').text"

pyp does import xml automatically, but the command still fails with AttributeError: module 'xml' has no attribute 'etree', because importing the top-level xml package does not import its xml.etree.ElementTree submodule.
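For readers unfamiliar with this behaviour, here is a minimal sketch of it outside pyp. It runs each import statement in a fresh interpreter (so modules already loaded in the current process cannot mask the result) and checks whether xml.etree is reachable afterwards. The helper name etree_visible_after is made up for this illustration and is not part of pyp.

```python
import subprocess
import sys


def etree_visible_after(import_stmt: str) -> bool:
    """Run import_stmt in a fresh interpreter and report whether xml.etree resolves.

    Hypothetical helper for illustration only; not part of pyp.
    """
    code = (
        f"{import_stmt}\n"
        "import sys\n"
        "print(hasattr(sys.modules['xml'], 'etree'))"
    )
    # -I and -S keep the child interpreter free of site/user customisations.
    result = subprocess.run(
        [sys.executable, "-I", "-S", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


# Importing only the package leaves the submodule unloaded.
bare = etree_visible_after("import xml")
# Importing the submodule binds it as an attribute of its parent packages.
full = etree_visible_after("import xml.etree.ElementTree")
print(bare, full)
```

On a standard CPython install the first check should come back false and the second true, which matches the AttributeError above.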

I can explicitly pass the -b parameter to get the import right:

pyp -b "import xml.etree.ElementTree" "xml.etree.ElementTree.fromstring('<html><head><title>Title</title></head></html>').find('head/title').text"
# Title

However, if I add the same line to the PYP_CONFIG_PATH config file, the same AttributeError still occurs.

cat $PYP_CONFIG_PATH
# import xml.etree.ElementTree
pyp "xml.etree.ElementTree.fromstring('<html><head><title>Title</title></head></html>').find('head/title').text"
# AttributeError: module 'xml' has no attribute 'etree'

So, the question is: What is the correct way to have xml.etree.ElementTree imported automatically?

hauntsaninja commented 1 year ago

Thanks for the issue!

Yeah, it's hard to know statically what to import when you see an expression like "xml.etree.ElementTree". Not sure I see a way to get that to work out of the box without special casing.

Hm, adding that line to your config should work... we should treat the config element "import xml.etree.ElementTree" as defining "xml" and so statically include it in the code we execute. It looks like that's not happening and so it's falling back to the generic import missing things code. This is a bug, I can fix it.
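To illustrate what "defining xml" means here, below is a rough sketch using the standard ast module to list the names that import statements bind. This is an assumption-laden illustration, not pyp's actual implementation; the function name names_bound_by_imports is hypothetical.

```python
import ast


def names_bound_by_imports(source: str) -> set:
    """Return the top-level names that the import statements in source bind.

    Hypothetical helper for illustration; pyp's real analysis may differ.
    """
    bound = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b.c" binds only "a"; "import a.b.c as x" binds "x".
                bound.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports cannot be resolved statically
                    bound.add(alias.asname or alias.name)
    return bound


print(names_bound_by_imports("import xml.etree.ElementTree"))
print(names_bound_by_imports("import xml.etree.ElementTree as ET"))
print(names_bound_by_imports("from xml.etree import ElementTree"))
```

A tool that sees the name xml used in the command could then match it against the first result and include that config line verbatim, which is the behaviour described above.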

In the meantime, a workaround could be something like adding from xml.etree import ElementTree to your config and using ElementTree. Similarly, adding import xml.etree.ElementTree as ET to your config and using ET would also work.
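As a sketch of that workaround, the two import forms below bind a plain name (ElementTree or ET) directly to the submodule, so no attribute lookup through the xml package is needed. The snippet is ordinary Python; the idea that these two import lines sit in the file that PYP_CONFIG_PATH points to is taken from the discussion above, and the rest is only there to show both names resolve.

```python
# Lines that could go in the config file (either one is enough):
from xml.etree import ElementTree
import xml.etree.ElementTree as ET

# The same expression as in the original report, using the bound names.
html = "<html><head><title>Title</title></head></html>"
title_a = ElementTree.fromstring(html).find("head/title").text
title_b = ET.fromstring(html).find("head/title").text
print(title_a, title_b)  # Title Title
```

With the second line in the config, the command would then be written with ET.fromstring(...) in place of xml.etree.ElementTree.fromstring(...).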

hauntsaninja commented 1 year ago

The commit that I just pushed https://github.com/hauntsaninja/pyp/commit/a3f2ebcf3ad48de1399b90c7f1029a3045fea0d1 makes your config example work. I'll see if I can think of improvements that would make your initial version work as well (that are compatible with pyp's mostly static analysis)

tinhb commented 1 year ago

Wow, thank you for the quick fix.

I searched the internet for Python module resolution and found importlib.util.find_spec. Not sure if it can be used.

Say, if I want to use another call xml.dom.minidom.parse(...), is it possible to search level by level?

from importlib.util import find_spec
def spec_of(target):
    spec = find_spec(target)
    return (spec, spec.submodule_search_locations) 

# spec_of('xml.dom.minidom.parse')[1]     # Exception, __path__ not found on xml.dom.minidom (parent?)
# ModuleNotFoundError: __path__ attribute not found on 'xml.dom.minidom' while trying to find 'xml.dom.minidom.parse'

spec_of('xml.dom.minidom')[1] is not None # False, no submodule? (pure guess)
spec_of('xml.dom')[1] is not None         # True, has submodule? (ditto)
spec_of('xml')[1] is not None             # True, has submodule? (ditto)
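
The level-by-level search asked about above could be sketched roughly as follows: walk the dotted name from left to right and keep the longest prefix for which find_spec succeeds. The function name longest_importable_prefix is hypothetical, and this is not something pyp does. Note that find_spec imports parent packages while resolving a dotted name, so this approach executes code and is not purely static.

```python
from importlib.util import find_spec


def longest_importable_prefix(dotted: str):
    """Return the longest leading part of dotted that names a module, or None.

    Hypothetical helper for illustration. find_spec imports parent packages
    as a side effect, so this is a runtime check, not static analysis.
    """
    parts = dotted.split(".")
    found = None
    for end in range(1, len(parts) + 1):
        candidate = ".".join(parts[:end])
        try:
            spec = find_spec(candidate)
        except (ImportError, AttributeError, ValueError):
            # Raised when the parent is a plain module with no __path__,
            # e.g. looking for "parse" inside xml.dom.minidom.
            break
        if spec is None:
            break
        found = candidate
    return found


print(longest_importable_prefix("xml.dom.minidom.parse"))
print(longest_importable_prefix("xml.etree.ElementTree.fromstring"))
```

For the two calls in this thread, the prefixes found should be xml.dom.minidom and xml.etree.ElementTree, which are the modules that would need importing.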

hauntsaninja commented 1 year ago

Yeah, let me think about how to better make some of this stuff work. Not straightforward given the current implementation and static constraints.

In the meantime, imports like import xml.etree.ElementTree as ET / from xml.etree import ElementTree will work without issue.