SolidCode / SolidPython

A python frontend for solid modelling that compiles to OpenSCAD
1.1k stars 171 forks source link

import_scad() imports SCAD objects each time it is called, leading to duplication #172

Closed jeff-dh closed 3 years ago

jeff-dh commented 3 years ago

if you execute the following solidpython code:

from solid import scad_import

mcad1 = scad_import("MCAD")
mcad2 = scad_import("MCAD")

The OpenSCADModule MCAD gets imported twice. This seems to be no big deal on the first sight, but if you're working on a project that consists of multiple files and each file imports a OpenSCAD library of a certain size this costs quite some time (and pollutes the python "space" for no reason).

In my case, MCAD gets imported 5 times (because my assembled object is split into 5 (sub)parts and each resists in a file of its own and imports MCAD).

This issue can be solved in "user-space" by wrapping all scad_import calls into a python module:

#libs.py
mcad = import_scad("MCAD")
bosl = import_scad("BOSL")

#regular_module.py
from libs import mcad
#do something with mcad

.......but....... I think this is actually a SolidPython issue.....

I think SolidPython should "cache" imported modules and return either a reference or a copy of the cached module (not sure about this).

The only way I can think of to solved this is with a global (module wide) dict every imported module gets registered and if a module gets imported (again) you just do a lookup in the dict first.

Another immature idea I had was to add all imported libraries into the solid namespace.

import_scad("MCAD")

solid.libs.MCAD.motors.whaterevermotor(17)

import_scad could still return a reference to solid.libs.mcad to stay backward compatible.

I think this could be cool for libraries but might not be what you want when you import a local scad file a single time, this would need another function

But I like the later syntax more :D

Any opinions and / or ideas?

etjones commented 3 years ago

You're right that this is an issue, and that is should be solved in SolidPython rather than in a module-style workaround. I'd have to look up the exact mechanics of Python's import mechanism, but I think that should be the model. That's basically what you're describing in your first proposed solution. The Python runtime keeps a global dictionary of imported modules so that if import os is in every module you use, that code only gets imported once and otherwise is a dictionary lookup.

There are some corner cases we might have to look at, like whether the following should do one import or two:

mcad1 = import_scad('MCAD') 
mcad2 = import_scad('<$SCAD_PATH>/MCAD')

(I think two, and I think this is how Python's imports work, keyed by how an import is done rather than the resultant module)

In any case, yeah, I think there should be a global dictionary in solid.objects that's referenced in import_scad(). Should be a quick fix

etjones commented 3 years ago

I'll need to test this on a separate branch from the other work, but I think this should do the trick. Can you see any problems?

diff --git a/solid/objects.py b/solid/objects.py
index 2307e80..8d74f64 100644
--- a/solid/objects.py
+++ b/solid/objects.py
@@ -22,6 +22,8 @@ Indexes = Union[Sequence[int], Sequence[Sequence[int]]]
 ScadSize = Union[int, Sequence[float]]
 OpenSCADObjectPlus = Union[OpenSCADObject, Sequence[OpenSCADObject]]

+IMPORTED_SCAD_MODULES: Dict[Path, SimpleNamespace] = {}
+
 class polygon(OpenSCADObject):
     """
     Create a polygon with the specified points and paths.
@@ -761,15 +763,23 @@ def import_scad(scad_file_or_dir: PathStr) -> SimpleNamespace:
         OpenSCAD files. Create Python mappings for all OpenSCAD modules & functions
     Return a namespace or raise ValueError if no scad files found
     '''
+    global IMPORTED_SCAD_MODULES
+    
     scad = Path(scad_file_or_dir)
     candidates: List[Path] = [scad]
-    if not scad.is_absolute():
-        candidates = [d/scad for d in _openscad_library_paths()]

-    for candidate_path in candidates:
-        namespace = _import_scad(candidate_path)
-        if namespace is not None:
-            return namespace
+    ns = IMPORTED_SCAD_MODULES.get(scad)
+    if ns:
+        return ns
+    else:
+        if not scad.is_absolute():
+            candidates = [d/scad for d in _openscad_library_paths()]
+
+        for candidate_path in candidates:
+            namespace = _import_scad(candidate_path)
+            if namespace is not None:
+                IMPORTED_SCAD_MODULES(scad) = namespace
+                return namespace
     raise ValueError(f'Could not find .scad files at or under {scad}. \nLocations searched were: {candidates}')
etjones commented 3 years ago

Resolved & added unit test for this. Note that we could re-import SCAD modules if they're called with equivalent-but-not-identical paths, like:

mod1 = import_scad('MCAD')
mod2 = import_scad('MCAD/../MCAD')

This doesn't seem like a problem and maybe is desirable?

jeff-dh commented 3 years ago

I think(!) this is not "correct". I think Python works not this way and I don't think this is what you want ;-p

I guess the resolved filename (candidate_path) as key is the right way to go.

I don't have a lot time right now, we can discuss this out later (>20h).

jeff-dh commented 3 years ago

that's how -- I guess -- (almost) everything works (filesystems, url lookups, c includes,.....no matter what "path" you use to reference a file, it's the same file if the resolved (absolute) path is the same).

In Python it becomes crazy because you can change the modules and the changes will be "global":

#file a.py
import sys
sys.blablub = 5

#file b.py
import a
import sys
print(sys.blablub)
print(a.sys.blablub)

prints out 5 twice....

etjones commented 3 years ago

sys.modules is keyed by the import string used to import a file (e.g. solid.utils), and not by the final path itself; this means you don't need to do the path search to resolve which file solid.utils should use, and that's my justification for using the scad argument as the key rather than the final unique path, since that would cause every import to recalculate all the possible paths the import argument could mean.

Now, what I can't figure out is if there's any problem with having multiple instances of a module if you want to do something weird like the MCAD/../MCAD example I used above. I guess if you were playing weird games with module import paths and if you were making dynamic changes to those imported SCAD modules and if you were counting on global changes to monkey-patched modules, you could have a problem. But... you shouldn't really do any of those things ;-)

Usually when I ask 'What could go wrong?', it's clear in my mind that plenty of things could go wrong. This time, I'm actually serious. What could go wrong?

jeff-dh commented 3 years ago

But... you shouldn't really do any of those things ;-)

Usually when I ask 'What could go wrong?', it's clear in my mind that plenty of things could go wrong. This time, I'm actually serious. What could go wrong?

Yeah I agree (more or less ;)

The only thing that could "go wrong" would be to waste time importing a module twice. To me that's worse than resolving a filename multiple times but that might be a matter of taste. A solution that would meet both could be to add two dicts. The first does exactly what it does in your patch. If that dict returns none, we could resolve the filename and check in another dict that uses the resolved path as key. That would only resolve the filename if there's no module with the same name and it would only import a module if there's no module with the same name nor with the same resolved filename. But I agree it probably doesn't matter because usually you should import a library always by the same path.

Off-Topic: Regarding Pythons import mechanism: I can't believe python handles it that way. But I actually don't know.

[10 mins later....] ............I justed tested it and it really seems you're right.......... .that seems really strange to me and I think it creates weird side-effects (e.g. a couple of modules can import the same module but use different sets of global variables ?!?!?!? and you don't even know with which of the other modules you share your set of globals unless you check their code.... I could probably receive a private copy of a module (and globals) by importing it on uncommon ways....... how often does your global code get executed (initialization of some global variables).... what about singletons...........)

#solid/abccc.py
print("abc")
a = 5

#solid/__init__.py
from . import abccc

#shell in SolidPython root dir
>>> import solid
abc
>>> import sys
>>> sys.path = ["solid/"] + sys.path
>>> solid.abccc.a = 2
>>> import abccc
abc
>>> abccc.a
5
>>> solid.abccc.a
2

WTF?