Module mixup with same named files.

tokejepsen commented 8 years ago

Description

If you have files named the same, the associated imported modules gets mixed up.

Expected results

Would expect to be able to have same named files, and get the correct imports.

Short reproducible

To reproduce we need three files:

publish.py (file)
- first_plugin (directory)
  - collect.py (file)
- second_plugin (directory)
  - collect.py (file)

publish.py

import os

import pyblish.util

os.environ["PYBLISHPLUGINPATH"] = (os.path.join(os.path.dirname(__file__),
                                                "first_plugin") +
                                   os.pathsep +
                                   os.path.join(os.path.dirname(__file__),
                                                "second_plugin"))

context = pyblish.util.publish()
for result in context.data["results"]:
    for record in result["records"]:
        print record.getMessage()
    if result["error"]:
        print result["error"]

first_plugin/collect.py

import os

import pyblish.api

class FirstCollect(pyblish.api.ContextPlugin):

    order = pyblish.api.CollectorOrder

    def process(self, context):

        self.log.info("Hello from first plugin!")
        self.log.info(os.path)

second_plugin/collect.py

import pyblish.api

class SecondCollect(pyblish.api.ContextPlugin):

    order = pyblish.api.CollectorOrder

    def process(self, context):

        self.log.info("Hello from second plugin!")

When you execute publish.py Pyblish will execute second_plugin/collect.py then first_plugin/collect.py. I think since Pyblish have registered the collect module first with second_plugin/collect.py, that is also the import modules it'll get when executing first_plugin/collect.py, hence the missing os module.

mottosso commented 8 years ago

Sorry, what was the error?

tokejepsen commented 8 years ago

Good point. Here is the expected output from the reproducible:

Hello from second plugin!
Hello from first plugin!
'NoneType' object has no attribute 'path'

mottosso commented 8 years ago

Ah I see.

Yes this is a tricky one. The reason is due to the fact that plug-ins are "imported" anew on each discovery. The standard Python mechanism would require you to reload the module explicitly, or restart the interpreter. But we don't want that, because it would prohobit efficient development of plug-ins in hosts, such as Maya.

A way to work around it, is by putting your imports in your process() method.

def process(self, context):
  import os
  self.log.info(os.path)

The difference here is that it is re-imported everytime the method is called, as opposed to just once during import. Which ultimately isn't very Pythonic either.

We might have to choose, unless anyone can think of another way to approach it.

The underlying problem lies in the fact that Python caches all imports, regardless of whether they are done within the current session, or through a module you imported. In this case, you didn't import os, an imported module did. Yet os was still cached.

Ideally what would happen is that when you load a module, such as a plug-in, you would load it into an isolated environment. An environment you could later flush, including inner-imports like os.

But I don't know how to accomplish that. The only way I can think of is to keep tabs on what imports are made within a plug-in, and make sure to remove that from sys.modules along with the original plug-in. But the edge case there is that if a plug-in is the first to import os, but not the last, then os would have been removed from whomever imported it second. And tracking that is a rabbit hole with no end in sight; at least to me.

So, our options.

Find a way around said rabbit hole
Embed imports with process()
Avoid same-name files
Remove automatic reload

tokejepsen commented 8 years ago

Embed imports with process()

Seems like a good initial solution, though it seems like something you have to (un)teach people and document when people are learning Pyblish. For it to truely work across the board every plugin have to do this.

Is there not a workaround of indirectly doing this? Store all a plugins modules in the class it self.

Avoid same-name files

Definitely the easiest and quickest solution by far, and I've myself done this. But if we have lots of packages (in the future with a package-manager?) and plugins, this could happen more often.

Remove automatic reload

Not an option in my opinion :smile:

iwootten commented 7 years ago

Sorry to dredge up an old issue, but I'm confused - how come the pyblish plugin imports aren't namespaced? Wouldn't that resolve this issue?

mottosso commented 7 years ago

Hi @iwootten, what would be their namespace? pyblish.my_plugin?

iwootten commented 7 years ago

Well, my thought that the name of the project/folder it's being imported from makes sense? This is the standard approach with using other python modules is it not?

bumpybox.plugins.maya.my_plugin?

mottosso commented 7 years ago

That could be troublesome, plug-ins don't necessarily reside within a Python package, but could reside in a plain directory.

tokejepsen commented 7 years ago

Could we have a namespace of the directory?

C:\path\to\plugin.py > path.to.plugin /path/to/plugin.py > path.to.plugin

mottosso commented 7 years ago

We could, but we would be excluding my friend Здравствуйте from Russia, and chào bạn from Vietnam, the paths on their disks aren't importable like those of us with only ASCII letters in our name.

tokejepsen commented 7 years ago

You certainly would exclude them, but would they even be able to import their plugins? If their paths is non-ascii, their plugin names would be too.

mottosso commented 7 years ago

Yes of course they would.

$ cd c:\users\Здравствуйте\github\pyblish-base
$ python -m pyblish --help

Or how about:

$ set PYTHONPATH=c:\users\Здравствуйте\pythonpath
$ python -m nose

Everything on their system is prefixed with this god awful path. I think Windows might even support unicode.

c:\☺\☻\✌\pythonpath

tokejepsen commented 7 years ago

That doesn't illustrate when a plugin is named with non-ascii.

Got a plugin named Здравствуйте.py on my desktop.

(tgbvfx-environment) C:\Users\tje\Documents\conda-git-deployment>python -m pyblish --add-plugin-path "C:\Users\tje\Desktop"
Traceback (most recent call last):
  File "C:\Users\tje\Documents\conda-git-deployment\windows\miniconda\envs\tgbvfx-environment\lib\runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Users\tje\Documents\conda-git-deployment\windows\miniconda\envs\tgbvfx-environment\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "c:\users\tje\documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-base\pyblish\__main__.py", line 10, in <module>
    cli.main(obj={}, prog_name="pyblish")
  File "c:\users\tje\documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-base\pyblish\vendor\click\core.py", line 488, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\tje\documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-base\pyblish\vendor\click\core.py", line 474, in main
    self.invoke(ctx)
  File "c:\users\tje\documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-base\pyblish\vendor\click\core.py", line 732, in invoke
    return Command.invoke(self, ctx)
  File "c:\users\tje\documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-base\pyblish\vendor\click\core.py", line 659, in invoke
    ctx.invoke(self.callback, **ctx.params)
  File "c:\users\tje\documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-base\pyblish\vendor\click\core.py", line 325, in invoke
    return callback(*args, **kwargs)
  File "c:\users\tje\documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-base\pyblish\cli.py", line 196, in main
    available_plugins = api.discover(paths=plugin_paths)
  File "c:\users\tje\documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-base\pyblish\plugin.py", line 1263, in discover
    module = types.ModuleType(str(mod_name))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-11: ordinal not in range(128)

tokejepsen commented 7 years ago

So we could have the path, but just exclude the non-ascii parts of the path?

mottosso commented 7 years ago

That doesn't illustrate when a plugin is named with non-ascii.

Plug-ins can't have non-ascii letters regardless. I thought we were talking about namespaces for plug-ins, and how the namespace could be based on their directory path. But since the directory path could contain letters not compatible with Python (or even most keyboards), that isn't a viable option.

So we could have the path, but just exclude the non-ascii parts of the path?

I get there must be some issue you are facing currently which would resolve itself given this measure, but for the project and everyone using it it simply isn't waterproof enough; it invites further issues down the line, issues that may be very difficult to both report and reproduce. As are all that depend on a given system or region configuration.

What we could do is simply append a randomised string to the name internally; the name of the plug-in internally is entirely synthetic. If you see here you'll find that we are the ones that chooses the name. Python merely caches it based on what name we give it, so if it were unique it would never be discarded.

To avoid filling up the memory with discarded plug-ins, we could make the randomised prefix based on the path and thereby avoid any issue of regionality.

tokejepsen commented 7 years ago

To avoid filling up the memory with discarded plug-ins, we could make the randomised prefix based on the path and thereby avoid any issue of regionality.

So for each unique plugin path, we prefix with maybe UUID?

tokejepsen commented 7 years ago

Or maybe simply prefix with an incrementor like "pyblish_path_1"?

tokejepsen commented 7 years ago

I'm currently running into this issue on revising some plugins.

@mottosso was none of my suggestions any good?

mottosso commented 7 years ago

@mottosso was none of my suggestions any good?

They are close, but not quite there yet. What we need is something that is a situation where the name of each plug-in is unique from each other, but we can't assign it something fully random as it would never get garbage collected.

We need something where reloading each plug-in discards the old one, but that still is unique where file names are the same.

You could have a look at generating a unique ID based on the full path to the plug-in and serialise it into something compatible with Windows, OSX and Unix file systems.

That way it'd be as unique as the path and be reproducible across runs. I think one of the uuid options offer something like that, otherwise you might have more luck running the path through a regular expression to narrow it down into compatible characters. Just keep in mind unicode and other languages in that two parent folders called "swäden" and "sweden" would be identical when you replace the "ä" with an "a".

iwootten commented 7 years ago

I wonder if we need a better way of configuring which plugins should be used within pyblish over and above a simple list of directories set by the environment variable?

This seems to be a common theme with the work @tokejepsen and I are sharing - I'd like the ability to include some plugins, ignore others for instance. Basically a way of configuring the complete plugin flow myself. I could go ahead and register each plugin individually, but then having an official solution would be good too.

mottosso commented 7 years ago

Thanks @iwootten. Could we open up a separate issue about flow control?

iwootten commented 7 years ago

Sure thing, done so here: https://github.com/pyblish/pyblish-base/issues/316

pyblish / pyblish-base

Module mixup with same named files. #306