AcademySoftwareFoundation / rez

An integrated package configuration, build and deployment system for software
https://rez.readthedocs.io
Apache License 2.0
958 stars 340 forks source link

Better Environment Variable control in Rex #737

Open bfloch opened 5 years ago

bfloch commented 5 years ago

(Originated from #707)

Settings and Ops

This is a rather complex topic since we want to support so many use cases so I'll build up to the suggestion based on every possible use case imaginable. It also gives us opportunity to migrate existing settings in a generic configuration.

I don't want to jump into implementation details yet (hope this is doable at all :) ), but let's see if we can come up with use cases best demonstrated via an sample.

References

703

698

661

(more ?)

Use cases

(cases handled by the sample are crossed)

Sample

(Note: This is a completely artificial example to showcase many possible use cases)

from rez.util.config import PlatformDependent # PR 739

env_var_settings = {

    # -------------------------------------------------------------------------
    # Global control inherited(!) by all other variables.

    "*": {

        # Migration of all_parent_variables
        #
        # Full control over variables is achieved given the following parent_variables & actions:
        #
        # clear (No parent variables)
        #  [prepend_all] [prepend_parent->FAILS] [append_parent->FAILS] [$REZ_VARS] [append_all]
        #
        # prepend (parent before rez variables)
        #  [prepend_all] [prepend_parent] [$PARENT_VARS] [append_parent] [$REZ_VARS] [append_all]
        #
        # append (parent after rez variables)
        #  [prepend_all] [$REZ_VARS] [prepend_parent] [$PARENT_VARS] [append_parent] [append_all]
        #
        "parent_variables": "clear"

        # Migration of all_resetting_variables
        # Note: Actually had a problem with understanding the naming, but kept for consistency
        "resetting_variables": False

        # Any variable (!!!) on this level may be "its expected type" OR dict of 
        # ("platform": "its expected type").
        # Note we are only handling platform.system differences.
        "pathsep": PlatformDependent(
            {
                "windows": ";",
                "linux":   ":",
            },
            # Any other platform, except specific one …
            default= "🍎"
        ),
        # Default append, prepend and set operation
        "modify_op": "nativepath",
        # Interestingly this could now also be implemented as followed :-)
        # but we probably still need nativepath on commands() side.
        "modify_op": PlatformDependent(
        {
            "windows": "windowspath",
        },
        default="posixpath"
        ),
    },

    # -------------------------------------------------------------------------
    # Creation of variables without having to rely on bootstrap file or package
    # (Studio must make decision of when to use implicit packages vs. rezconfig)

    # Set custom variables without package.

    "STUDIO_REZCONFIG_VERSION": {
        "parent_variables": "clear",
        "prepend_all": "2019.08.23.2",
    },

    "STUDIO_STORAGE": {
        "parent_variables": "clear",
        "prepend_all": PlatformDependent({
            "windows": "P:\\",
        },
        default="/mnt/production"
        )
    },

    # Sample where a user might run a local STUDIO_CACHING_SERVER
    # [Parent] [This] [Rez]
    "STUDIO_CACHING_SERVERS": {
        "parent_variables": "prepend",
        "append_parent": ["192.168.2.2", "cache.studio.com"], # Note multiple things as list
    },

    # Sample where user might override for debugging, rez is default authority but
    # ultimately use sensible default
    # [Parent] [Rez] [This]
    "STUDIO_LOCATION": {
        "parent_variables": "prepend",
        "append_all": "UNKNOWN",
    },

    # Artificial case where REZ packages always have priority, but we guarantee a fallback
    # coming from Parent Environment or This config
    # [Rez] [Parent] [This]
    "STUDIO_USER_GROUPS": {
        "parent_variables": "append",
        "append_parent": PlatformDependent({
            "windows": ["UPDATING"],
            "linux": ["WORKING"],
            "osx": ["BROKE"],
        })
    },

    # -------------------------------------------------------------------------
    # Common use cases
    # We now can control the "always-on" tools and their order very precisely

    "PATH": {
        # In this sample: Rez first
        "parent_variables": "append",

        "prepend_all": PlatformDependent({
            "windows": "C:\\ImportantTools",
        },
        default="/opt/important_tools"
        ),

        # Here after the system paths
        "append_all": PlatformDependent({
            "windows": ["C:\\WindowsFallbackTools"],
        }),
    },

    "PYTHONPATH": {
        "parent_variables": "clear",
    },

    "CMAKE_MODULE_PATH": {
        "parent_variables": "clear",

        # Nice.
        "pathsep": ";",
        "modify_op": "posixpath",
    },

    "PATHEXT": {
        "pathsep": ";",
        "parent_variables": "prepend",
        "append_parent": PlatformDependent({
            # Does not pollute other platforms (Not sure if this is true, yet)
            "windows": [".PY", ".PS1"],
        },
        default=None
        )
    }
}

Naming, features and details are like always up for discussion. Let me know if I missed anything.

bfloch commented 5 years ago

On platform dicts

EDIT: Wrote a lightweight factory / not type. See post below.

I really like the idea of the platform dependent dictionary. One problem with this is that if the original value is a dict it gets hard to identify as such. It is also hard to verify the schema. I believe there would be value in having this expanded as a feature for any config in rezconfig. I suggest to implement this separately with a special type that behaves like a dict but is explicit.

For example:

from rez.utils.config import PlatformDependent
default_shell = PlatformDependend({
    "windows": "pwsh",
    "*": None # Default
})

If all of you agree I move this in a separate issue to discuss schema handling etc.

On value deduplication

Deduplication is shell dependent. As a proof of concept we already have this in PATHEXT on the windows platforms. It looks something like this: https://github.com/nerdvegas/rez/pull/698/commits/a3f21db6f308d0130782960a2657dabf3f322d3c

The drawback is that it is evaluating logic and you would do this on each append. An alternative would be to do a single final deduplication pass. I haven't done this on cmd but I guess it would be possible via techniques like:

for /f "tokens=2 delims=:" %%A in ("%%f") Do @Echo %%A

https://superuser.com/questions/1407317/how-to-split-variable-by-colon-in-batch

The idea is that you set it per variable:

"PATH": {
    "deduplicate": True,
    # ...   
},

I am not sure it is necessary to distinguish between deduplication that keeps the earliest, initial (hard as final op) or latest duplicate. I would just go with the earliest for now. Is this sufficient? The final deduplication approach is a bit different when the variable is being referenced along the way, but I guess it is acceptable. Many options. Curious for feedback.

bfloch commented 5 years ago

The platform dependent configuration was way simpler and cleaner to implement then I thought. Please checkout: https://github.com/nerdvegas/rez/pull/739

EDIT: Updated to sample with #739 in place

nerdvegas commented 5 years ago

I think this is good.

One thing I'd say though is that I found the "parent_variables" terminology confusing. I think it's because 'append/prepend' s being used to mean slightly different things - ie, append as in add a path to the end of a set of paths; but also, append as in, append the SET of pre-existing paths to whatever paths rez constructs.

What might be better is "modify_mode" instead (which sits alongside modify_op). Values would be:

Similarly, I think changing all the terminology like "append_all" etc to "after_all", "before_parent" etc would help - it's more intuitive because it describes where an item should appear in a list. Because append/prepend are verbs, it's a bit confusing in the context of multiple different sets of paths (if that makes sense).

But these are just terminology quibbles, overall I think the structure and approach is fine.

RE deduplication: I would dedup along the way, better that the var value be as expected at all times. I would agree though that keeping earliest value is probably good enough, I really can't think of a case where you wouldn't want that.

One thing to consider is what the "parent" value of a var really is. IMO it would be the pre-rez value of the var - but you may have nested rez-envs too, that will have to be taken into account (in that case, I think it should still be the first pre-rez value that is used).

A