ccbrown / needy

A C++ library dependency helper.
https://ccbrown.github.io/needy/
MIT License
55 stars 4 forks source link

Add a means to get build configuration from a central repository #54

Open vmrob opened 8 years ago

vmrob commented 8 years ago

We have multiple projects and multiple libraries with similar (if not the same) build configurations:

libraries:
    openssl:
        repository: git@github.com:openssl/openssl.git
        commit: OpenSSL_1_0_1p
        project:
            build-steps:
                {% if platform == 'osx' %}
                - sh ./Configure darwin64-x86_64-cc --prefix={build_directory}
                - make install
                {% elif platform == 'ios' %}
                - sh ./Configure iphoneos-cross --prefix={build_directory}
                - sh -c 'make install CROSS_TOP=`xcode-select -p`/Platforms/iPhoneOS.platform/Developer CROSS_SDK=iPhoneOS.sdk'
                {% elif platform == 'iossimulator' %}
                - sh ./Configure iphoneos-cross --prefix={build_directory} -D__STRICT_ANSI__=1 no-asm
                - sh -c 'make install CROSS_TOP=`xcode-select -p`/Platforms/iPhoneSimulator.platform/Developer CROSS_SDK=iPhoneSimulator.sdk'
                {% elif platform == 'tvos' %}
                - sh ./Configure iphoneos-cross --prefix={build_directory} -DHAVE_FORK=0
                - sh -c 'make install CROSS_TOP=`xcode-select -p`/Platforms/AppleTVOS.platform/Developer CROSS_SDK=AppleTVOS.sdk'
                {% elif platform == 'tvossimulator' %}
                - sh ./Configure iphoneos-cross --prefix={build_directory} -D__STRICT_ANSI__=1 -DHAVE_FORK=0 no-asm
                - sh -c 'make install CROSS_TOP=`xcode-select -p`/Platforms/AppleTVSimulator.platform/Developer CROSS_SDK=AppleTVSimulator.sdk'
                {% elif platform == 'android' %}
                - sh ./Configure android --prefix={build_directory}
                - make install
                {% elif platform == host_platform %}
                - sh ./config --prefix={build_directory}
                - make install
                {% endif %}

It would be nice if we could back this configuration by a repository. It would be easy to store these externally using a git repository to back it:

Base project:

libraries:
    openssl:
        repository: git@github.com:my-org/needy-recipes.git
        commit: 1.0.0
        recipe: openssl.yaml

Then, in a separate repository:

openssl.yaml @ tag 1.0.0

libraries:
    openssl:
        repository: git@github.com:openssl/openssl.git
        commit: OpenSSL_1_0_1p
        project:
            build-steps:
                {% if platform == 'osx' %}
                - sh ./Configure darwin64-x86_64-cc --prefix={build_directory}
                - make install
                {% elif platform == 'ios' %}
                - sh ./Configure iphoneos-cross --prefix={build_directory}
                - sh -c 'make install CROSS_TOP=`xcode-select -p`/Platforms/iPhoneOS.platform/Developer CROSS_SDK=iPhoneOS.sdk'
                {% elif platform == 'iossimulator' %}
                - sh ./Configure iphoneos-cross --prefix={build_directory} -D__STRICT_ANSI__=1 no-asm
                - sh -c 'make install CROSS_TOP=`xcode-select -p`/Platforms/iPhoneSimulator.platform/Developer CROSS_SDK=iPhoneSimulator.sdk'
                {% elif platform == 'tvos' %}
                - sh ./Configure iphoneos-cross --prefix={build_directory} -DHAVE_FORK=0
                - sh -c 'make install CROSS_TOP=`xcode-select -p`/Platforms/AppleTVOS.platform/Developer CROSS_SDK=AppleTVOS.sdk'
                {% elif platform == 'tvossimulator' %}
                - sh ./Configure iphoneos-cross --prefix={build_directory} -D__STRICT_ANSI__=1 -DHAVE_FORK=0 no-asm
                - sh -c 'make install CROSS_TOP=`xcode-select -p`/Platforms/AppleTVSimulator.platform/Developer CROSS_SDK=AppleTVSimulator.sdk'
                {% elif platform == 'android' %}
                - sh ./Configure android --prefix={build_directory}
                - make install
                {% elif platform == host_platform %}
                - sh ./config --prefix={build_directory}
                - make install
                {% endif %}
ccbrown commented 8 years ago

I want to facilitate this, but I don't think this is something I want to build directly into Needy.

I'd rather do this in Jinja somehow, either through built-in functions or by making the renderer more extensible:

libraries:
    openssl: {{ recipe("git@github.com:my-org/needy-recipes.git", "openssl", "1.0.0") }}
ccbrown commented 8 years ago

Maybe all that's needed to facilitate this is just a more advanced loader:

libraries:
    {% with version="1.0.0" %}
    {% include "git@github.com:my-org/needy-recipes.git/openssl.yaml" %}
    {% endwith %}
vmrob commented 8 years ago

I think I really like the idea of letting the template engine take care of this, but I don't like the first suggestion--it doesn't look reusable enough.

Instead, I think the second example has the right goal in mind, but I think the syntax is off. Specifically, I think the language should be the same as it currently is in needy with the usage of commit. Likewise, I don't think we should attempt to modify the git url by tacking on a file path relative to the root. It should be a separate parameter.

Ideally, I think the template engine should use the same facilities needy already has to work with "sources".

{% with repository("git@github.com:my-org/needy-recipes.git", "OpenSSL_1_0_1p") %}
    {% include "openssl.yaml" %}
{% endwith %}
{% with download("http://iweb.dl.sourceforge.net/project/boost/boost/1.59.0/boost_1_59_0.tar.bz2", checksum="b94de47108b2cdb0f931833a7a9834c2dd3ca46e") %}
    {% include "boost.yaml" %}
{% endwith %}

One question I have is how the nesting should be handled. Do these lines go inside the libraries section? Outside? Is the definition contained inside an included yaml file merged with the current definition? Overwrite? Error on collision? Do local changes override previous definitions?

Initially, I think an error on collision makes sense, but it would also be nice to be able to override things:

{% with download("https://github.com/jedisct1/libsodium/releases/download/1.0.10/libsodium-1.0.10.tar.gz", checksum="f34f78330cf1a4f69acce5f3fc2ada2d4098c7f4") %}
    {% include "libsodium.yaml" %}
{% endwith %}

libraries:
    libsodium:
        configure-args:
            - --disable-shared
ccbrown commented 8 years ago

Well, I don't think that syntax is gonna work. I'm not interested in making modifications or extensions to the Jinja language.

My latter example has the really nice benefit of being 100% Jinja with no custom extensions, functions, built-in variables, etc. It comes very naturally to anyone that's used Jinja elsewhere since the only difference is that the include directive can handle URIs.

So the Jinja docs should be able to answer all of the questions you had. Here's an example:

Your needs.yaml:

libraries:
    {% with version="1.0.0" %}
    {% include "git@github.com:my-org/needy-recipes.git/openssl.yaml" %}
    {% endwith %}

openssl.yaml:

{% macro openssl_commit(version) -%}
    {% if version == "1.0.0" %}OpenSSL_1_0_0{% endif %}
{%- endmacro %}
{
    openssl:
        repository: git@github.com:openssl/openssl.git
        commit: {{ openssl_commit(version) }}
        project: ...
}

The idea here is that you wouldn't fetch the file from a particular commit on the remote repo. I don't think you would really want to do that anyways. If you did, you would end up maintaining a lot of tags or a lot of branches where only one library specification is intended to be used. I would suggest that you always pull the latest version of the specification, but explicitly specify the version of the library you want and any other relevant parameters.

So in that regard, the formulas would more like the way the SaltStack formulas work – You typically want to use the latest, but specify the application or library version you want along with relevant parameters. This does mean though, that you wouldn't be able to easily override things in the recipes unless the recipes specifically allowed it. But I'm definitely fine with that – In fact it would probably be a bad idea anyways to override bits of recipes that weren't meant to be overridden.

For your last example, I think that requires Needy to "do too much". It's certainly not behavior that looks at all natural to me, and I'm really not even sure what the contents of "yaml" would look like. It looks to me like it might require both modifications to the Jinja language and to the YAML parser (to do overrides).

I think my only real qualm regarding my last example is, as you said, using a non-standard git URI. I'd want to look around at what other projects are doing, to see if there's a more common way to include a file path in a git URI.

In summary:

ccbrown commented 8 years ago

To avoid doing non-standard things with Git URIs and be a little more flexible, maybe it would be better to just create a git_include macro:

libraries: {% with version="1.0.0" %} {{ git_include("git@github.com:my-org/needy-recipes.git", "openssl.yaml") }} {% endwith %}

If we did that, it may as well have an optional commit parameter too.

vmrob commented 8 years ago

Good points and thanks for the overview on what would be required with Jinja to make it work. That said, simplicity in the definition of the libraries is really important! I would be very happy if defining a macro wasn't required to have multiple versions of the library–that should be done with something that specializes in version control. I'll also definitely agree that the last of my examples requires needy to do a bit too much.

I think creating a git_include macro would be a great compromise between simple versioning and simplicity of the definition (not requiring macros to be defined in each versioned include). On a side note, can the indentation be corrected by the engine? It would be ideal not to have to worry about the nesting level.

openssl:
    repository: git@github.com:openssl/openssl.git
    commit: 1.0.0
    project: ...

To conquer unpinned recipe versions, it would be trivial to reference a branch instead of a tag. I'm on the fence about whether or not this should be considered best practice. I think my initial rule of thumb would simply be that if the commit property in the recipe references a branch, then the recipe should exist in a branch. If the commit property references a commit hash, it should also be pinned.

The only other variable here is what needy version of Needy to use. While Needy isn't quite at the point where it should be versioned, it's getting close and when it does get there, it's going to be important to be able to specify which version of Needy is compatible with a particular recipe.

I think in the end, the usage I'm hoping for is this:

needs.yaml

libraries:
    {{ git_include("git@github.com:my-org/needy-recipes.git", "v123.123.123", "openssl.yaml") }}

openssl.yaml

openssl:
    repository: git@github.com:openssl/openssl.git
    commit: 1.0.0
    project: ...
ccbrown commented 8 years ago

I would be very happy if defining a macro wasn't required to have multiple versions of the library–that should be done with something that specializes in version control.

Disagreement in what should be done. Like that, you'll either end up with one repo per library and more work than should be necessary to update that (merging updates to several branches if you support more than one version) or one library per repo and coupling / synchronization issues. We're bad enough at keeping our application / dependency versioning under control. That would just make that even harder.

For example...

libraries:
    {{ git_include("git@github.com:my-org/needy-recipes.git", "openssl_1.0.1", "openssl.yaml") }}

If you specify the library version via tag or branch, you end up with a single file in each branch of a sea of branches. You can upgrade it, but this is certainly not what Git specializes in – Browsing the recipes becomes awkward, and you completely lose the ability to share code between recipes.

libraries:
    {{ git_include("git@github.com:my-org/needy-recipes.git", "develop", "openssl.yaml") }}

If you specify your application version via tag or branch, you can put multiple recipes in the branch, and you can upgrade them, but changing the library version without introducing breakages requires you to make a new branch with a new name. It also means modifying this every time you cut release branches (or modifying this every time you tweak the library recipe in any way).

libraries:
    {% with version = "1.0.1" %}
    {{ git_include("git@github.com:my-org/needy-recipes.git", "openssl.yaml") }}
    {% endwith %}

By using the latest version of the recipe, and specifying the version you retain all of the benefits that you've lost above – You can share code, avoid coupling, not have to further complicate an already complicated branching / tagging workflow, and the cost is having to make more generic recipes. Your recipes will need a map or conversion function to turn library version numbers into commit hashes or tags. This cost, however, is actually a boon in my opinion. I think these recipes are better recipes, and mapping version numbers to commits isn't much of a downside if any. Worst case, you do what I demonstrated above, which could be made much more concise with an actual map: {"1.0.0": "abcdef123...", "1.0.1": "321fedcba...", ...} In many cases, libraries have tags that you can just get with a simple string operation.

So I would still encourage this approach... the SaltStack approach (with the option of multiple formulas in a repo). But this is a debate we can have internally if we use this.

On a side note, can the indentation be corrected by the engine?

No. This is the reason for the curly braces wrapping everything in my example openssl.yaml. If you use those, indentation doesn't matter.

In any case... It looks like a git_include macro would satisfy this issue:

macro git_include(repo, file, commit=None)

vmrob commented 8 years ago

Backing up to look at the big picture, multiple projects want to be able to use the same recipe for a common dependency. This doesn't imply that these products are necessary related, in fact, these products might be from different teams or different organizations (as would be the case if our company open sourced our recipes). Whether or not the with version= syntax is available, it also must be possible to point to a specific version (in the git repository commit hash sense) of a specific file. This is a requirement for security and assured backwards compatibility.

Is the above requirement agreeable? If so, then I don't think it matters to me how the alternative syntax is, because users will be able to chose whatever form they want to decide versions.

ccbrown commented 8 years ago

I think there may be some confusion over what my example with syntax does. The version there has no special meaning. It is just a Jinja way to pass variables into includes.

The git_include macro at the end of my last comment includes an optional commit parameter that you can give it.

vmrob commented 8 years ago

Then this discussion is about best practices, right? If everything else boils down to something that's already built and supported by the Jinja engine, then there's nothing left to talk about with respect to what should be added to Needy.

I believe we both agree that the git_include macro is at least a bare minimum.

ccbrown commented 8 years ago

Yeah though there are some questions of implementation... Should this really pull the most recent every time the template is rendered? Or should some sort of caching be done?

vmrob commented 8 years ago

Depends on whether or not a hash is specified. If a hash is specified, there's no reason to every pull again unless the hash changes. If a ref is specified, then there's no way to know if the ref changed without doing a remote update. There's also no way to know if the provided argument is a commit hash or a ref. In general, it's highly frowned upon to change remote tags, so we should probably treat tags like commit hashes.

At the very least, it should be a fast operation to update a bare, depth=1 repository, but those things will add up pretty quickly (~1s cursory testing). Perhaps three optional parameters commit, tag, and ref should be used. The third default argument would be ref, but if commit or tag were used by name instead, an early optimization could be made to reduce refresh time.

In other words, I don't like caching of content that's easily prone to change where change is make or break. I've been bitten more than once by build servers that don't update toolchains to their latest versions due to eager caching.

ccbrown commented 8 years ago

Yeah, the only difficulty would be in using a branch name. I don't think three parameters would be necessary though. We can easily tell after the first fetch what type of parameter was used.

ccbrown commented 8 years ago

I suppose the same problem is actually already present if you used a branch name as the needs commit as well.

Perhaps in both cases, branch names should just be forbidden.

vmrob commented 8 years ago

If branch names are forbidden, then it's a 100% pinned system. I don't think that's necessarily a good idea.

ccbrown commented 8 years ago

Perhaps not, but it might be a feature for later.