Provide public package / determine public use cases

anaconda-graveyard / conda-concourse-ci

Conda-driven Concourse CI for package building

BSD 3-Clause "New" or "Revised" License


Provide public package / determine public use cases #71

Open msarahan opened 6 years ago

msarahan commented 6 years ago

As mentioned at https://github.com/conda-forge/staged-recipes/pull/5274#discussion_r171128023, c3i is not really meant to be an internal project. It has been that way, but if it's useful, we'd like to make it available for external use.

However, in order to do so, we need to get an idea of what exactly people want to do. Because Conda has lacked a well-specified API, it has been bitten badly in the past by external projects reaching into its internals. This has resulted in a lot of community churn when implementation details in conda change.

This issue is intended to collect user stories and use cases for the initial public offering of c3i, which will guide the public-facing API.

isuruf commented 6 years ago

conda-forge's use case is to build a set of recipes in a folder with inter-dependencies. I can get the graph of dependencies from construct_graph, sort it topologically and build them in order.

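import conda_build.api
import networkx as nx
from conda_concourse_ci.compute_build_graph import construct_graph  # import path assumed from the c3i source layout

# recipes_dir, worker, conda_resolve, folders and config are set up by the caller
G = construct_graph(recipes_dir, worker=worker, run='build', conda_resolve=conda_resolve,
                    folders=folders, config=config, finalize=True)
# The reversal implies edges point from a package to its dependencies: dependencies get built first
order = list(reversed(list(nx.topological_sort(G))))
for node in order:
    conda_build.api.build(G.nodes[node]['meta'])  # G.nodes replaces G.node, which was removed in networkx 2.4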

msarahan commented 6 years ago

Public functionality:

Given a list of folder names, a platform, and optionally a conda_build_config.yaml file, render the recipes in those folders and build a directed graph of metadata objects.
Collapse the graph of metadata objects so that each output is merged into a node representing its parent recipe. The fundamental build target is the folder, so the folder, rather than the metadata object for an individual output, is what gets fed to later build commands.
Detect changed folders within some collection of recipe folders or subfolders.
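As an illustration of the third item, here is a minimal sketch of mapping changed files to recipe folders. It is not c3i's implementation: the function name `changed_recipe_folders` is hypothetical, and the list of changed files is assumed to come from something like `git diff --name-only` between two revisions.

```python
from pathlib import PurePosixPath


def changed_recipe_folders(changed_files, recipe_folders):
    """Return the recipe folders that contain at least one changed file.

    changed_files: repo-relative file paths (assumed to come from a VCS diff).
    recipe_folders: repo-relative paths of the recipe folders to consider.
    """
    folders = [PurePosixPath(f) for f in recipe_folders]
    changed = set()
    for name in changed_files:
        path = PurePosixPath(name)
        for folder in folders:
            # A file belongs to a folder if the folder is one of its parents.
            if folder in path.parents:
                changed.add(str(folder))
    return sorted(changed)


changed = changed_recipe_folders(
    ["recipes/numpy/meta.yaml", "recipes/scipy/build.sh", "README.md"],
    ["recipes/numpy", "recipes/scipy", "recipes/pandas"],
)
print(changed)  # ['recipes/numpy', 'recipes/scipy']
```

Comparing path components rather than string prefixes keeps a change in `recipes/numpy-extra` from being attributed to `recipes/numpy`.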

(Probably) internal functionality:

Output Concourse plans for job execution.
Upload to and interface with the Concourse server for job management.

msarahan commented 6 years ago

@isuruf, great. That's basically my first 2 entries. You guys haven't run into the need to collapse outputs into their parent node, but as more recipes use multiple outputs (and as you merge our recipes back into CF), this will be necessary.

msarahan commented 6 years ago

For you, arguably, you could just specify the list of recipes and get a build order out. Do you even need to know about the graph behind the scenes? Maybe there's another api function for getting the graph, for the sake of plotting or something, but I think I'd like to hide it for your use case.

isuruf commented 6 years ago

> Do you even need to know about the graph behind the scenes?

No. Just the build order is enough.
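A graph-hiding entry point of the kind discussed here might look roughly like the sketch below. It is only an illustration under stated assumptions: `build_order` is a hypothetical name, the dependency mapping is assumed to have been extracted from rendered recipes already, and the standard-library `graphlib` module (Python 3.9+) stands in for networkx.

```python
from graphlib import TopologicalSorter


def build_order(dependencies):
    """Return recipe folders ordered so that dependencies come first.

    dependencies: maps each recipe folder to the set of recipe folders it
    depends on (assumed to be precomputed from the rendered recipes).
    """
    known = set(dependencies)
    # Ignore dependencies that are not part of the set being built.
    graph = {recipe: set(deps) & known for recipe, deps in dependencies.items()}
    # TopologicalSorter treats the mapped values as predecessors,
    # so static_order() yields dependencies before their dependents.
    return list(TopologicalSorter(graph).static_order())


order = build_order({
    "scipy": {"numpy", "python"},   # "python" is external and is ignored
    "numpy": {"python"},
})
print(order)  # ['numpy', 'scipy']
```

The caller only sees a list; a dependency cycle surfaces as `graphlib.CycleError` instead of a silently wrong order.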

msarahan commented 6 years ago

OK. I'll write a proposed API tomorrow and run it by the team here. It might be worth waiting for it before merging the staged-recipes change, but that's up to you.

msarahan commented 6 years ago

In discussions with the team, we think long-term that this graph computation kind of thing belongs in conda-build itself. It is extremely tightly coupled with conda-build, and anyone who would use such a tool would need to have conda-build installed with it anyway.

In the interest of iterating on it more quickly, we propose spinning out a separate package that does the graph building and manipulation. When we understand the problem domain well enough from using that package, we'll roll it into conda-build, associated with a major version bump.

Thanks @jjhelmus for comments. Pinging @cj-wright so he's aware of this.

CJ-Wright commented 6 years ago

Thank you for pinging me! The graph is going to be the graph for all the recipes in a given build folder? Or is this the entire dep graph?

msarahan commented 6 years ago

It'll be for whatever folders you feed it. If you feed it all folders, it'll be the whole dep graph. I will say that it's pretty slow. We do a lot more rendering than you do with your graph stuff. We'll have to see how to speed it up, and also how to update some larger serialized graph incrementally, then serialize it again.

It remains to be decided what information people need, though. For example, for the full graph, do you want to see dependencies on outputs of recipes, or just collapse them into the parent recipe? From a package build standpoint, all you care about is the latter. For a more accurate understanding of the web of connections, you want the former.
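The difference between the two views can be sketched with plain dictionaries. This is an illustration, not the c3i code: `collapse_outputs`, the `parent_of` mapping, and the package names are all hypothetical.

```python
def collapse_outputs(output_deps, parent_of):
    """Collapse an output-level dependency graph into a recipe-level one.

    output_deps: maps each output name to the set of output names it depends on.
    parent_of: maps each output name to the recipe folder that produces it.
    Returns a mapping of recipe folder -> set of recipe folders it depends on.
    """
    collapsed = {parent: set() for parent in parent_of.values()}
    for output, deps in output_deps.items():
        src = parent_of[output]
        for dep in deps:
            dst = parent_of.get(dep)
            # Skip dependencies outside the graph, and edges between
            # outputs of the same recipe (they would become self-loops).
            if dst is not None and dst != src:
                collapsed[src].add(dst)
    return collapsed


# Detailed view: the recipe "foo" has two outputs, and "bar" depends on one of them.
parent_of = {"libfoo": "foo", "foo-python": "foo", "bar": "bar"}
output_deps = {"libfoo": set(), "foo-python": {"libfoo"}, "bar": {"foo-python"}}

# Build view: only recipe folders and the edges between them remain.
recipe_deps = collapse_outputs(output_deps, parent_of)
print(recipe_deps)  # {'foo': set(), 'bar': {'foo'}}
```

The output-level mapping keeps the fact that `bar` needs `foo-python` specifically; the collapsed mapping only records that the `foo` folder must be built before `bar`.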

jakirkham commented 6 years ago

Possibly of interest @cpcloud (has been working on scourge, which might benefit from this).

isuruf commented 5 years ago

Any updates on this?