DUNE-DAQ / daq-buildtools

Make life for developers easier through a collection of CMake functions and scripts

Integrating release manifest file with quick-start.sh #32

Closed dingp closed 3 years ago

dingp commented 3 years ago

Now that the first versions of the DAQ release manifest files are ready in the daq-release repo, along with a parser script in Python, it would be good to integrate them with quick-start.sh.

The parts of quick-start.sh that will change include:

  1. Set up the products area from cvmfs where pyyaml is available, then set up python3 and pyyaml;
  2. Download the manifest parser;
  3. git clone the daq-release package;
  4. Modify the part where the setup script is generated to use the parser script instead (either generate static setup commands in the setup script, or call the parser inside the setup script itself);
  5. Modify the part that checks out git packages to call the parser script with the --git-checkout option.

brettviren commented 3 years ago

I don't remember seeing any discussion of this so forgive my ignorance but what is a "release manifest file"?

It seems like it has UPS identifiers. Is it for setting up a bunch of UPS products? If so, why not use UPS to handle this with a UPS "umbrella" package?

alessandrothea commented 3 years ago

Hi Pengfei,

OK, so just a couple of preliminary comments waiting for the latest greatest commits to be available:

That's all for now. Waiting for an updated version of parse-manifest.py to comment further

dingp commented 3 years ago

Hi Pengfei,

OK, so just a couple of preliminary comments waiting for the latest greatest commits to be available:

  • I'm not sure that we've ever addressed the location of the manifest files, but I have always imagined them to be on cvmfs and not in a separate repo (that points to cvmfs in any case), at least in a first instance. Delivering them via git seems an additional complication for this initial test.
  • I don't see any significant logical distinction between external and precompiled packages. Instead of specifying the different classes of packages in a manifest file, it would seem more important to have a mechanism for a dependent manifest (e.g. daq-release) to be able to import a dependency (daq-externals) and so on.
  • I have a strong preference to avoid mixing compiled packages and source packages in the manifest at this stage. We can address how best to handle release builds in a second stage, and until then I'd rather not have references to sources.

I would remove the --git-checkout option.

dingp commented 3 years ago

I don't remember seeing any discussion of this so forgive my ignorance but what is a "release manifest file"?

The discussion was not carried out in the consortium previously. It started with how we should have a central source of truth to define what a release looks like.

The file appears to be UPS-centric now, but we may maintain a similar file under another packaging system such as Spack: *product_path would then be the path to a Spack repo.

We will then provide a similar parse-manifest.py that generates a spack.yaml manifest file to define the Spack environment. This can provide a unified user experience between using UPS and Spack.
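A rough sketch of what such a generator might look like follows. It is not the actual parse-manifest.py: the function names are hypothetical, the package names are placeholders, and it assumes that a UPS-style version such as v1_1_0 corresponds to 1.1.0 on the Spack side. It emits the `spack:` / `specs:` layout of a Spack environment file as plain text so that no YAML library is needed.

```python
def ups_to_spack_version(version):
    """Convert a UPS-style version (v1_1_0) to dotted form (1.1.0).

    Assumption: this naming convention holds for every package. Versions
    that do not match it (e.g. "develop") are passed through unchanged.
    """
    if version.startswith("v") and version[1:2].isdigit():
        return version[1:].replace("_", ".")
    return version


def spack_env_yaml(packages):
    """Render (name, version) pairs as the text of a minimal spack.yaml."""
    lines = ["spack:", "  specs:"]
    for name, version in packages:
        lines.append(f"  - {name}@{ups_to_spack_version(version)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder package names, for illustration only.
    print(spack_env_yaml([("extlib", "v1_2_3"), ("daqapp", "develop")]), end="")
```

A real generator would also have to map UPS product names and qualifiers onto Spack package names and variants, which this sketch does not attempt.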

It seems like it has UPS identifiers. Is it for setting up a bunch of UPS products? If so, why not use UPS to handle this with a UPS "umbrella" package?

This is a great idea. I'll make an umbrella package. This does not preclude the necessity of manifest files though.

alessandrothea commented 3 years ago

We have different views on the role of the manifest file and the parser at this stage. It would be good to come to a common understanding before going further.

From my perspective there are two distinct problems we need to solve:

  1. consistently build a set of packages
  2. setup an environment based on a pre-defined package set.

At the moment we're trying to solve 2. using a manifest file. Whether 1. can be solved by the same file/format is a question for later. Therefore I would like to concentrate on solving 2. without any trace of or reference to source code.

The organisation of packages into sets (system, externals, daq) becomes valuable if these sets are mapped onto different manifests. YAML may not support this natively, but it's an easy feature to implement in Python.

Basically, the manifest file needs to be no more than a named list of versioned packages with a package path. Even the need for the package path is debatable, as it can be derived from the location of the manifest if the appropriate conventions are used.
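To make the idea concrete, here is a minimal sketch of a manifest as a named list of versioned packages, with one manifest importing another and the package path derived from where the manifest sits. Everything in it is an assumption made for illustration: the `imports` and `packages` keys, the function names and the package names are hypothetical, and JSON stands in for YAML so the example needs only the standard library.

```python
import json
import tempfile
from pathlib import Path


def load_manifest(path, _stack=()):
    """Return {package name: (version, qualifiers, product path)}.

    Imported manifests are loaded first, so entries in the importing
    manifest win on a name clash. The product path is not stored in the
    file; it is taken to be the directory that holds the manifest.
    """
    path = Path(path).resolve()
    if path in _stack:
        raise ValueError(f"circular manifest import: {path}")
    data = json.loads(path.read_text())

    packages = {}
    for imported in data.get("imports", []):
        # Imports are resolved relative to the importing manifest.
        packages.update(load_manifest(path.parent / imported, _stack + (path,)))
    for pkg in data.get("packages", []):
        packages[pkg["name"]] = (
            pkg["version"],
            pkg.get("qualifiers", ""),
            str(path.parent),
        )
    return packages


def setup_commands(packages):
    """Turn a loaded package set into static UPS-style setup lines."""
    lines = []
    for name, (version, qualifiers, _path) in packages.items():
        line = f"setup {name} {version}"
        if qualifiers:
            line += f" -q {qualifiers}"
        lines.append(line)
    return lines


def demo():
    """Write two toy manifests to a temp dir and print the setup lines."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "daq-externals.json").write_text(json.dumps({
            "name": "daq-externals",
            "packages": [
                {"name": "extlib", "version": "v1_2_3", "qualifiers": "e19:prof"},
            ],
        }))
        (root / "daq-release.json").write_text(json.dumps({
            "name": "daq-release",
            "imports": ["daq-externals.json"],
            "packages": [
                {"name": "daqapp", "version": "v1_1_0", "qualifiers": "e19:prof"},
            ],
        }))
        for line in setup_commands(load_manifest(root / "daq-release.json")):
            print(line)


if __name__ == "__main__":
    demo()
```

Nothing here refers to source repositories, and the lines it produces could either be written into a generated setup script or produced at setup time by calling the parser.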

In any case the parser script needs to take care of the repository checkout

In practice, this is my suggested list of actions

The handling of manifests is sufficiently important that it's worth spending a bit of time to get it right. This doesn't need to stop the work on quick_start that can continue in parallel with a temporary workaround.

brettviren commented 3 years ago

@dingp and @alessandrothea thanks for both of your elaborations, I think I have a better understanding now. I think a "single source of truth" pattern for our releases is an excellent idea.

I don't want to seem like I'm pushing anything but please let me mention two things that may be worth considering.

Jsonnet is a language meant to build data structures. It has an "include" mechanism to factor things into different files (something just mentioned as desired in another GH thread). It can be compiled directly to YAML (and JSON and others), and through templates it can produce output in essentially any format. It has parsers in many languages, including Python. Because of this, I think it makes a really good format to hold "SSOT" type info.

In Spack, the equivalent of a UPS "umbrella" is called a "bundle". Here's an example of a bundle in our spack evaluation which builds the current software stack:

https://github.com/DUNE-DAQ/dunedaq-spack/blob/master/packages/dunedaqapps/package.py#L30

Just like we would do in a UPS umbrella's "table" file, in Spack we add sets of dependencies, each tied to one version of the "bundle", so installing it installs the rest. As you say, we can also "bake" this info into a Spack YAML config file. The difference would be that the config file is something that is made by the user, while the "bundle" is "just a Spack package" that can be included in a Spack repo along with the other "real" Spack packages. One could in fact have both patterns.

If either or both of these ideas are interesting to you, please let me know and I'd be happy to expand/prototype/etc. But also, please don't let these ideas derail any happy wheels already in motion!

dingp commented 3 years ago

@brettviren Thanks for the comments! The Spack bundle and the UPS umbrella package do seem to provide similar functionality.

Currently, I am leaning toward having both the bundle/umbrella package and the manifest file. The manifest file may become optional. It will give developers finer control over packages, either overloading them with newer versions or loading additional packages. Developers can set things up quickly by loading just the bundle package or setting up the UPS umbrella package. If they want to try a new version of a dependent package, or if they need additional dependent packages set up, they can put that into the manifest file.

Additionally, the manifest file may have fields telling the packaging system where to find the pre-built packages, etc. We have a lot of flexibility with the manifest file and its parser script. By having this layer in our release model, we can ensure a consistent user experience, in some sense by "hiding" the underlying details of the release and packaging system from the users.
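As an illustration of the "finer control" described above, the sketch below overlays a user's package choices on the package set that a bundle or umbrella package defines. The function name, the dictionary layout and the package names are all hypothetical; this is one possible reading of the overloading behaviour, not an existing parser feature.

```python
def apply_user_manifest(release, user):
    """Overlay a user's package choices on a release package set.

    Both arguments map package name -> version. Returns the merged set
    plus a human-readable list of what changed, so a setup script can
    report how the environment departs from the release.
    """
    merged = dict(release)  # never mutate the release definition
    changes = []
    for name, version in user.items():
        if name not in release:
            changes.append(f"added {name} {version}")
        elif release[name] != version:
            changes.append(f"changed {name} {release[name]} -> {version}")
        merged[name] = version
    return merged, changes


if __name__ == "__main__":
    # Placeholder package names and versions, for illustration only.
    release = {"extlib": "v1_2_3", "daqapp": "v1_1_0"}
    user = {"daqapp": "develop", "extratool": "v0_1_0"}
    merged, changes = apply_user_manifest(release, user)
    print(merged)
    print(changes)
```

Returning the list of changes alongside the merged set keeps the release definition as the reference point: the user manifest only ever records differences from it.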

dingp commented 3 years ago

I made two UPS umbrella packages: dunedaq v1_1_0 -q e19:prof and dunedaq develop -q e19:prof.

The former exists under /cvmfs/dune.openscience.org/dunedaq/DUNE/products, and the latter is under /cvmfs/dune.openscience.org/dunedaq/DUNE/products_dev.

Right now, this is a work in progress. We are looking into whether a single top-level umbrella package is enough, or whether grouping the dependencies into multiple umbrella packages before making the top-level package would be a better approach.

dingp commented 3 years ago

@brettviren In my PR #34, I have two manifest files, the release manifest file and users' manifest file.

With the introduction of the umbrella package, the release manifest file may move to a backstage role, which is to help make the umbrella package. I discussed this a bit more in issue #2 in daq-release.

We can still keep the users' manifest file. It will fill the holes an umbrella package cannot fill:

  1. store parameters related to users' own settings (product path, central repo location, etc.);
  2. overload an existing package with a different version, or load additional dependent packages;
  3. store other user-related settings (e.g. the locations of the working directory and the build/install/log directories, which additional source repos to check out, etc.).

roland-sipos commented 3 years ago

Is this still open? Can someone please close it and mark v2 done?

dingp commented 3 years ago

Closing this issue for now since we moved away from quick-start.sh, and have a similar release manifest file as proposed in this issue.

We may come back to this issue, since we do plan to improve the structure of the manifest file (and possibly move to the YAML format as proposed here).