ctuning / ck-env

CK repository with components and automation actions to enable portable workflows across diverse platforms including Linux, Windows, MacOS and Android. It includes software detection plugins and meta packages (code, data sets, models, scripts, etc) with the possibility of multiple versions to co-exist in a user or system environment.
https://github.com/mlcommons/ck
BSD 3-Clause "New" or "Revised" License

A cumulative September'2020 update from dividiti #108

Closed ens-lg4 closed 4 years ago

ens-lg4 commented 4 years ago
  1. September:

    • A new package for building CMake from source
    • Support for a new key "pass_matching_tags_to" that provides an easy way to synchronize otherwise separate dependencies (for example, a model with specific image dimensions and a dataset with the same image dimensions)
  2. September:

    • A new interface method in module:misc to return the list of supported variation tags (returns a list and optionally prints the same list if in console mode)
  3. September:

    • A new multi-version package for installing pre-built cmake for different platforms
  4. September:

    • New soft:lib.gtest and package:lib-gtest to support detection and installation of the GoogleTest library
  5. September:

    • Bugfix: retain compatibility with Python2, just in case
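
To illustrate the "returns a list and optionally prints it in console mode" behaviour, here is a rough sketch, not the actual module:misc code. It assumes the usual CK convention of a single input dictionary `i`, a result dictionary with a `return` code, and `out == 'con'` meaning console mode. The function name `list_variations`, the `meta` input key and the `variations` result key are made up for illustration.

```python
def list_variations(i):
    """Return the variation tags found in a (hypothetical) meta dictionary.

    Input:  i['meta'] - dict that may contain a 'variations' sub-dict
            i['out']  - if 'con', also print the list to the console
    Output: {'return': 0, 'variations': [...sorted names...]}
    """
    meta = i.get('meta', {})
    names = sorted(meta.get('variations', {}).keys())

    # Print only when called interactively (console mode)
    if i.get('out', '') == 'con':
        for name in names:
            print(name)

    return {'return': 0, 'variations': names}


# Called programmatically: nothing is printed, the list is just returned
result = list_variations({'meta': {'variations': {'rel.3.18': {}, 'rel.3.17': {}}}})
```

Returning the list unconditionally and printing only in console mode keeps the same entry point usable both from the command line and from other modules.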
gfursin commented 4 years ago

Hi Leo, most of it looks fine. I just have two questions:

1) may I ask you to give an example of "pass_matching_tags_to" usage? I think it's a useful functionality and I would like to understand it a bit better. From what I see, it's backwards compatible so it is just for my knowledge...

2) File module/env/module.py Line 1587. Are you sure that 'tags' is always in dd? I just don't remember ;) ... I usually do dd.get('tags',[]) just in case.

Thanks!

ens-lg4 commented 4 years ago

@gfursin , thanks for the review!

  1. Most programs in ck-mlperf and some in ck-tensorrt now use this mechanism. The principle is very simple: one of the dependencies that resolves earlier (in the sort order) is allowed to add one or more tags to any other dependencies to be resolved later. This gives us an opportunity to constrain the later-resolving dependencies.

For example, in https://github.com/dividiti/ck-mlperf/blob/master/program/image-classification-tflite-loadgen/.cm/meta.json the weights dependency (with sort order 30) resolves first. It then takes its tag that starts with "side." and adds it to the list of tags of the images dependency (with sort order 35). So if we have already picked (by any means) a model that works with 224x224 images, its tag side.224 is passed to the images dependency, which then ignores datasets with other image sizes.
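
The principle can be sketched in a few lines of Python. This is a toy model, not the actual CK resolver: the function names, the toy catalogue, and in particular the shape of the "pass_matching_tags_to" value (assumed here to map a tag prefix to a list of dependency names) are assumptions made for illustration.

```python
def resolve_deps(deps, pick_entry):
    """Resolve dependencies in sort order, passing matching tags forward.

    deps:       name -> {'sort': int, 'tags': [...],
                         'pass_matching_tags_to': {prefix: [dep names]}}
    pick_entry: callable(name, required_tags) -> tags of the chosen entry
    """
    resolved = {}
    for name in sorted(deps, key=lambda n: deps[n].get('sort', 0)):
        dep = deps[name]
        entry_tags = pick_entry(name, dep.get('tags', []))
        resolved[name] = entry_tags

        # Copy tags with a matching prefix to the later-resolving dependencies
        for prefix, targets in dep.get('pass_matching_tags_to', {}).items():
            matching = [t for t in entry_tags if t.startswith(prefix)]
            for target in targets:
                target_tags = deps[target].setdefault('tags', [])
                for tag in matching:
                    if tag not in target_tags:
                        target_tags.append(tag)
    return resolved


# Toy catalogue of installed entries (hypothetical)
CATALOGUE = {
    'weights': [['model', 'tflite', 'side.224']],
    'images': [['dataset', 'imagenet', 'side.299'],
               ['dataset', 'imagenet', 'side.224']],
}


def pick_entry(name, required_tags):
    """Return the first catalogue entry that carries all required tags."""
    for entry_tags in CATALOGUE[name]:
        if set(required_tags) <= set(entry_tags):
            return entry_tags
    raise LookupError('no entry for %s with tags %s' % (name, required_tags))


deps = {
    'weights': {'sort': 30, 'tags': ['model'],
                'pass_matching_tags_to': {'side.': ['images']}},
    'images': {'sort': 35, 'tags': ['dataset']},
}
resolved = resolve_deps(deps, pick_entry)
```

Without the passed tag, the images dependency in this toy setup would simply pick the first dataset in the catalogue (side.299); with it, only the side.224 dataset qualifies.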

  2. When working with environment entries, we heavily rely on them having tags. Environment entries are either created by installing a package (all of which have tags) or detected by a soft entry (all of which also have tags). So I think it's a fair assumption.

We probably even want to know about the ones that don't have tags, to diagnose a potential problem early :)
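
The trade-off between the two access styles can be sketched as follows. This is illustrative only, not the code at the line in question: the helper names and the `uoa` parameter are made up, and the third variant merely mimics the CK habit of returning a dictionary with a `return` code and an `error` message.

```python
def tags_strict(dd):
    # Fails loudly (KeyError) if an environment entry has no tags
    return dd['tags']


def tags_lenient(dd):
    # Never fails; an entry without tags silently looks like "no tags"
    return dd.get('tags', [])


def tags_checked(dd, uoa='?'):
    # Middle ground: report the problem explicitly instead of raising
    if 'tags' not in dd:
        return {'return': 1,
                'error': "environment entry '{}' has no tags".format(uoa)}
    return {'return': 0, 'tags': dd['tags']}


good = {'tags': ['lib', 'gtest']}
bad = {}
```

The strict form surfaces a malformed entry at the point of use, the lenient form hides it, and the checked form surfaces it with a message that names the entry.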

gfursin commented 4 years ago

Cool! Thanks for the clarification @ens-lg4 . Both answers make total sense! I am merging the PR.