amperity / lein-monolith

Leiningen plugin for working with monorepos.

Topological sort info by tiers #86

Open biancazzurri opened 3 years ago

biancazzurri commented 3 years ago

Hello and thank you for your work :)

Question: I want to be able to query the tiers of the topology. For example, given a dependency graph in which projects A and B are libraries and C depends on both A and B, I would like to receive a structure like [[A, B], [C]], i.e. the projects within each inner vector are independent of one another. I need this information to make the build work with the parallelism API in CircleCI.
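To make the requested structure concrete, here is a minimal sketch of grouping a dependency graph into tiers. It assumes a hand-written map from each project to the set of projects it depends on; the function name `dependency-tiers` and the map shape are hypothetical and are not part of lein-monolith's API.

```clojure
(defn dependency-tiers
  "Groups projects into tiers so that every project only depends on projects
  in earlier tiers. `deps` is a hypothetical map of project -> set of its
  dependencies. Dependencies that are not keys of `deps` are treated as
  external and ignored. Throws if the remaining projects form a cycle."
  [deps]
  (loop [remaining deps
         tiers []]
    (if (empty? remaining)
      tiers
      (let [pending (set (keys remaining))
            ;; A project is ready once none of its dependencies are pending.
            ready (->> remaining
                       (filter (fn [[_ ds]] (not-any? pending ds)))
                       (map key)
                       sort
                       vec)]
        (when (empty? ready)
          (throw (ex-info "Dependency cycle detected"
                          {:remaining pending})))
        (recur (apply dissoc remaining ready)
               (conj tiers ready))))))

;; A and B are libraries, C depends on both.
(dependency-tiers {:a #{}, :b #{}, :c #{:a :b}})
;; => [[:a :b] [:c]]
```

Projects inside one tier are sorted only to make the output deterministic; they have no ordering constraint between them.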

greglook commented 3 years ago

Could you describe how you'd use that to parallelize the projects in CircleCI? At a guess, you want to run the logically-independent projects in separate containers, but it seems like the overhead of managing all of the caches between those runs would dominate the time to actually build or test the projects.

What we do at Amperity is to have one job that does a parallelized install (using the lein monolith each :parallel X option) to build a Maven cache with all our projects, then have N parallel jobs that use the selector functionality to distribute the tests across the containers:

(defn mod-selector
  "Given a pair of environment variable names, parse them to determine the
  current and total nodes to distribute the projects over. `base` is the
  value of the first node index (0 when indices are zero-based). Returns a
  function suitable for use as a `:project-selector`. If the variables are
  not present in the environment, prints a warning and returns a selector
  that accepts every project."
  [base index-var total-var]
  (let [index (System/getenv index-var)
        total (System/getenv total-var)]
    (if (and index total)
      ;; Distribute projects across nodes.
      (let [i (Integer/parseInt index)
            n (Integer/parseInt total)]
        (fn selected?
          [project]
          (= (- i base) (rem (:monolith/index project) n))))
      ;; Missing vars, print warning and accept everything.
      (do
        (binding [*out* *err*]
          (println "WARN: Cannot select projects without environment vars"
                   index-var "and" total-var))
        (constantly true)))))

:monolith
{:project-selectors
 {,,,

  ;; This selector picks a subset of projects which modulo to the current
  ;; container index in CircleCI.
  :circle-ci
  (do (load-file "util/test/selectors.clj")
      (amperity.monolith.selectors/mod-selector
        0 "CIRCLE_NODE_INDEX" "CIRCLE_NODE_TOTAL"))}}
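
To illustrate how the modulo rule in the selector above spreads projects over containers, here is a small sketch. It assumes each project map carries a `:monolith/index` value, as the selector reads; the `:name` key, the sample projects, and the helper `projects-by-node` are hypothetical and exist only for this example.

```clojure
(defn projects-by-node
  "Returns a sorted map of zero-based node index -> names of the projects
  whose `:monolith/index` falls on that node when divided among `total`
  nodes. Mirrors the `rem` check in the selector, with a base of 0."
  [projects total]
  (->> projects
       (group-by #(rem (:monolith/index %) total))
       (into (sorted-map)
             (map (fn [[node ps]] [node (mapv :name ps)])))))

;; Hypothetical projects with hand-assigned indices.
(def projects
  (map-indexed (fn [i nm] {:name nm, :monolith/index i})
               ["lib-a" "lib-b" "app-c" "app-d" "app-e"]))

(projects-by-node projects 2)
;; => {0 ["lib-a" "app-c" "app-e"], 1 ["lib-b" "app-d"]}
```

Every project lands on exactly one node, so the N test jobs together cover the whole repository without overlap.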

biancazzurri commented 3 years ago

I am thinking about something like what is described here: https://circleci.com/blog/how-bolt-optimized-their-ci-pipeline-to-reduce-their-test-run-time-by-over-3x/?utm_medium=SEM&utm_source=gnb&utm_campaign=SEM-gb-DSA-Eng-emea&utm_content=&utm_term=dynamicSearch-&gclid=Cj0KCQiA1KiBBhCcARIsAPWqoSrClC3Z1Nj5ETj7-kN-1prwH1yJIW9UiVG6YnrZwB-NB1szx0r5iaQaAgodEALw_wcB

So for every project in the monorepo I would have a boolean parameter, set by using the monolith plugin to check what has changed, and then parallelize the work accordingly. The maximum parallelism I can achieve this way is the largest number of independent projects on any single level of the dependency tree.