Octachron / codept

Contextual Ocaml DEPendencies Tool: alternative ocaml dependency analyzer

Codept intends to be a dependency solver for OCaml projects and an alternative to ocamldep. Its major features, compared to ocamldep, are described below.

Both ocamldep and codept compute an over-approximation of the dependency graph of OCaml projects. However, codept uses whole project analysis to reduce the number of fictitious dependencies inferred at the project scale, whereas ocamldep is, by design, limited to local file analysis.

Consequently, bugs notwithstanding, codept computes an exact dependency graph in any situation that does not involve unknown external modules or first class modules, and is still reliable in some standard use cases of first class modules (see this riddle as an illustration of why first class modules can be problematic).

Moreover, codept will emit warning messages any time it encounters a source of potential inaccuracies in the dependency graph. In other words, if no warnings are emitted by codept, the computed dependencies are guaranteed to be exact.

Another important point is that codept's whole project analysis makes it possible to handle uniformly the delayed dependencies of module aliases introduced by the compiler's -no-alias-deps option.

Nested module hierarchies are also fully supported by codept starting with version 0.10. In particular, when the -nested option is enabled, a file with path dir_1/…/dir_N/a.ml will be mapped to the module Dir_1.….Dir_N.A instead of just A.
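The naming convention above can be sketched in a few lines of OCaml. This is only an illustration of the mapping from a file path to a module path; module_path_of_file is a hypothetical helper, not a function exposed by codept, and it assumes '/' as the directory separator.

```ocaml
(* Hypothetical helper, not part of codept: maps a relative file path to the
   module path that the nested convention described above would give it.
   Assumes '/' is the directory separator. *)
let module_path_of_file (path : string) : string list =
  Filename.remove_extension path
  |> String.split_on_char '/'
  (* Drop empty components and "." so that "./a.ml" behaves like "a.ml". *)
  |> List.filter (fun s -> s <> "" && s <> ".")
  |> List.map String.capitalize_ascii

let () =
  (* prints "Lib.Parsing.Lexer" *)
  print_endline (String.concat "." (module_path_of_file "lib/parsing/lexer.ml"))
```

Without the nested convention, the same file would simply be mapped to the last component, Lexer.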

Finally, in situations where dependencies up to transitive closure are not precise enough, codept's experimental -expand-deps option can track dependencies induced by type aliases more precisely, making it easier, for instance, to track all the cmi files required to compile a given file.

Basic performance measurements indicate that the average time increase compared to ocamldep ranges between 10% and 50%.

Cycle detection

As an illustration of codept error messages, in the presence of a cycle in the dependency graph, codept will emit a message detailing both the cycle and where the cycle was introduced in the source tree.

For instance, on the following set of four files a.ml, b.ml, c.ml and d.ml

(* a.ml *)
open B
(* b.ml *)
2;;
open C
(* c.ml *)
2;
3;;
open D
(* d.ml *)
2;
3;
4;;
open A

codept yields:

$ codept a.ml b.ml c.ml d.ml
[Fatal error]: Solver failure
   −Circular dependencies:
      A −(a.ml:l2.5−6)⟶
      B −(b.ml:l3.5−6)⟶
      C −(c.ml:l4.5−6)⟶
      D −(d.ml:l5.5−6)⟶ A

By default, this error is fatal and codept stops here. When prototyping, it can be useful to ignore this setback and compute approximate dependencies even in the presence of cycles. This can be achieved with the -k option, which sets up codept to ignore any recoverable errors. Note that the cycle approximation is not yet specified and is especially unwieldy when combined with the -no-alias-deps option.

First class module limitations

In the presence of first class modules, codept may infer fictitious dependencies. More precisely, if a first class module whose signature cannot be locally inferred at the point of binding is opened or included, then subsequent accesses to submodules of this first class module will generate fictitious dependencies. For instance, for a file a.ml such as

(* a.ml *)
module type S = sig module B: sig end end
let f (x : (module S)) = x
let g x =
  let (module M) = f x in
  let open M in
  let open B in
  ()

codept a.ml generates a fictitious B dependency and one warning

[Warning]: a.ml:l6.11−12,
  First-class module M was opened while its signature was unknown.

Codept overview

In more detail, codept works by combining three main ingredients:

Currently, these three elements are then used in one of the two basic solvers implemented (see Solver).

Given a list of ".ml" and ".mli" files and a starting environment, the default solver iterates over the list of unresolved files and tries to compute their signatures. If the computation is successful, the resulting signature is added to the current environment and the current file is removed from the list of unresolved files. Otherwise the solver continues its iteration.

The directed solver starts with a list of leaf modules and then traces back the ancestors of those leaf modules.
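As a rough sketch of the directed strategy, the following toy code walks a dependency table starting from a list of leaf modules. It assumes that "ancestors" means the modules a leaf depends on, directly or transitively; the table deps_of and the function reachable are hypothetical and stand in for the signature computation that codept actually performs.

```ocaml
(* Toy dependency table: module name -> modules it depends on.
   The names are illustrative only. *)
let deps_of = function
  | "Main" -> ["Parser"; "Printer"]
  | "Parser" -> ["Lexer"]
  | "Unused" -> ["Lexer"]
  | _ -> []

(* Collect the leaves and every module reachable from them.
   The [seen] list also guards against looping on cyclic tables. *)
let reachable (leaves : string list) : string list =
  let rec visit seen m =
    if List.mem m seen then seen
    else List.fold_left visit (m :: seen) (deps_of m)
  in
  List.sort_uniq compare (List.fold_left visit [] leaves)

let () =
  (* "Unused" is never reached from "Main", so it is never visited. *)
  assert (reachable ["Main"] = ["Lexer"; "Main"; "Parser"; "Printer"])
```

In this model, only the modules that the leaves actually need are ever visited.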

Cycles and non-resolvable dependencies are detected when the solver does not make any progress after one cycle of iterations over unresolved files.
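The default solver's iteration and its no-progress check can be modelled with a small fixpoint loop. This is a simplified sketch, not codept's implementation: a unit is reduced to its name and the names it depends on, whereas codept computes full module signatures, and the environment here is updated once per pass rather than after each file. The types unit_ and outcome and the function solve are hypothetical.

```ocaml
(* Toy model: a compilation unit is its name plus the names it needs. *)
type unit_ = { name : string; deps : string list }

type outcome =
  | Resolved of string list  (* units, in the order they were resolved *)
  | Stuck of string list     (* units left over: cycle or unknown module *)

let solve (units : unit_ list) : outcome =
  (* [resolved] plays the role of the environment (most recent first). *)
  let rec loop resolved pending =
    (* A unit is ready once all its dependencies are in the environment. *)
    let ready, blocked =
      List.partition
        (fun u -> List.for_all (fun d -> List.mem d resolved) u.deps)
        pending
    in
    let resolved =
      List.rev_append (List.map (fun u -> u.name) ready) resolved
    in
    match ready, blocked with
    | _, [] -> Resolved (List.rev resolved)
    (* A full pass without progress: report the remaining units. *)
    | [], _ -> Stuck (List.map (fun u -> u.name) blocked)
    | _, _ -> loop resolved blocked
  in
  loop [] units

let () =
  let cyclic =
    [ { name = "A"; deps = ["B"] }; { name = "B"; deps = ["C"] };
      { name = "C"; deps = ["D"] }; { name = "D"; deps = ["A"] } ]
  in
  assert (solve cyclic = Stuck ["A"; "B"; "C"; "D"])
```

The four-unit example mirrors the a.ml, b.ml, c.ml, d.ml cycle shown earlier: no unit can be resolved in the first pass, so the loop stops immediately.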

Usage

Codept can be used as a drop-in replacement for ocamldep, on Linux at least. More tests are needed on other platforms. Unfortunately, most OCaml build systems are built around ocamldep's limitations and would not benefit directly from replacing ocamldep with codept.

A possible exception is self-cycle detection: invoking codept on the following "a.ml" file

(* a.ml *)
open A

yields directly a circular dependency error:

[Fatal error]: Solver failure
  −Circular dependencies:  A −(a.ml:l2.5−6)⟶ A

See the integration section for a better overview of how to use codept with OCaml build tools.

Compatibility with ocamldep

Most of the ocamldep options are also supported by codept.

However, some of the ocamldep options are slightly reinterpreted:

Another possible difference between codept and ocamldep output is codept's built-in detection of dependency cycles. With default options, a cycle triggers a fatal error message within codept and stops the current analysis. If the option -k is specified, the analysis tries to go on by ignoring the submodule structure of the cycle when inside the cycle.

Codept-only options

Structured output

Some new options make it possible to serialize files in s-expression format for later use by codept:

Solver and outliner options

Some new options modify the behavior of either the solver or the outliner used by codept

Findlib options

All options available with ocamlfind ocamldep are also available for codept, in particular:

Miscellaneous output

Other new options explore codept possibilities and intermediary representations

Warning and error messages

Warning and error messages, collectively referred to as fault messages, can be controlled extensively:

For a more exhaustive list of options, see codept's man page.

Installation

For the dev version: opam pin add codept https://github.com/Octachron/codept.git

Version 0.9 is currently available on opam.

Integration with build tools

Like ocamldep, codept can be used to generate a .depends file for integration with a Makefile-based infrastructure.

Combining codept with ocamlbuild currently requires more effort. A library is provided by the codept.ocamlbuild subpackage to ease such integration. Depending on the constraints, codept can be enabled by writing the following simple ocamlbuild file

(*myocamlbuild.ml*)
Codept_ocamlbuild.dispatch ()

and adding the -plugin-tag "package(codept.ocamlbuild)" option to the ocamlbuild command line:

ocamlbuild -plugin-tag "package(codept.ocamlbuild)"  …

Better integration with existing tools is still a work in progress.

Comparison between ocamldep and codept

To make the difference between ocamldep and codept more precise: ocamldep introduces fictitious dependencies when opening foreign submodules within a given unit. For instance, if we have two files, a.ml

(* a.ml *)
open B
open C
open D

and b.ml

(* b.ml *)
module C = struct end
module D = struct end

ocamldep -modules a.ml b.ml produces

a.ml: B C D
b.ml:

In this output, the dependencies C and D for the unit A are fictitious. By contrast, codept -modules a.ml b.ml uses all the information available in both files at once and produces the exact dependency graph in this situation:

a.ml: B
b.ml:
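The difference between the two outputs can be reproduced with a toy model. The sketch below is only an illustration under strong simplifying assumptions: a source file is reduced to the submodules it defines and the modules it opens, in order, and nested submodules are ignored. The type source and the functions local_deps and project_deps are hypothetical and do not correspond to the actual ocamldep or codept implementations.

```ocaml
(* Toy model of a source file: the submodules it defines and the
   modules it opens, in order. *)
type source = { name : string; defines : string list; opens : string list }

(* Local analysis: every opened name is reported as a dependency. *)
let local_deps (s : source) : string list = List.sort_uniq compare s.opens

(* Whole-project analysis: an opened name that was brought into scope
   by an earlier open is a local submodule, not a dependency. *)
let project_deps (project : source list) (s : source) : string list =
  let submodules m =
    match List.find_opt (fun u -> u.name = m) project with
    | Some u -> u.defines
    | None -> []
  in
  let step (in_scope, deps) m =
    if List.mem m in_scope then (in_scope, deps)
    else (submodules m @ in_scope, m :: deps)
  in
  let _, deps = List.fold_left step ([], []) s.opens in
  List.sort_uniq compare deps

let a = { name = "A"; defines = []; opens = ["B"; "C"; "D"] }
let b = { name = "B"; defines = ["C"; "D"]; opens = [] }

let () =
  assert (local_deps a = ["B"; "C"; "D"]);     (* over-approximation *)
  assert (project_deps [a; b] a = ["B"])       (* exact in this case *)
```

Knowing what B defines is what allows the second function to discard C and D, and this information is only available when both files are analyzed together.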

Notwithstanding bugs, codept should compute exact dependency graphs when no first class modules are opened or included. In the presence of opened or included first class modules, codept will still compute the exact dependency graph if the signature of the module is present in the pattern binding the module name, for instance with

(* a.ml *)
module type S = sig module C:sig end end
(* b.ml *)
open A
let f (module M:S) =
  let open M in
  let open C in
  ()

codept -modules a.ml b.ml produces

a.ml:
b.ml: A

as expected.

However, if the inference of the first class module signature is more involved, codept will produce an inexact dependency graph:

(* a.ml *)
module type S = sig module B: sig end end
let f (x : (module S)) = x
let g x =
  let (module M) = f x in
  let open M in
  let open B in
  ()

With codept -modules a.ml, this yields a fictitious B dependency

a.ml: B

and emits a warning (and a notification)

[Warning]: a.ml:l6.11−12,
first-class module M was opened while its signature was unknown.
[Notification]: a.ml:l7.11−12,
a non-resolvable module, B, has been replaced by an approximation

To avoid this situation, a possible fix is to add back a signature annotation:

(* a.ml *)
module type S = sig module B: sig end end
let f (x : (module S)) = x
let g x =
  let (module M: S) = f x in
  let open M in
  let open B in
  ()