snowleopard / build

Build Systems à la Carte
MIT License
242 stars 18 forks source link

[WIP] A more realistic make mockup #3

Closed layus closed 6 years ago

layus commented 6 years ago

With this implementation of make makeT, tasks have control over the timestamp of the outputs. With the former implementation (make), they had no control over it.

This implementation allows to reproduce a forged case where make is not minimal. Running it twice makes it rebuild already valid outputs.

It is also better because it is a Timed v => Build () k v build system. The signature shows that make relies on values' own timestamps, and that make is stateless.

Not sure however if these technicalities would improve the paper. It would certainly make it more complex if the same updates are added to the other build systems.

snowleopard commented 6 years ago

@layus Thank you for the experiment!

As you can see, we've just made a major refactoring of build systems using a new abstraction Strategy which allowed us to describe all build systems uniformly as a combination of an algorithm and a strategy.

Regarding the Timed v => approach: we also considered this option, but eventually thought that it wasn't reflecting the reality. It is not really a property of values that they can be timed. It is the property of a store. With out definition of Store i k v it makes sense to encode this as something like TimedValues i =>, which is essentially what we do in the paper, but without the type class -- we simply include the modTime function into the build information i.

With this implementation of make makeT, tasks have control over the timestamp of the outputs. With the former implementation (make), they had no control over it.

I don't understand this comment: make does have full control over it -- it writes timestamps directly into i.

This implementation allows to reproduce a forged case where make is not minimal. Running it twice makes it rebuild already valid outputs.

We can also model this: simply modify the i in a Store i k v between two builds -- this will fool make in the same manner as in your implementation. Or am I missing something?

layus commented 6 years ago

As you can see, we've just made a major refactoring of build systems using a new abstraction Strategy which allowed us to describe all build systems uniformly as a combination of an algorithm and a strategy.

Yes indeed. This is not really something to be merged. See below.

Regarding the Timed v => approach: we also considered this option, but eventually thought that it wasn't reflecting the reality. It is not really a property of values that they can be timed. It is the property of a store. With out definition of Store i k v it makes sense to encode this as something like TimedValues i =>, which is essentially what we do in the paper, but without the type class -- we simply include the modTime function into the build information i.

In real filesystem implementations, it is a property of the values that they can be timed, as a task has full control over the timestamp of its outputs. For example with the touch -t command. But it is indeed not how make should be used. It was designed with the assumption that modification times reflect build times and are not tampered with.

With this implementation of make makeT, tasks have control over the timestamp of the outputs. With the former implementation (make), they had no control over it.

I don't understand this comment: make does have full control over it -- it writes timestamps directly into i.

Yes, make does, but not its tasks.

It is a question of separation of concerns. With make, it is impossible for a task to read, write or otherwise interact with i. A make task is a Task Applicative k v, with no access toi.

The curious oddity about make is that it uses as internal state values that are alterable by the build tasks themselves. Of course, altering the timestamps of files produced in make tasks is a bad practice. Modeling make as you did simply assumes that tracking modification times in the OS is part of make, and not part of the task.

In this Makefile example, the tasks output a file with a fixed timestamp. This is not possible in make, because the timestamp is not visible by the the task. There is no way to implement a Task Applicative k v that chooses its own output timestamp.

# Disable builtin rules for a cleaner `make -d`
MAKEFLAGS += --no-builtin-rules

# touch -t [[CC]YY]MMDDhhmm[.ss] <file>
MMDDhh = 010101

.PHONY: all
all: y z

x:
    touch x -t $(MMDDhh)03
y: x
    touch y -t $(MMDDhh)02
z: x
    touch z -t $(MMDDhh)01

See how the y and ztasks generate outputs that are outdated (older than x) even though they are generated after x ? It is not possible to model this with make, only with makeT.

With makeT, I could implement this with the following code, where the various tasks output a timestamp, and that timestamp is used by make to prune up-to-date tasks. Because the tasks do no respect the monotonicity assumption, makeT rebuilds them uselessly.

taskMakeTrick :: Task Applicative String Time
taskMakeTrick get k = case k of
    "x" -> Just $ (const 3) <$> phony []
    "y" -> Just $ (const 2) <$> phony ["x"]
    "z" -> Just $ (const 1) <$> phony ["x"]
    "all" -> Just $ phony ["y", "z"]
    _   -> Nothing
  where
    phony = foldl (liftA2 seq) (pure (-1)) . map get

From this model, we can see that

make is

  1. minimal,
  2. correct.

makeT is

  1. not minimal (example above),
  2. not correct (an example could be crafted).

The only difference between these two lies in the fact that make internal state is in fact the values themselves. If you refrain from touching timestamps within or outside of gnu make, and if you are on a filesystem that observes monotonic timestamps then you can consider gnu make as a correct and minimal build system. Otherwise, you must admit that gnu make is neither.


Now, I think you model is better, as it represents make as it is intended to work, based on the assumption that timestamps are not altered by the tasks.

This discussion made me see how attached I was to make a robust build system, with strong guarantees that it cannot be tricked or inadvertently

I wanted to play with the idea that make is stateless, and represent that with a Build Applicative () k v. This PR is not intended to be merged, just to play with the idea of different modelling choices.

snowleopard commented 6 years ago

Yes, make does, but not its tasks.

Ah, I see what you mean now. Indeed, we cannot model a touch task as a Task Applicative k v.

Interestingly, the newly introduced Task (MonadState i) k v can model touch. But at the cost of becoming monadic. I wonder if there is an applicative version which could be used to model touch a bit more faithfully.