ndmitchell / build-shootout

Comparison of build program expressive power
88 stars 11 forks source link

Tup support #3

Closed drewm1980 closed 10 years ago

drewm1980 commented 10 years ago

Hello, I'm working on adding tup to the shootout!

Tup prevents you from adding a target that overwrites an existing file. This is a wonderful feature, as it prevents you from accidentally blowing away your source code, etc... if you make a typo while writing and testing your build configuration.

If you change the tooling to not already have created the output files, you can pull my branch (drewm1980/build-shootout) and it ~may work. I'd issue a pull request, but as it stands, this issue breaks the build.

Cheers, Andrew

ndmitchell commented 10 years ago

Awesome! That's great, I was hoping for tup support, as it's clearly got different expressive power to the other build systems on offer. I'll see what I can do to fix it up, then probably add an explicit test for the non-overwriting thing as well.

drewm1980 commented 10 years ago

Hello Neil!

This is fun :) If you're going to add a test case for every stupid thing that tup prevents you from doing to your code accidentally, this is going to be one epic shootout! Want me to contribute some of the more complex tup cases once the trivial one passes? I'm happy to help; just want to avoid work duplication.

Cheers, Andrew

On Fri, Nov 22, 2013 at 1:08 PM, Neil Mitchell notifications@github.comwrote:

Awesome! That's great, I was hoping for tup support, as it's clearly got different expressive power to the other build systems on offer. I'll see what I can do to fix it up, then probably add an explicit test for the non-overwriting thing as well.

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29068016 .

ndmitchell commented 10 years ago

I'm certainly interested in: 1) any case which has a different set of passing examples to every other one; 2) anything a user might reasonably care about. So something which only tup passes would be a worth addition, and a couple of safety features would be good, but not really sure just how small I want to get. I'll try and take a look at how to fix up the driver to support tup in a few hours, and then we can go from there.

drewm1980 commented 10 years ago

To test a lot of tup's safety features, you're going to need to add steps that mutate or replace the build configuration itself. Maybe have files basic-stage1-make or something. Or use more levels of subdirectories...

Any idea why the benchmark takes several seconds per test case? I would expect it to be instantaneous...

On Fri, Nov 22, 2013 at 2:14 PM, Neil Mitchell notifications@github.comwrote:

I'm certainly interested in: 1) any case which has a different set of passing examples to every other one; 2) anything a user might reasonably care about. So something which only tup passes would be a worth addition, and a couple of safety features would be good, but not really sure just how small I want to get. I'll try and take a look at how to fix up the driver to support tup in a few hours, and then we can go from there.

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29071367 .

drewm1980 commented 10 years ago

I pushed several untested tup examples. Do you want examples that ought to work, but are know to fail in your shootout? i.e. repeatable demonstrations of failure? i.e. I know spaces-tup fails, but maybe there is an escape sequence I just don't know yet.

It occurs to me that if you get too fancy with this, you'll be re-implementing a unit test framework; maybe it'd be better to just use an existing one?

On Fri, Nov 22, 2013 at 2:39 PM, Andrew Wagner drewm1980@gmail.com wrote:

To test a lot of tup's safety features, you're going to need to add steps that mutate or replace the build configuration itself. Maybe have files basic-stage1-make or something. Or use more levels of subdirectories...

Any idea why the benchmark takes several seconds per test case? I would expect it to be instantaneous...

On Fri, Nov 22, 2013 at 2:14 PM, Neil Mitchell notifications@github.comwrote:

I'm certainly interested in: 1) any case which has a different set of passing examples to every other one; 2) anything a user might reasonably care about. So something which only tup passes would be a worth addition, and a couple of safety features would be good, but not really sure just how small I want to get. I'll try and take a look at how to fix up the driver to support tup in a few hours, and then we can go from there.

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29071367 .

ndmitchell commented 10 years ago

The benchmark takes several seconds because I deliberately insert sleeps. On Windows the clock resolution is sometimes insufficient, so lots of fast build operations can go wrong. I might do some clever tricks to avoid that, but for the moment, I was keeping it simple. The build does support command line arguments, so you can do "runhaskell Main basic tup` to run only tests which are named basic and have system tup.

I usually find unit testing frameworks don't add too much benefit. I'll leave as is for now, and if it becomes useful, can switch later.

For failures, I currently just wasn't implementing the test. Perhaps we could put them in with .broken, where a failure of the test makes sense? No real idea yet.

ndmitchell commented 10 years ago

Your changes all look sensible, so I merged them via #4 , and I'll fix them up in this repo (it's a new project with few watchers, so I'm not too fussed about breaking the tests for a few hours).

drewm1980 commented 10 years ago

ok, great! I just added a couple more. One positive fallout of this shootout is that several systems might go ahead and implement low hanging features, like .PHONY annotations.

For broken tests, I think you still want them there; and to generate a feature matrix, with "Unsupported", "Supported", and "Unknown". Someone might get annoyed if you have a failing test that declares something unsupported, but I say the burden of proof is on them as long as you're willing to pull their changes! These are not exactly time consuming build files to write.

On Fri, Nov 22, 2013 at 5:43 PM, Neil Mitchell notifications@github.comwrote:

Your changes all look sensible, so I merged them via #4https://github.com/ndmitchell/build-shootout/pull/4, and I'll fix them up in this repo (it's a new project with few watchers, so I'm not too fussed about breaking the tests for a few hours).

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29087787 .

ndmitchell commented 10 years ago

I think I also managed to fix up a bit of the problem. Each build starts with no output files (as you say, it would be dangerous to do so), but you were wiping the .tup directory between builds. That needs to stay. With that done, at least basic passes (I'm off to the pub now, but will check a few more when I get back).

The other advantage to this shootout is you can see small illustrative examples of each build system, for learning. I agree more feature matrix is needed, I was starting small, and really hoping to link to a stack overflow question for every failing test, certainly for well known systems like make.

drewm1980 commented 10 years ago

hmm... tup warns you if you write to a hidden file. If you do that, you probably screwed up a variable expansion and wrote a file that is just the extension.

And if you fix that, it'll probably complain because two rules are writing to the same file. Tup will prevent you from writing the race condition you're using to detect parallelism!

Best to move the file out of the tup hierarchy... maybe add a couple more test cases.

On Fri, Nov 22, 2013 at 6:06 PM, Neil Mitchell notifications@github.comwrote:

I think I also managed to fix up a bit of the problem. Each build starts with no output files (as you say, it would be dangerous to do so), but you were wiping the .tup directory between builds. That needs to stay. With that done, at least basic passes (I'm off to the pub now, but will check a few more when I get back).

The other advantage to this shootout is you can see small illustrative examples of each build system, for learning. I agree more feature matrix is needed, I was starting small, and really hoping to link to a stack overflow question for every failing test, certainly for well known systems like make.

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29090034 .

ndmitchell commented 10 years ago

I think I fixed the driver, now monad2 is failing for me, which I believe is a bug in the tup rules rather than the driver.

ndmitchell commented 10 years ago

If you have any idea how to install tup on Travis, let me know. I get "Package fuse was not found in the pkg-config search path" when I do the obvious thing.

ndmitchell commented 10 years ago

I got further, but still not to the end - errors at https://travis-ci.org/ndmitchell/build-shootout/

drewm1980 commented 10 years ago

I had to google to find out what travis is! I would be very surprised/impressed if it lets code hook into the filesystem to the degree tup expects. I'm looking into the tup errors... it looks like we may be exposing tup bugs (or, well, suboptimalities) now.

On Fri, Nov 22, 2013 at 11:38 PM, Neil Mitchell notifications@github.comwrote:

I got further, but still not to the end - errors at https://travis-ci.org/ndmitchell/build-shootout/

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29114949 .

drewm1980 commented 10 years ago

I think the code will easier to debug if each test is run in a new directory. That way, we don't have to clean up whatever bizzare state is left behind by a test in the temp directory...

I got these working:

basic tup ... Success

include tup ... Success

wildcard tup ... Success

monad1 tup ... Success

monad2 tup ... Success

parallel and monad3 and unchanged are having infrastructure related problems, I think...

On Sat, Nov 23, 2013 at 8:32 AM, Andrew Wagner drewm1980@gmail.com wrote:

I had to google to find out what travis is! I would be very surprised/impressed if it lets code hook into the filesystem to the degree tup expects. I'm looking into the tup errors... it looks like we may be exposing tup bugs (or, well, suboptimalities) now.

On Fri, Nov 22, 2013 at 11:38 PM, Neil Mitchell notifications@github.comwrote:

I got further, but still not to the end - errors at https://travis-ci.org/ndmitchell/build-shootout/

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29114949 .

ndmitchell commented 10 years ago

I merged over everything you have done so far. I agree that using a separate directory per run might make sense. However, I completely wipe the temp directory at the end of each build test, so whatever you are seeing, it isn't due to stale files being left behind. I also backed out the removeFile calls in Main - if you are running normally they have absolutely no effect as I wipe everything at the end. If you fail then I deliberately leave the files around so you can test next time (and on failure the removeFile wouldn't run anyway). Did you see anything change after adding the removeFile?

I can't test tup on Travis because of https://github.com/travis-ci/travis-ci/issues/1100, I don't believe that is fixable, which is a shame. I find Travis to be an excellent system, automatic continuous integration tests, for free, with a high degree of power over what you run. I use it for all my projects.

Regarding the ones you think might be infrastructure related:

I suspect I'll improve the infra so I explicitly support tests marked broken, so things like unchanged would have implementations for all languages, they just wouldn't all obey the rules.

drewm1980 commented 10 years ago

On Sat, Nov 23, 2013 at 10:32 PM, Neil Mitchell notifications@github.comwrote:

I also backed out the removeFile calls in Main - if you are running normally they have absolutely no effect as I wipe everything at the end. If you fail then I deliberately leave the files around so you can test next time (and on failure the removeFile wouldn't run anyway). Did you see anything change after adding the removeFile?

yeah... I wasn't sure whether or not to commit those. I think it was making a difference in one of the monad tests, but I don't remember the details at the moment.

Regarding the ones you think might be infrastructure related:

  • parallel works fine for me. I'm on Windows, what error are you seeing?
  • monad3 isn't in the repo
  • unchanged had 3 separate issues, which I've fixed. After all that, I get a failure that it ran the rule even though the input file didn't change (this appears as run being in the file one more time than expected). Does tup support unchanging files? e.g. if you A depends on B, and C depends on B, and A changes, so you rebuild B but discover it didn't get written to, does dup then skip rebuilding C? Ninja (with restat) and Shake (by default) do, but Make doesn't. Unless its opt in, it looks to me like tup doesn't.

I suspect I'll improve the infra so I explicitly support tests marked broken, so things like unchanged would have implementations for all languages, they just wouldn't all obey the rules.

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29142159 .

Here's a run of the repo with parallel tup removed for me on ubuntu 12.04:

$ ./Main.hs tup

basic tup ... Success

include tup ... Success

wildcard tup ... Success

monad1 tup ... Success

monad2 tup ... Success

unchanged tup ... Success

multiple tup ... * 100% 1) [0.002s] .

tup error: Attempting to insert 'source1' as a generated node when it already exists as a different type (normal file). You can do one of two things to fix this: 1) If this file is really supposed to be created from the command, delete the file from the filesystem and try again. 2) Change your rule in the Tupfile so you aren't trying to overwrite the file. tup error: Error parsing Tupfile line 1 Line was: ':|> multiple-gen input -- %o |> source1 source 2' *\ tup: 1 job failed. Main.hs: System command failed: tup > /dev/null

The one with multiple is the one I think might be infrastructure related.... it might just need one removeFile in that one test to fix... don't have time to look at it at the moment... catching a train.

With parallel tup I get:

$ ./Main.hs tup parallel

parallel tup ... Main.hs: File is wrong: .log

Expected: "start\nstart\nend\nend\n" Got : "start\nend\n"

I've e-mailed the tup-users to find out if running serially on the first build to detect dependency problems is expected behavior, or if it should be parallel the first time. It might also have to do with whether tup init is called before or after the inputs are created...

ndmitchell commented 10 years ago

I can easily imagine multiple is infra related, i will take a look in 5 hours or so. I will look at parallel too, if the first build was serial I would expect start/end/start/end, not just start/end.

ndmitchell commented 10 years ago

Looking at parallel, start/end is definitely wrong. After running the parallel test, do you see both output1 and output2? It looks like only one rule is being run. On Windows with the released tup it works fine.

I looked into multiple. The infra was suffering from the same errors as unchanged, so I fixed all that up. I then added the necessary sh calls to the tup script. After all that, running it fails, with errors suggesting the dependencies aren't all specified. After changing that, I get errors about files being created that are source files. I've left it was you wrote, but the "obvious" things didn't seem to work. From what I can read, tup doesn't support multiple outputs? That would match what I was seeing.

drewm1980 commented 10 years ago

Does windows default to opening files in append mode? I just assumed the second rule was overwriting the file since it was serial, but maybe something else is going on.

Well... tup most certainly supports multiple outputs in the sense of one command generating two outputs; I do that all the time. It's possible you caught some way in which it's inefficient... will look into it...

On Sun, Nov 24, 2013 at 8:25 PM, Neil Mitchell notifications@github.comwrote:

Looking at parallel, start/end is definitely wrong. After running the parallel test, do you see both output1 and output2? It looks like only one rule is being run. On Windows with the released tup it works fine.

I looked into multiple. The infra was suffering from the same errors as unchanged, so I fixed all that up. I then added the necessary sh calls to the tup script. After all that, running it fails, with errors suggesting the dependencies aren't all specified. After changing that, I get errors about files being created that are source files. I've left it was you wrote, but the "obvious" things didn't seem to work. From what I can read, tup doesn't support multiple outputs? That would match what I was seeing.

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29163490 .

ndmitchell commented 10 years ago

It's nothing to do with append mode. For parallel, I generate two files, both in standard write mode, but secretly off to the side I use echo start >> .log to trace when they start/stop.

Tup might support multiple outputs, but does it support a single rule that generates multiple outputs, where both outputs are independently tracked as proper dependencies? It might, but it's a step beyond just multiple outputs. That said, most of the errors I was getting seemed to be close to syntax errors, so it might support it just not the way I was expressing it.

drewm1980 commented 10 years ago

I think suppressing warnings with tup is a bad idea; it will prevent writing a lot of tests where a badly specified build is ~supposed to fail. (and tup shines)

The problem with parallel seems to still be that you're writing to a .log file that tup knows about; if you want to implement a race condition that depends on whether tup does something in parallel or not, it will be easier to do outside the view of tup, i.e. move .log outside of tup's file tree. Keep in mind, if you have to jump through hoops to make a file that depends on whether tup decided to build in parallel or not, that's a feature, not a bug. I could see a future version of tup warning you when your build messes with files that aren't even in the tree, so maybe the threads need to do some IPC with each other. Of course if network devices are files in unix and future-tup sees them getting accessed as files... you could really go to extremes with this!

$ ./Main.hs tup parallel

parallel tup ... * tup errors *

tup warning: Writing to hidden file '/home/awagner/build-shootout/temp/.log' * tup errors * tup warning: Writing to hidden file '/home/awagner/build-shootout/temp/.log' tup warning: Update resulted in 2 warnings Main.hs: File is wrong: .log Expected: "start\nstart\nend\nend\n" Got : "start\nend\n"

Anyway, the test runs in 5 seconds and the outputs are correct.

On Sun, Nov 24, 2013 at 4:15 PM, Neil Mitchell notifications@github.comwrote:

I can easily imagine multiple is infra related, i will take a look in 5 hours or so. I will look at parallel too, if the first build was serial I would expect start/end/start/end, not just start/end.

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29157456 .

ndmitchell commented 10 years ago

Sounds like a tup on Linux/FUSE bug to me? It seems to have dropped some of the updates I'd have expected to get through. On Windows/tup it works fine.

That said, I agree this is definitely and deliberately bypassing the property of build systems - but that's because I have to do so to measure what is really going on. If moving these log files into a different directory works, that seems a reasonable thing to do.

drewm1980 commented 10 years ago

On Mon, Nov 25, 2013 at 12:49 PM, Neil Mitchell notifications@github.comwrote:

Sounds like a tup on Linux/FUSE bug to me? It seems to have dropped some of the updates I'd have expected to get through. On Windows/tup it works fine.

The bug is the other way around; the windows version is failing to prevent you from writing a file-based race condition, while the linux is succeeding at preventing you from doing that :) The linux client is more mature, although some features and performance characteristics are depending on what filesystem mechanisms are used. The linux version isn't perfect yet (there are issues with absolute paths and fuse, for instance).

Let me know when you are writing the .log file in a different place, and I'll rerun it; I bet that will solve the problem. I tried just replacing the path with ../.log, but haskell complained about it... so I'd fix it, but I'm totally baffled by some of the haskell...

run :: String -> Tool -> [Opt] -> IO () run name tool opts = do xs <- mapM (opt tool) opts threadDelay 1000000 let p = last $ 1 : [i | Parallel i <- opts] ...

wildcard :: ([Opt] -> IO ()) -> IO () wildcard run = do x <- randomRIO (1::Int,1000000) let name = "name" ++ show x writeFile (name ++ ".in") "abc" run [Target $ name ++ ".out", Contents (name ++ ".out") "abc"] run [Target $ name ++ ".out", NoChange] writeFile (name ++ ".in") "xyz" run [Target $ name ++ ".out", Contents (name ++ ".out") "xyz"] run [Target $ name ++ ".out", NoChange] ...

It seems like run is a global function, but then you bind it to something local in wildcard, but then when you call it it has the global interface... I think I can figure it out if I stare at it longer... maybe the local "run" is a partially specialized version of the global run... but wildcard isn't just partially specializing the function cause it returns an IO... Once you really grok haskell, does code like this become more legible?

ndmitchell commented 10 years ago

It seems weird for the Linux version to change "AABB" into "AB" as preventing a race condition. Had it reordered the writes to "ABAB" that would have seemed like a legitimate fix of the problem. That said, it's clearly outside the build system, so I'm happy enough to change it.

The change to use ../.log involved barely any Haskell, far more shell script - see the diff I committed. In general I tried to write Main.hs to be the simplest and clearest declarative code which everyone can work with, while Util.hs is the driver, so more advanced. I was hoping most people other than me wouldn't have to go inside Util.hs.

Yes, it's quite legible to me :) - and I think after a while it clicks, although shadowing variable names is something some advanced users still object to. There is a global function run which takes 4 arguments. In test I partially apply run to 2 arguments (line 69), producing a function of 2 remaining arguments, which I pass as the first argument to wildcard. I happen to call this argument run as well, but don't have to. Morally, everything called run does the same thing, so if you are taking the 40 foot view, just think of wildcard calling the global run. However, as you zoom in, you see that certain arguments got filled in earlier, and rather than think too hard I just rely on the type system to check those pieces get threaded through properly.

ndmitchell commented 10 years ago

I've changed all tests so that all log files are in ../.log, which hopefully fixes tup. It still fails the multiple test, but now I think it's failing properly - if you don't pass unchanging I don't think you can pass multiple. I'm not sure what benefit you get from having multiple outputs without unchanging files, other than mild syntactic convenience, which I deliberately don't want to test for.

ndmitchell commented 10 years ago

Good news, I fixed multiple. Required plumbing support for targets through to tup (so you can tup output1 to only build output1). I also changed to avoid all .tmp intermediate files.

Now running runhaskell Main tup works for me :)

drewm1980 commented 10 years ago

Yaaay! Cool!

$ Main.hs tup

basic tup ... Success

parallel tup ... Success

include tup ... Success

wildcard tup ... Success

monad1 tup ... Success

monad2 tup ... Success

unchanged tup ... Success

multiple tup ... Success

The new lua parser may enable monad3 and spaces... maybe I'll tinker with that next... or add a test or two that shows off tup preventing stupid mistakes.

On Tue, Nov 26, 2013 at 7:29 PM, Neil Mitchell notifications@github.comwrote:

Good news, I fixed multiple. Required plumbing support for targets through to tup (so you can tup output1 to only build output1). I also changed to avoid all .tmp intermediate files.

Now running runhaskell Main tup works for me :)

— Reply to this email directly or view it on GitHubhttps://github.com/ndmitchell/build-shootout/issues/3#issuecomment-29318186 .

ndmitchell commented 10 years ago

I certainly believe that a different parser would fix the spaces issue. However, I believe monad3 is showing fundamentally new build system power, so would be slightly surprised if the basic mechanisms of tup support it but the syntax doesn't - only one way to find out.

ndmitchell commented 10 years ago

I think it's probably time to close this bug. The tup support isn't complete, but it's not clear if the remaining cases are places tup can't work, or we just haven't figured it out - it's not clear, but I'll still happily accept pull requests for them. I still welcome new tests that show the extra safety provided by tup.