hakaru-dev / hakaru

A probabilistic programming language
BSD 3-Clause "New" or "Revised" License

Ensure update-archive is testable on all platforms #66

Open yuriy0 opened 7 years ago

yuriy0 commented 7 years ago

Ken said:

I need to be able to test update-archive on Windows, so I have made a pending request for a demo of update-archive on Windows

This should be exactly the same as on *nix (provided one has make and a shell resembling bash, neither of which is terribly difficult to obtain on Windows; personally I use cygwin for both). In particular, there are three modes in which the archive can be (re)built:

  1. From a fresh environment: rm ppaml.mla && make ppaml.mla

  2. Updating from the command line: make ppaml.mla (assuming that ppaml.mla exists, which can be ensured by running the above)

  3. Updating from within Maple: maple -q -c "(proc() Hakaru:-UpdateArchive(); quit; end proc)()" (assuming that ppaml.mla exists)

None of the above should produce any output other than warnings.

The first two must be run from the hakaru/maple directory; the last can be run from anywhere, assuming the mapleinit file properly points to the existing archive.
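
To make the "nothing but warnings" criterion mechanical, the three modes could be wrapped in a small shell check. This is only a sketch: the helper names `non_warning_lines` and `check_quiet` are made up for illustration, and it assumes that warning lines begin with the word "Warning" (as Maple's own warnings do) and that anything else printed counts as a failure.

```shell
# Print every line of stdin that is neither blank nor a warning.
# Assumption: a warning line starts with "Warning" (matched case-insensitively).
non_warning_lines() {
    grep -v -i -E '^[[:space:]]*(warning|$)' || true
}

# Run a command, capture stdout and stderr, and fail if the command
# exits non-zero or prints anything other than warnings.
check_quiet() {
    out=$("$@" 2>&1)
    status=$?
    unexpected=$(printf '%s\n' "$out" | non_warning_lines)
    if [ "$status" -ne 0 ] || [ -n "$unexpected" ]; then
        printf 'FAILED (exit %s): %s\n' "$status" "$*" >&2
        printf '%s\n' "$unexpected" >&2
        return 1
    fi
}

# Hypothetical usage, from the hakaru/maple directory
# (rm -f is used so the first mode also works when no archive exists yet):
#   check_quiet sh -c 'rm -f ppaml.mla && make ppaml.mla'
#   check_quiet make ppaml.mla
#   check_quiet maple -q -c '(proc() Hakaru:-UpdateArchive(); quit; end proc)()'
```

If make echoes its recipe lines on a given platform, those would be flagged as unexpected output; running make with `-s` is one way around that.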


Another issue raised by Ken (and a more important one):

I don't know how to test whether I have preserved 'a clean "build" process that worked both for bootstrap and for incremental updates, on all 3 platforms'

The only way to reliably test "success" or "failure" of this operation is to run all of the test cases (perhaps this is an acceptable test - we should be running all the tests often anyways?); or at least a sufficient number of them to hit all the functions present in Hakaru. If there is a build error, that should be indicated by an actual error while running UpdateArchive; but it is possible that UpdateArchive could itself do the wrong thing, without ever erroring (for example, by omitting a new symbol from the list of symbols saved in the archive).

Potentially, there is a single test case which 'hits' all the functions in Hakaru; and I'm fairly confident that d3 in DisintT is such a test case. But I think that requiring that disint be run to test whether the archive was built successfully is too strong a requirement (it is very unlikely that disint, and only disint, were not built properly). disint may change rapidly and without warning, which could break such a test. Perhaps there should be a separate test file, which contains the input to improve from d3 - that is:

Bind(Bind(Uniform(0, 1), x, Bind(Uniform(0, 1), y, Ret(Pair(x-y, f(x, y))))), p, piecewise(Hakaru:-fst(p) <= t, Ret(Hakaru:-snd(p)), Msum()))

and the test consists of ensuring that fromLO(improve(toLO(<expr>))) does not produce an error.
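
A rough sketch of what such a standalone check could look like from the shell. The helper names `smoke_test_input` and `no_maple_errors` are hypothetical. It assumes that fromLO, improve and toLO are reachable under exactly those names once the archive is loaded (they may need a module prefix or a `with`), and that Maple reports failures on lines starting with "Error".

```shell
# Emit the Maple input for the smoke test: the improve input from d3,
# wrapped in fromLO(improve(toLO(...))) as described above.
# Assumption: fromLO, improve and toLO resolve without further qualification.
smoke_test_input() {
    cat <<'EOF'
e := Bind(Bind(Uniform(0, 1), x, Bind(Uniform(0, 1), y, Ret(Pair(x-y, f(x, y))))), p, piecewise(Hakaru:-fst(p) <= t, Ret(Hakaru:-snd(p)), Msum())):
fromLO(improve(toLO(e))):
EOF
}

# Succeed only if the Maple output read from stdin has no line
# starting with "Error".
no_maple_errors() {
    ! grep -q '^Error'
}

# Hypothetical usage (needs maple on PATH and a mapleinit file that
# points at the freshly built archive):
#   smoke_test_input | maple -q 2>&1 | no_maple_errors && echo "archive OK"
```

The check deliberately ignores what the result looks like, so changes to the simplifier's output would not break it; only an outright error would.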


Another potential difference between Windows and *nix arises when the archive already exists: on *nix it is deleted, but on Windows the existing archive file is emptied to produce a new empty archive. Which of these actually happens should not make a difference. One way to test that this is actually true is to remove the logic which deletes the archive file, and keep only the logic which clobbers each entry in the archive individually.