Test correctness - Githubissues

giovanni-rosotti commented 8 years ago

Only one test is implemented for now (soundwave), but before I go and implement other ones I'd like to get feedback on how I've set it up. MPI in particular was difficult. The solution I came up with is to spawn separate MPI processes to call the actual gandalf executable and then read the output. That little bit of code can easily become a function somewhere in order to be reused. The alternative of running several copies of the python script would have been cumbersome, first because the script is invoked through nose (this could be solved through spawning as well, but we need an executable to run, i.e. a custom MPI runner script), and second because it would require the master process in python to aggregate all the results from the other processes, which to me feels even more ad hoc (or should we do it on the c++ side? We could extend the CopyDataFromSimulation function to copy the data from the other processors). I'm open to suggestions though.

Note also that I removed the openMP tests with clang from travis, so now the builds should be faster (6 versions of the code instead than 8).

If you are wondering what is going on in the Travis file, travis does not allow us to use the system python for reasons beyond my understanding. However we need packages like numpy, which are not provided on travis and would take a long time to install with pip (we cannot use apt-get because that is for the system python). The easiest solution was to install anaconda...

rbooth200 commented 8 years ago

I can't claim to understand all of these changes, but it looks to me that the approach is reasonable. I do have a couple of suggestions though.

Firstly, the 'correct' way to solve this problem would be to properly support MPI through python, but I appreciate that this might be too much work for this release. Instead, I'd suggest we stick with the approach that you have done already (assuming that it works properly). In this case, I think that code you have written is far too useful to be sitting in the soundwave test.

Can you make a more general base class for running simulations in `detached mode'? The simulation could then be 'reattached' in the way you do it already for later reading of the results. I'd suggest that this class should be part of the main python package. I think this would be another case where the template method pattern would be really useful - we could provide a sensible default set of methods that we might want to replace in the tests.

Thoughts?

giovanni-rosotti commented 8 years ago

Yes, what I've done now works correctly. What I don't like is that we have to go through disc I/O. Because we are using MPI, we could actually use MPI itself to aggregate the results on the process running python.

I agree that this should become part of the main python package. It should happen transparently as far as the user is concerned when the function "run" is called. Behind the hood, this can be accomplished with the class "detached mode" you mentioned. I'm not exactly sure how you want to use the template method pattern here, but we can discuss that next week. I'd say that this needs to be done properly now. Avoiding disc I/O is less important.

rbooth200 commented 8 years ago

I agree with you that this is less than ideal, especially the disc IO part of it, but I don't see a way of doing it without major changes to the code, so it should be fine for now. However, the ability to use python to set off a bunch of simulations running in the background, from which you can collect the results later seems like a really good feature.

Is it possible to use run to set off multiple simulations simultaneously without MPI? This might be wanted when you have lots of small jobs. If not then I think the 'detached mode' should call the binary in a similar way to the MPI calls.

The template method pattern may not be needed. But my reasoninig as to why it might be useful is that one could imagine wanting override the default setup behaviour, for example.

rbooth200 commented 8 years ago

The most important aspect of how we do this is making sure we get the interface right. If we do later decide to change behind the scenes how the code deals with the detached simulations, MPI or data transfer for c++ to python we don't want it to end up changing the front-facing code...

giovanni-rosotti commented 8 years ago

See my last commit, what do you think of the interface? I've created a run_async function which returns a Popen standard python object. Essentially this provides two methods: poll and wait, that you can use to know if it has finished (poll) or wait until it has. With MPI, I created a dummy popen object that replicates the same functionality (but requires MPI3, otherwise I can only provide wait). I'll add some explanation in the documentation.

rbooth200 commented 8 years ago

I'm more or less happy with this solution. I have a few of questions: 1) Does it make sense to allow the run id to be specified in the call signature of run_async? Should we require it? 2) Is the run_async() followed by loadsim workflow natural when scripting? In the tests we save the run id which makes this easy, but does the user in a scripting environment always know it? 3) Should we remove the sim from the database of simulations that python keeps or not?

giovanni-rosotti commented 8 years ago

I am not sure I know the answers to all of them, but here are my thoughts:

Run_id will always have a default of course. But maybe it's good to force the user to remember it.
This is connected to the previous one: if the user is forced to remember the run_id, then asking the user to specify the run_id is not a problem... I can also save the run_id in the Popen object, but I'm not sure it's the right thing to do.
This is the only one where I have a strong opinion: no! The user might have a reference to that object somewhere else, and things might go completely wrong then if it's removed from the simulations we are keeping track of. In this way, the worst it can happen is that the user tries to use data associated to this object, and he will wonder why there is none. But we can explain it in the user guide, and it will be a very clear error, not something strange as if we were to remove the object...

giovanni-rosotti commented 8 years ago

Ok, this branch is now almost ready to be merged. There are two things remaining: a) fix the last crash. This is puzzling because on my laptop I get a smaller error norm than on travis. I need to understand what is going on. b) add the jeans test. I tried to run the jeans.dat file in gravhydro_tests and I get timesteps of 0 (!). Using the hllc solver at least the simulation runs, but the result is crap compared to the analytical solution. I can see that the resolution is really low - is it just a matter of increasing it (and in this case, what would be a sensible one that does not take ages), or do I need to change anything else? In any case, I can't stress how much the parameter files we provide NEED TO BE CORRECT!!!!! I created an issue (#110) to remember ourselves to go through them.

giovanni-rosotti commented 7 years ago

Sorry for spamming you this afternoon guys... but debugging in this way is no fun! Anyway, the mess was entirely my fault - never call something 't' when it's already a pretty important name... I will now rewrite the history to clean up the mess I have done - it's best if you delete your local copy of this branch and recreate it. David, it would be great if you could let me know what parameters you used for the jeans test; it's the last thing missing from this branch.

dhubber commented 7 years ago

Yes, 't' was definitely used elsewhere! ;-) So, I just actually pushed my most recent changes to icsilcc. The jeans.dat file in that branch was working for me so you can just pull and use that one to see if it still works in your branch!

rbooth200 commented 7 years ago

What's the current status of this pull request? Is it now ready to merge? I have no concerns about merging it with master - in fact quite the opposite. If it's working then the sooner the better.

I have only a question: Should run_async optionally take a simulation parameter file name? Also should perhaps it should forward any extra **kwargs to the setParams method. These two things aren't essential since there are simple work arounds, but I think they would be useful to have and don't see a reason why not.

giovanni-rosotti commented 7 years ago

The jeans test is still missing... I'm working on that.

Regarding your question, the interface should be the same as run, which takes a simulation number instead. Otherwise it defaults to the current one (which currently is the only option with run_async). I'm not sure about kwargs - could be useful. But if we do it then it should be added also to run.

rbooth200 commented 7 years ago

Ok. I agree that it should be the same as run. Given the load - set parameters - run workflow of run and that run async will have more or less identical workflow I think there is no need to add the keywords.

giovanni-rosotti commented 7 years ago

Regarding the jeans tests, it now runs also with the exact riemann solver (not sure what changed btw). But the results are crap, see attachment (produced with the jeanstest script). David, does it work correctly in your branch?

jeans

dhubber commented 7 years ago

ok, I got results a bit like that when I used the tree with normal parameters (e.g. thetamaxsqd = 0.1) because it was not computing cell forces correctly with the Ewald correction. This is something I think Richard has mentioned a couple of times and I thought he mentioned he prepared or fix (or maybe was planning on doing it, so maybe not yet)? Anyway, reducing the opening angle (in effect forcing the gravity to brute force) improves it considerably. However, I thought this parameter file already did that so if it's still doing that, then it's a bit concerning. I'll have a look now to see if I get the same in my branch.

rbooth200 commented 7 years ago

I've run the jeans test with the brute force tree and it seems to work well in that case, but perhaps not with the tree. I don't think this should be surprising since the gravitational force is close to zero and we're essentially computing a small perturbation.

I think we should the jeans test with brute force, perhaps in 1, 2 and 3D (will this work?).

dhubber commented 7 years ago

No, the Ewald force will only compile in 3D (1d or 2d should throw an exception I believe). It will work with 1d, 2d or 3D periodicity (but still only compiled in 3D).

On Thu, 1 Dec 2016 at 20:54, Richard Booth notifications@github.com wrote:

I've run the jeans test with the brute force tree and it seems to work well in that case, but perhaps not with the tree. I don't think this should be surprising since the gravitational force is close to zero and we're essentially computing a small perturbation.

I think we should the jeans test with brute force, perhaps in 1, 2 and 3D (will this work?).

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/gandalfcode/gandalf/pull/109#issuecomment-264276467, or mute the thread https://github.com/notifications/unsubscribe-auth/AEsg1QR9Aq3_ZhKn0lx-6awSG-L44Z3iks5rDyYAgaJpZM4K1RK3 .

rbooth200 commented 7 years ago

How difficult would it to be fix?

rbooth200 commented 7 years ago

Actually, I'm worndering if this test would work well with a relative force accuracy opening criterion. It would make more sense I guess.

dhubber commented 7 years ago

No, it's only designed (algorithmically) to work in 3D. And if it were to be redesigned for 1d/2d, it would best be done by the guy that wrote it, i.e. Not me :-)

Maybe a relative force criterion could work better although I think the problem is still more likely to do with how we open and compute each cell.

On Thu, 1 Dec 2016 at 21:11, Richard Booth notifications@github.com wrote:

Actually, I'm worndering if this test would work well with a relative force accuracy opening criterion. It would make more sense I guess.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/gandalfcode/gandalf/pull/109#issuecomment-264280975, or mute the thread https://github.com/notifications/unsubscribe-auth/AEsg1RZS0ZVPkaElpZSKm2VZA-LuClHyks5rDynygaJpZM4K1RK3 .

rbooth200 commented 7 years ago

Maybe a relative force criterion could work better although I think the problem is still more likely to do with how we open and compute each cell.

I agree that it's to do with opening each cell. A relative opening criterion is unavoidably more accurate when the forces nearly cancel, this is after all why gadget and the cosmologists use it for early universe stuff when the pertubations are small.

The gravitational force calculation should be correct, this branch has the fix included.

dhubber commented 7 years ago

ok, thinking about it, we should do this test in 4 cases to check that it is working fine. i) lambda >> lambda_jeans (i.e. very gravitationally unstable), ii) lambda > lambda_jeans + epsilon (i.e. slightly gravitationally unstable), iii) lambda < lambda - epsilon (i.e. retarded/slowed-down sound waves), iv) lambda << lambda_jeans (i.e. negligible gravity, pure sound waves). If the two marginal cases have noise, then that's not a problem since it's expected. But if the highly gravitationally unstable case also has noise, then there is something wrong perhaps with tree + Ewald, since there'll be no other reason for any problems then. And hopefully the pure sound wave case which be just that with minimal noise.

Btw, regarding the above plot for the acceleration, the MFV scheme does/will not have the correct acceleration since the line is the sum of the grav + hydro and we only compute the grav. acceleration. The hydro acceleration can perhaps be easily inferred from the flux (momentum change / dt / mass) for plotting purposes but is not currently done. But it will be correct for the SPH case since that includes both hydro and grav acceleration components.

rbooth200 commented 7 years ago

I also agree that we should have at least a couple of test cases here. But the noise is not due to whether or not the system is stable, it is due to the density perturbation amplitude being much smaller than the average density. In the highly stable case the gravitational force error will be just as large, but it will not affect the solution much because it is swamped by the pressure forces.

rbooth200 commented 7 years ago

I'm accepting this because it's too useful to delay it for only the Jean's test. I'm creating an issue for that test instead.

gandalfcode / gandalf

Test correctness #109