sirocco-rt / sirocco

This is the repository for Sirocco, the radiative transfer code used to model winds in AGN and other systems.

New matrix macro atom scheme doesn't seem to produce same results for TDE models #898

Closed saultyevil closed 3 years ago

saultyevil commented 3 years ago

The initial tests of the spectra generated for a TDE model unfortunately show that the new scheme does not produce the same results. It has gotten better with the fixes James has put in, and it is not far off now. Attached below are spectra showing the differences, along with the parameter file.

The key problem is that the optical continuum is not the same. This could be stress-testing the new method, as the optical continuum is made up mostly of He II (and a bit of He I and H I) bound-free (BF) photons.

The first thing to point out is that the two simulations do not reach the same stage of convergence, although they are similar. This alone could explain the discrepancy:

matrix:
Cycle 15 / 15: 61.80% of cells converged and 25.90% of cells are still converging

mc_jumps:
Cycle 15 / 15: 55.40% of cells converged and 22.00% of cells are still converging

I said I would do a few things last telecon to help figure this out:

  1. Make this issue.
  2. Run the models to the same stage, i.e. both to be well converged (this may take some time for the mc_jumps method).
  3. Compare the ionization/temperature structure.
  4. Check the macro_abs numbers between the methods.

I will try to do this in the down-time between thesis writing. The matrix version is relatively cheap to run compared to mc_jumps, which is what will hold back timely debugging/verification.

tde_opt_cmf.txt

accelerated_i_60

accelerated_i_60_opt

https://www.dropbox.com/sh/k8wh2cl2kew4dmx/AABH5EIDymb8t46LLPiztau8a?dl=0

kslong commented 3 years ago

As @saultyevil has noted, the matrix and jumps approaches are not converged to the same level. This means that the ionization structure in the two approaches almost has to be different. For macro atoms, we record macro-atom interactions in the WCreated column of the total spectrum. Here is a comparison of what that looks like. As one can see, the differences in the created spectra are not confined to the continuum. This seems reasonable if the ionization of the plasma is not identical in the two cases.

WCreated
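For anyone wanting to reproduce this kind of comparison, here is a minimal sketch (not the script actually used for the figure); the column positions and file names are assumptions, so check them against the header of your own .log_spec_tot files.

    import numpy as np
    import matplotlib.pyplot as plt

    FREQ_COL, WCREATED_COL = 0, 3  # assumed positions; verify against the file header

    matrix = np.loadtxt("matrix.log_spec_tot", comments="#")
    jumps = np.loadtxt("jumps.log_spec_tot", comments="#")

    fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
    ax1.semilogx(matrix[:, FREQ_COL], matrix[:, WCREATED_COL], label="matrix")
    ax1.semilogx(jumps[:, FREQ_COL], jumps[:, WCREATED_COL], label="mc_jumps")
    ax1.set_ylabel("WCreated")
    ax1.legend()

    # the ratio makes line vs continuum differences easier to see than an overlay
    ax2.semilogx(matrix[:, FREQ_COL], matrix[:, WCREATED_COL] / jumps[:, WCREATED_COL])
    ax2.set_xlabel("Frequency (Hz)")
    ax2.set_ylabel("matrix / jumps")
    fig.savefig("wcreated_compare.png")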

kslong commented 3 years ago

This might be a clue. There are differences already in Cycle 1 in the amount of energy lost to photons with too many scatters.

Here is what we get for the matrix method in the summary of what happened to various photons:

!!python: photon transport completed in: 4585.703649 seconds
!!python: Total photon luminosity after transphot  4.383500894878e+48 (absorbed or lost  4.383456942684e+48). Radiated luminosity 9.972507546569e+46
!!python: luminosity lost by adiabatic kpkt destruction 0.000000000000e+00 number of packets 0
!!python: luminosity lost to low-frequency free-free    0.000000000000e+00 number of packets 0
!!python: luminosity lost by being completely absorbed  7.905651792463e+36
!!python: luminosity lost by too many scatters          4.283775819405e+48
!!python: luminosity lost by hitting the central object 0.000000000000e+00
!!python: luminosity lost by hitting the disk           0.000000000000e+00
!!python: luminosity lost by errors                     0.000000000000e+00
!!python: luminosity lost by the unknown                0.000000000000e+00
spectrum_create: Fraction of photons lost:  0.00 wi/ freq. low, 0.00 w/freq hi

And here is what we get with jumps:

!!python: photon transport completed in: 25760.068506 seconds
!!python: Total photon luminosity after transphot  2.750728024743e+48 (absorbed or lost  2.750684109992e+48). Radiated luminosity 9.152056717496e+47
!!python: luminosity lost by adiabatic kpkt destruction 0.000000000000e+00 number of packets 0
!!python: luminosity lost to low-frequency free-free    0.000000000000e+00 number of packets 0
!!python: luminosity lost by being completely absorbed  8.437592709612e+36
!!python: luminosity lost by too many scatters          1.835522352985e+48
!!python: luminosity lost by hitting the central object 0.000000000000e+00
!!python: luminosity lost by hitting the disk           0.000000000000e+00
!!python: luminosity lost by errors                     0.000000000000e+00
!!python: luminosity lost by the unknown                0.000000000000e+00
spectrum_create: Fraction of photons lost:  0.00 wi/ freq. low, 0.00 w/freq hi

This is also quite evident if one looks at the numbers of times a photon has suffered a resonant scatter (also in the .diag files for the very first cycle).

Note that what is here refers to the files which @saultyevil distributed; I have not re-run them and it's not clear to me whether anything has changed in the interim.
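As an aside for anyone repeating this, the relevant numbers can be pulled out of the logs programmatically with something like the sketch below; the log file names are hypothetical, while the matched phrases are verbatim from the output quoted above.

    import re

    def grab(path, phrase):
        """Return the first number on the first log line containing phrase."""
        with open(path) as f:
            for line in f:
                if phrase in line:
                    return float(re.findall(r"[\d.]+e[+-]\d+", line)[0])
        raise ValueError(f"{phrase!r} not found in {path}")

    for label, path in [("matrix", "matrix.out"), ("jumps", "jumps.out")]:
        total = grab(path, "Total photon luminosity after transphot")
        scat = grab(path, "luminosity lost by too many scatters")
        print(f"{label}: {scat:.3e} of {total:.3e} ({scat / total:.1%}) lost to scatters")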

kslong commented 3 years ago

@saultyevil One thing I do not understand about the material in the previous comment is that MAXSCAT is set to 2000, and there are continuum scatters that seem to go out to close to this value, but not many in either case. But the distribution of resonant scatters is quite odd, very flat in the first ionization cycle, and the maximum number of resonant scatters is less than 100. Others should look at the diagnostic files for the two runs in the first cycle, when the ionization structure is un-modified.
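A histogram makes the flat scatter distribution easy to see. The sketch below assumes the per-photon resonant-scatter counts have already been extracted from the .diag files into a one-column text file (the extraction itself, and the file name, are left as hypothetical):

    import numpy as np
    import matplotlib.pyplot as plt

    MAXSCAT = 2000  # the value quoted above

    # hypothetical extract: one resonant-scatter count per photon, one per line
    nscat = np.loadtxt("resonant_scatters.txt", dtype=int)

    plt.hist(nscat, bins=100, log=True)
    plt.axvline(MAXSCAT, ls="--", color="k", label=f"MAXSCAT = {MAXSCAT}")
    plt.xlabel("Number of resonant scatters")
    plt.ylabel("Number of photons")
    plt.legend()
    plt.savefig("scatter_hist.png")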

saultyevil commented 3 years ago

I have re-run those two models without the BF upweighting scheme -- for those unfamiliar, it means I have re-compiled Python with #define BF_SIMPLE_EMISSIVITY_APPROACH 0.

The reason I said we should re-run with the old approach is that with the new approach, when we are not converged, we can get a huge imbalance in photon weights before and after photon transport -- this is something we talk about in Issue #648 -- and I think it adds another layer of mess when we are trying to debug. It could also be an explanation for why the two approaches result in different ionization structures/spectra.

To make this problem clearer: the input luminosity is ~4e43 erg/s. In the previous simulations (two posts up), the photon luminosity after transport is ~2e48 erg/s, about five orders of magnitude more than what we input! You can also see that the radiated luminosity in the matrix scheme is about 10x less than the radiated luminosity in the jumps scheme. So that is why I think it is not helpful to have this scheme turned on while debugging this issue.
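For concreteness, the arithmetic behind those two statements, using the numbers from the logs quoted two posts up:

    # sanity check on the quoted luminosities (erg/s), taken from the logs above
    l_input = 4.4e43           # approximate input luminosity
    l_after_jumps = 2.75e48    # total after transphot, jumps run
    l_rad_matrix = 9.97e46     # radiated luminosity, matrix run
    l_rad_jumps = 9.15e47      # radiated luminosity, jumps run

    print(f"weight inflation: {l_after_jumps / l_input:.1e}")            # ~6e4, i.e. ~5 dex
    print(f"radiated, jumps/matrix: {l_rad_jumps / l_rad_matrix:.1f}x")  # ~9x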

So now to look at the models with the old BF approach. If we focus first on just the first cycle, where we know the ionization state is the same, then the two models look to be in pretty good agreement. Putting the same output as above:

Matrix

!!python: Total photon luminosity before transphot 4.391475066404e+43
!!python: Total photon luminosity after transphot  5.873667105995e+43 (absorbed or lost  1.482192039591e+43). Radiated luminosity 3.137322616881e+43
!!python: luminosity lost by adiabatic kpkt destruction 0.000000000000e+00 number of packets 0
!!python: luminosity lost to low-frequency free-free    0.000000000000e+00 number of packets 0
!!python: luminosity lost by being completely absorbed  0.000000000000e+00 
!!python: luminosity lost by too many scatters          2.736344489114e+43 
!!python: luminosity lost by hitting the central object 0.000000000000e+00 
!!python: luminosity lost by hitting the disk           0.000000000000e+00 
!!python: luminosity lost by errors                     0.000000000000e+00 
!!python: luminosity lost by the unknown                0.000000000000e+00 
spectrum_create: Fraction of photons lost:  0.00 wi/ freq. low, 0.00 w/freq hi

Jumps

!!python: Total photon luminosity before transphot 4.391475066404e+43
!!python: Total photon luminosity after transphot  5.777905303561e+43 (absorbed or lost  1.386430237157e+43). Radiated luminosity 3.194193482418e+43
!!python: luminosity lost by adiabatic kpkt destruction 0.000000000000e+00 number of packets 0
!!python: luminosity lost to low-frequency free-free    0.000000000000e+00 number of packets 0
!!python: luminosity lost by being completely absorbed  0.000000000000e+00 
!!python: luminosity lost by too many scatters          2.583711821143e+43 
!!python: luminosity lost by hitting the central object 0.000000000000e+00 
!!python: luminosity lost by hitting the disk           0.000000000000e+00 
!!python: luminosity lost by errors                     0.000000000000e+00 
!!python: luminosity lost by the unknown                0.000000000000e+00 
spectrum_create: Fraction of photons lost:  0.00 wi/ freq. low, 0.00 w/freq hi

The first cycle for the jumping scheme was incredibly expensive, but it has been faster for the subsequent cycles. So I am hoping the spectral cycles will finish by the end of the day, and we will have more data available to debug.

This is a bit of a digression from the bug we're talking about, but I wonder if we should make the BF_SIMPLE_EMISSIVITY_APPROACH variable a run-time flag or parameter in the future? We have evidence here showing it can be a pain in the arse. Re-compiling and juggling two builds depending on whether you want to run with or without the scheme is annoying, and it is easy to make a mistake.

jhmatthews commented 3 years ago

This is a bit of a digression from the bug we're talking about, but I wonder if we should make the BF_SIMPLE_EMISSIVITY_APPROACH variable a run-time flag or parameter in the future? We have evidence here showing it can be a pain in the arse. Re-compiling and juggling two builds depending on whether you want to run with or without the scheme is annoying, and it is easy to make a mistake.

I think this makes sense, yes Ed. I don't really know why it was a compiler flag, to be honest -- probably just because I wanted to have the code excised neatly -- but anyway, I agree it should probably be a run-time flag.

saultyevil commented 3 years ago

Now both models have finished. The disagreement is still there, so nothing in the interim has helped, but this has also removed the problem with crazy photon properties, as the luminosities are now much more sensible.

Both models are at practically the same level of convergence, but I am not sure whether that means they ended up in similar ionization states.

matrix

Cycle 10 / 10: 52.00% of cells converged and 16.80% of cells are still converging

jumps

Cycle 10 / 10: 51.00% of cells converged and 15.40% of cells are still converging

Here is a link to the completed models,

https://www.dropbox.com/sh/rtnq60rybqw270b/AAAwOV3UzFgi8huXperEuu6Fa?dl=0

saultyevil commented 3 years ago

Quite clearly they've converged to different ionization states. Plotted below is log(T_e_matrix - T_e_jumps), i.e. the Delta T between the two models.

image

So the difference in spectra is likely an effect of the difference in ionization structure. Naively I would have expected the two methods to reach roughly the same state...
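A sketch of how a plot like this can be made, assuming windsave2table has been run on both models and that the resulting .master.txt files contain x, z and t_e columns (check your own headers; the file names are placeholders):

    import numpy as np
    import matplotlib.pyplot as plt

    matrix = np.genfromtxt("matrix.master.txt", names=True)
    jumps = np.genfromtxt("jumps.master.txt", names=True)

    # cells where the two models agree exactly give log10(0) = -inf; harmless
    # for a quick look
    dt = matrix["t_e"] - jumps["t_e"]
    sc = plt.scatter(matrix["x"], matrix["z"], c=np.log10(np.abs(dt)), s=5)
    plt.colorbar(sc, label="log |T_e(matrix) - T_e(jumps)| (K)")
    plt.xlabel("x (cm)")
    plt.ylabel("z (cm)")
    plt.savefig("delta_te.png")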

kslong commented 3 years ago

OK, so to summarize, at least from my perspective:

  1. For this model, we converge to different values of the temperature in the wind depending on whether we use the state-machine (matrix) or jumps approach.
  2. Since the ionization calculation is the same for both approaches, the difference in ionization has to reflect a difference in the estimators that are constructed during photon transfer.
  3. The only way this can happen is if there is a (statistically important) difference in macro-atom interactions during radiative transfer.

If all this is true, it suggests that we can diagnose what is happening either by using a single ionization cycle with this model, or by importing a similar model into Python. Ideally, of course, we would find a single-shell version of this, but that may not work, as the spectra in these cells are likely to be complex (though I guess we might be able to dump the spectrum in one of the problem cells and use that as the exciting spectrum).

As an aside, to the point @saultyevil has raised about a compiler flag, I too favor not having to recompile, but would suggest this be a diagnostic input.

kslong commented 3 years ago

@jhmatthews @ssim One thing that Ed's new run (with BF_SIMPLE_EMISSIVITY_APPROACH set to avoid upweighting) has in common with his previous result is that in the WCreated spectrum the lines are still stronger with mc_jumps than with the matrix approach, while the continuum is stronger with the matrix approach, as shown below:

WCreated

So naively, this suggests that the jumps and matrix approaches yield different probabilities of de-excitation via bf/ff and bb. Does that make sense?

Note - I don't see how we are going to solve this problem except either by "pure thought" or by finding an example that we can explore in more detail. Does anyone have an idea of how to proceed?

kslong commented 3 years ago

It was a long shot, and it did not work, but I have re-run Ed's models with the latest dev changes merged into the matrix branch, with upweighting turned off. One gets the same result as previously, namely that the matrix approach produces a stronger continuum and weaker lines than the jump approach. Here is the figure that shows the WCreated spectrum during the last ionization cycle:

redo

kslong commented 3 years ago

And here is another test I did. I took Ed's two versions of the models, and for each version I calculated an extra ionization cycle twice, once with matrix and once with jumps. That is to say, I started Python from the previous model, using the previous options, and ran one more cycle.

So, for example, here is the history

I then looked at the WCreated spectrum for these two runs. They are identical within the errors. Similarly

Again, the results of runs 3 and 4 look the same. (Of course runs 1 and 3, etc. look different.)

This was a surprise, but it seems to suggest that if everything about the wind is fixed, the matrix and jumps approaches produce the same spectra.

Does this seem possible to anyone?

kslong commented 3 years ago

For what it is worth, which is probably very little, I tried running this model with on.the.spot ionization. I thought that possibly some of the estimators were different as calculated with jumps and matrix, and this might be leading to a different ionization state. The results for the matrix and jump methods for the total created wind spectrum are still different. The lines are less obvious with this version, but it is still true that the continuum in matrix mode tends to be higher than for jumps.

test

Aside: in on.the.spot mode, convergence is not being calculated properly, as every cell gives a value of 1 for every cycle. This should probably be filed as a separate bug.

kslong commented 3 years ago

@jhmatthews @smangham This is an aside to the main problem here, but in the current version of macro atom, there are a few things I do not understand in this portion of the code (starting around line 200):

        /* if it's a bb transition of a full macro atom  */
        if (*nres > (-1) && *nres < NLINES && geo.macro_simple == FALSE && lin_ptr[*nres]->macro_info == TRUE)
        {
          n_jump = matom (p, nres, &escape);

          if (escape == TRUE)
          {
            /* It escapes as an r-packet that was created by de-activation of a macro atom.
             */
            *which_out = MATOM;

            /* Update the photon origin to indicate the packet has been processed
               by a macro atom */
            if (p->origin < 10)
              p->origin += 10;
            return (0);
          }
        }

        /* if it's bf transition of a full macro atom. */
        else if (*nres > NLINES && phot_top[*nres - NLINES - 1].macro_info == TRUE && geo.macro_simple == FALSE)
        {
          n_jump = matom (p, nres, &escape);

          if (escape == TRUE)
          {
            /* It escapes as an r-packet that was created by de-activation of a macro atom.
             */
            *which_out = MATOM;
            /* Update the photon origin to indicate the packet has been processed
               by a macro atom */
            if (p->origin < 10)
              p->origin += 10;

            //If reverb is on, and this is the last ionisation cycle, then track the photon path
            if (geo.reverb == REV_MATOM && geo.ioniz_or_extract == CYCLE_IONIZ && geo.fraction_converged > geo.reverb_fraction_converged)
            {
              line_paths_add_phot (&(wmain[p->grid]), p, nres);
            }

            return (0);
          }
        }

This is part of the code for the "jumps" approach.

Separately, in the matrix/state-machine version, I don't see any calls to line_paths_add_phot. Does this mean that we cannot use the matrix approach for reverb, or is this something that needs to be fixed?

smangham commented 3 years ago

Hmm. It looks like those sections were the same all the way back to 2015, so I didn't split them up.

So, this is only used for the bit where we track the path distributions giving rise to photons of one specific line. I think my logic was that bb emission in a line was probably a resonant scatter, where the energy never enters the thermal pool, and so it wasn't important when considering the delay distribution of photons that contribute to the thermal pool and influence bound-free emission from it. But that's not the way the macro-atom process works, so it should be in both.

The line-specific delays stuff turned out to be a bit pointless as the delay distribution of photons activating a given line didn't seem to differ meaningfully from that of those just heating up the wind in general, and I don't think this bug would have affected that.

The matrix mode can't be used for line-specific delays, but that's in the docs (the mode is called matom), along with the commentary that it turned out not to be particularly useful. There's a small section on it in the never-published Methods paper, with a couple of figures illustrating that, but I think that's pretty much it for the line-specific/matom reverb mode.

kslong commented 3 years ago

Thanks, @smangham, for the comments. I think James is going to look at making sure that the matrix and jumps modes do the same thing in this regard.

kslong commented 3 years ago

To see whether there was some interaction between simple atoms and macro atoms that was producing the problem, I have run the tde sample case with the data file h10_hetop10.dat. The results are the same: the matrix mode shows a stronger continuum; the jumps mode shows stronger lines.

kslong commented 3 years ago

@jhmatthews @saultyevil Someone should repeat this test (because I now have so many versions/branches of Python lying around), but when I compare the detailed spectra produced with the accelerated macro scheme vs the jumps scheme, I find differences. (Specifically, what I did was the following: I started with the tde model, using just the pure macro-atom data file, which had been run for 10 ionization cycles and 5 spectral cycles. I then ran another model, restarting from this one, for no ionization cycles but 5 spectral cycles, with a version of the code with the accelerated macro scheme turned off, so that we are back to the old approach. Both models were run with upweighting turned off.) Here is the result I get for the WCreated spectrum:

emissivities

As was the case previously, using jumps produces more lines and less continuum. This suggests that whatever the differences are, they infect the calculation of the emissivities as well as whatever is happening in the ionization cycles. Hopefully, this will be a clue to what is going on.

It is important to add that all of this is most apparent in the WCreated spectra. It is less obvious, at least in this case, in the extracted spectra, presumably because so much of this emission is absorbed.

As an aside, I fundamentally agree with Ed that we should not use the python.h file to select options. We should always use our input .pf files, because (a) this provides a record of what has been done, and (b) it means you don't have to keep creating new branches when you want to create lots of options.

jhmatthews commented 3 years ago

Thanks Knox, that's useful. It might even mean we can produce differences with a simpler model.

I agree on python.h and input files. The only real reason for putting stuff in there was because we were trying out some beta scheme, or because we wanted something to become the default but keep the old behaviour as an option -- but possibly that was misguided.

kslong commented 3 years ago

Mea culpa, @jhmatthews. As I indicated, someone needed to repeat the tests associated with emissivities in jump vs matrix mode for the same windsave file. I have now done this more carefully. There are only statistical differences between the (WCreated or extracted) spectra produced from the same windsave file made from tde_opt_cmf.pf, whether one calculates emissivities with the old jump method or the new accelerated method. This statement is true both for a pure macro-atom data file and for one with both simple and macro atoms. These tests were made with the current version of Python in the merged_matom_matrix branch, which is up to date with respect to dev.

kslong commented 3 years ago

Well, at some point I may get this right, @jhmatthews. I have discovered that when you read in a previous windsave file, it is not enough to set either mc_jumps or matrix in the .pf file; one must also update the variable in the plasma structure that tells the code what to do. This is because you cleverly gave us the option of using the matrix and jumps modes in different cells, and that per-cell setting is carried along when you read in the previous wind file. So the comment directly above should be ignored. Sigh.

kslong commented 3 years ago

I have found a shorter set of .pf files that demonstrate the problem. These are stellar models with an AGN-like spectrum. They are pure-H macro-atom models and so are fairly simple to run. The two models differ only in that one uses jumps and the other uses matrix mode. They should be run with the -d option if you want the details of each run. Running in multiprocessor mode, e.g. -np 16, the WCreated spectrum after 3 ionization cycles shows clear differences between the jump and matrix versions. There is a script, DoYShell, in the attached tarball to run the two models; depending on the machine, one will want to change the value associated with np. The branch of Python that should be used is merged_matom_matrix.

z.foo.210930.tar.gz

Recently, however, I have discovered that the differences between jump and matrix mode go away if the matrix version is run in single-processor mode. The problem must be associated with this, though why that is the case I do not know. Hopefully, someone can figure this out.
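For anyone who wants to script the core-count comparison rather than use DoYShell, a minimal driver sketch is below; the binary name (py) and the -d flag follow the usage in this thread, while the parameter-file name (the yjump.pf attached further down) and the core counts should be adjusted for your setup.

    import shutil
    import subprocess

    for ncores in (1, 16):
        # copy the parameter file so each run gets its own root name and outputs
        root = f"yjump_np{ncores}"
        shutil.copy("yjump.pf", f"{root}.pf")
        cmd = ["mpirun", "-np", str(ncores), "py", "-d", f"{root}.pf"]
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)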

kslong commented 3 years ago

@jhmatthews I have been exploring under what circumstances there are differences when the models discussed above are run in multiprocessor and single-processor mode.

On the merged_matom_matrix branch, I find differences between running in matrix mode with 12 threads and running in single-processor mode. I also find differences in jump mode between running with 12 threads and with one, that is, without mpirun.

This led me to question whether I would see the same problems on the dev branch, which does not have any of the accelerated macro tools added, and I am finding that there are differences when running with different numbers of threads there as well.

Associating the problems that we are seeing with the matrix mode seems to have led us (me) down the proverbial rat hole. I would appreciate it, @saultyevil, if you could verify what I have stated here using this .pf file:

yjump.pf.txt

Higginbottom commented 3 years ago

I have run this parameter file on 25 and 26 cores - not sure I see much of a difference...

Created_ratio Created Emitted_ratio Emitted WCreated_ratio WCreated

saultyevil commented 3 years ago

Knox asked me to confirm that running Python with a different number of cores creates different spectra.

I ran yjump.pf using 1 core and 12 cores, using the "merged_matom_matrix" branch -- though I guess I should have used dev instead. Plotted below are both spectra, which are quite clearly different at higher frequencies!

yjump

I've had a quick look at the properties of the winds, and it seems that the model run with 1 core is in a higher ionization state. We've discussed this earlier in the thread, but I thought it would be useful to know (again).

@Higginbottom I wonder if the difference is just small between models run with a small difference in core count? I wonder what 1 vs 2 cores looks like? Is there a critical difference in the number of cores before we see a difference in the spectra?

There is a small difference in WCreated for the .spec_tot file, which I think implies there is something going on in the ionization cycles.

Higginbottom commented 3 years ago

50 vs 25 is also the same. Are you running on a single core? It seems like that will take hours, which is why I wanted to test multiple-core runs for differences...

saultyevil commented 3 years ago

Okay, that's interesting. But yeah, I ran using a single core and it took 11 hours on Excession. I think you're using the latest dev, right? I think I'll try again with that instead of the branch I used.

Higginbottom commented 3 years ago

There's not much point in us both doing the same thing - if you have the time to run with this, then I'll step back and do my stuff... But if you'd rather I have a run at it, then that's fine as well...

saultyevil commented 3 years ago

Oh sure, go ahead! I was only trying to follow up on what Knox asked me to do :)

Higginbottom commented 3 years ago

NP - I thought he'd asked for volunteers, and I'd put my hand up :-)

kslong commented 3 years ago

So, my summary of all this, @saultyevil and @Higginbottom, is that if you compare two runs with a large but slightly different number of cores, then there is very little difference, but if you compare a run with a large number of cores to one with a small number of cores, then you see a more substantial difference. Nick, if it is easy, perhaps you could try comparing, say, 25 and 50 cores to see if it is the ratio of the core counts that matters. That would make it look like a normalization problem, I guess.

Higginbottom commented 3 years ago

Here is 50 vs 25 cores...

Created_ratio Created Emitted_ratio Emitted WCreated_ratio WCreated

I'm currently running 1 and 2 cores on excession - but they won't finish till tomorrow AM...

kslong commented 3 years ago

Just to make sure we are all on the same page, here is what I get making a comparison between 8 and 12 processors using the current dev branch. Note that my plots are all semilogx nu-F_nu plots made from the log_spec_tot file. This allows one to see differences more easily than log-log plots. You should be able to make similar plots with the script I have attached, assuming you have matplotlib installed. You can call it from a Jupyter notebook, or run it from the command line; if you do, you just give it the root names of the two runs you want to compare. It probably makes sense to use the same plots, to ease comparison of different models.

compare_ion.py.txt xsmooth.py.txt

To make this run, you will need to remove the .txt extensions and make sure the .py files are executable. The command-line version is

compare_ion.py root1 root2

The plot will be in diff.png

diff
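For readers without the attachments, here is a stripped-down reconstruction of what compare_ion.py is described as doing (the real script above should be preferred; column positions here are assumptions, so check them against the log_spec_tot header):

    #!/usr/bin/env python
    import sys
    import numpy as np
    import matplotlib.pyplot as plt

    FREQ_COL, WCREATED_COL = 0, 3  # assumed column positions

    def nu_fnu(root):
        data = np.loadtxt(f"{root}.log_spec_tot", comments="#")
        nu = data[:, FREQ_COL]
        return nu, nu * data[:, WCREATED_COL]

    root1, root2 = sys.argv[1], sys.argv[2]
    nu1, nf1 = nu_fnu(root1)
    nu2, nf2 = nu_fnu(root2)

    plt.semilogx(nu1, nf1, label=root1)
    plt.semilogx(nu2, nf2, label=root2)
    plt.semilogx(nu1, nf1 - nf2, color="green", label="difference")
    plt.xlabel("Frequency (Hz)")
    plt.ylabel("nu F_nu")
    plt.legend()
    plt.savefig("diff.png")

Usage matches the description above: give it the two root names on the command line, and the plot lands in diff.png.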

Higginbottom commented 3 years ago

Ah - OK - so you are plotting the log_spec_tot files, i.e. not the results of the spectral cycles... That will speed things up a lot. I assumed you were plotting the .spec files....

kslong commented 3 years ago

Yes, because this is most apparent in the wind photons created during the ionization cycles.

Higginbottom commented 3 years ago

Jolly good - here is 50 vs 25 using your script..

diff

I'm running 8 cores, 12 cores, and 1, 2, 3 cores, so we should be able to see if I can reproduce your observation exactly, along with other permutations... I have to say, one would expect to see a difference in 50 vs 25 if you see one with 8 vs 12. But who can say!!

kslong commented 3 years ago

Agreed. Clearly the differences are smaller with more cores, whatever that means. Note that if you run my script in a Jupyter notebook, you can change the y-limits to expose differences in the green difference curve.

Higginbottom commented 3 years ago

Ooooooooo, 2 cores vs 50 cores... diff

kslong commented 3 years ago

Yep ... so this says, and I agree from my tests, that it's not a single-processor vs multiprocessor issue -- but a different-number-of-cores issue.

kslong commented 3 years ago

For those of you working with the simple examples above, I have fixed #906, so that you can set the number of spectral cycles to 0 and still get a WCreated spectrum in the ionization cycles. As explained there, we had messed this up when we decided to make the number of bins in the spectral cycles user-definable: if there were no spectral cycles, then the number of bins in the WCreated spectrum was never set. This should no longer be a problem.

Higginbottom commented 3 years ago

OK, back at it with a bunch of runs from overnight. First, I can confirm "by eye" that I get the same results for 8 vs 12 cores...

diff

Higginbottom commented 3 years ago

Ok, so here is a plot of the difference in WCreated for a range of core counts relative to one core...

comp_many

This is unbinned - KSL's plotting routine bins over 21 bins; here is the plot with that binning...

comp_many

There does seem to be some kind of scaling with the number of cores, but it is not simple... Zooming in on the 2.465e15 Hz feature (unbinned):

comp_many

As you add more and more cores, it converges to something. Presumably a wrong answer - since we must assume one core is in some sense correct. Hmmmmm
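The 21-bin binning mentioned above (presumably what the attached xsmooth.py does) can be reproduced with a simple boxcar, e.g.:

    import numpy as np

    def xsmooth(flux, nbins=21):
        """Boxcar-smooth a spectrum over nbins bins; output keeps the input length."""
        kernel = np.ones(nbins) / nbins
        return np.convolve(flux, kernel, mode="same")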

kslong commented 3 years ago

@Higginbottom I think this is helpful, Nick. To me it suggests that this really is some kind of normalization-by-thread issue. I don't know if you preserved the windsave files for each of these runs, but it might be interesting to look at something like the mean t_r for the grid in each of these cases, and plot that as a function of the number of threads.

kslong commented 3 years ago

I should have added that with the changes I made to dev yesterday, you can reduce the number of cycles to two, without any spectral cycles, and still see changes. The changes all begin to manifest themselves once the ionization has been calculated at the end of cycle 1, so you see them in the spectrum in cycle 2.

Higginbottom commented 3 years ago

I did save the windsave files. Running windsave2table and plotting t_r(n_cores>1) - t_r(n_cores=1), we do see something:

comp_t_r

Higginbottom commented 3 years ago

The "ip" field in the master.txt shows a most interesting trend.... comp_ip Surely this must be a very useful clue..

lazygun37 commented 3 years ago

Are the ratios constant?


Higginbottom commented 3 years ago

Not quite - but close:

comp_ip
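One way to quantify "not quite, but close" from the windsave2table output (column names taken from the thread; the file names are placeholders and the inwind convention is an assumption):

    import numpy as np

    one = np.genfromtxt("np1.master.txt", names=True)    # 1-core run
    many = np.genfromtxt("np50.master.txt", names=True)  # 50-core run

    inwind = one["inwind"] == 0  # assumed flag for cells fully in the wind
    ratio = many["ip"][inwind] / one["ip"][inwind]
    print(f"ip ratio: median {np.median(ratio):.3f}, "
          f"fractional spread {np.std(ratio) / np.mean(ratio):.1%}")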

kslong commented 3 years ago

So, I created a new branch, xdev, which generates a version py85x and adds some debugging features.

So far this has not revealed anything exciting, but you all may wish to use it for further debugging.

In particular, when a run is made with mpirun, the windsave files from the different threads differ in a binary comparison, but all of the threads produce the same windsave2table output. I have not yet found what actually differs between the files from the different threads. If there were a problem in thread-to-thread communication, I might have expected something to show up here.

If you run our standard yjump.pf file (for two cycles and no spectral cycles) with, say, 12 threads and 2 x 10^7 photons, compared to 6 threads and 10^7 photons, so that each thread in the two runs processes the same number of photons, there are no differences after cycle 1 in the windsave2table files that you would not expect. For example, all of the parameters derived in the spec.txt files are the same, except the ones that have to do with the total number of photons, and these are exactly a factor of 2 different. (In cycle 2 one does get differences.)

Note that windsave2table does not print out the macro-atom estimators, so this is something I have not checked. My working assumption is that during cycle 1 the radiative transfer is the same whatever the number of threads, and that as a result the ionization calculation produces the same new ionization state at the end of cycle 1. However, during cycle 2, I think the radiative transfer differs with different numbers of threads. A possibility, therefore, is that the macro-atom estimators which are used in cycle 2 depend on the number of threads, and this in turn causes the scattering in cycle 2 to be different.

But @jhmatthews and @Higginbottom, am I correct in thinking that if the piecewise power-law fits have exactly the same parameters, we would expect the ionization to come out the same?
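To make the hypothesized failure mode concrete, here is a toy mpi4py sketch (not sirocco code, which is C) of the bug class in question: per-rank estimator sums are reduced across ranks, but normalizing by a per-rank quantity instead of the global one makes the answer depend on the number of ranks.

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    n_global = 10_000_000            # total photons across all ranks
    n_local = n_global // comm.size  # this rank's share

    # stand-in for a per-rank macro-atom estimator accumulated during transport
    rng = np.random.default_rng(comm.rank)
    local_sum = rng.exponential(size=n_local).sum()

    total = comm.allreduce(local_sum, op=MPI.SUM)

    right = total / n_global  # independent of comm.size, up to Monte Carlo noise
    wrong = total / n_local   # scales with comm.size: the bug signature
    if comm.rank == 0:
        print(f"ranks={comm.size}  right={right:.4f}  wrong={wrong:.4f}")

Run with mpirun -np 2 and -np 12: "right" agrees between the runs to within Monte Carlo noise, while "wrong" scales with the rank count -- the kind of thread-count dependence being chased here.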

Higginbottom commented 3 years ago

FWIW - I've run a non-macro-atom version - just to confirm whether (or not) it's a macro issue:

diff

comp_many

Seems like it's just noise - so it is a macro problem... probably...

I'm also running a set with a blackbody model for J_nu in the cells in the macro scheme.

Higginbottom commented 3 years ago

Results of matrix_bb runs on 8 vs 24 cores:

diff

Slightly different actual numbers, but there is still a noticeable difference between the two core counts.

I'm going to go carefully through the code now, to see if I can spot any error in the MPI sharing/collecting...