The Standard SV model does not work with Kurucz models

kslong commented 6 years ago

In doxygenating continuum.c, I could not understand what it was doing as a number of lines had been commented out which seemed important. The problem I that I am worried about is I don't see how in one_continuum one assures that one is getting a photon from the right t and g, because I don't see where model (which updates the models is being called). It seems like it would have to be called whenever t_eff and g change.

So I tried a standard sv model using kurucz models for the spectra of the star and disk in the ionization cycles

It is currently failing

Here is the .pf file

sv.pf.txt

It's failing in an fprintf statement in model.c that I had put in possibly while working on #301 or #294 but this is most likely more serious than this.

The program appears to have been working for this at commit 37158e323f4dd418e73d8aae29bd0bd81242472a

and was broken by commit 66aa797d3860b93ccb47d70b874f5b9f666f9407 (something done by ksl.

When I comment out the fprintf statement, the program does run, but it generates a large number of errors of the following type are being produced:

99675 -- cdf_gen_from_array: all y's were zero or xmin xmax out of range of array x-- returning uniform distribution %d

100000 -- cdf_get_rand: %d

With a longer calculation this would kill the program.

So I have 3 questions about this:

What is causing fprintf to crash the program in get_models?
Why in one_continuum can one avoid calling model when the T and g change?
What is the source of the cdf_gen errors and how can they be prevented?

Higginbottom commented 6 years ago

I think the error is actually in get_models.cat line 630. I further believe that the segfault is caused by the fact this file is being opened bazillions of times, and never closed. The fact that this routine is being called bazillions of times is something I'm now checking - it seems to me that it should not be being called so often. I further think that the root of the problem is the errors like

model: Only one model after pruning for parameters, consider larger model grid model: data/kurucz91/fp00t50000g50k2.txt 50063.02 5.98 model: Taking corrective action because parameter 50063.019154 outside bound 50000.000000 -> scaling factor 1.040655

Will carry on looking...

Higginbottom commented 6 years ago

OK - so its fine that model (in get_models.c) is called a lot - whenever one has a new temperature or surface gravity to extract a continuum photon from, this is called. But I'm sure this is why it crashes. One can't just keep opening files forever - well I think one can have 256 open, the crash here occurs after the 173rd file is opened!!!

So, commenting out the file opening bit, and moving on to the meat of the problem..

Higginbottom commented 6 years ago

Hmmm, I only get 3112 -- cdf_gen_from_array: all y's were zero or xmin xmax out of range of array x-- returning uniform distribution %d

Higginbottom commented 6 years ago

These errors are coming from when we are asking the code to make disk photons of very high frequency, so in cycle 1 we are supposed to make

photon_gen: band 1.32e+16 to 2.63e+16 weight 1.32e+25 nphotons 10000 ndisk 3025 nwind 0 nstar 6975 npow 0

And we get 3018 errors of the type noted above, exactly when we are trying to make these photons. The Kurukz model goes down to 90angstroms, which is 3e16Hz, which is close to the frequencies requested - I shall check there isn't some kind of rounding - but the other possibility is that the annuli of the disk which make such high frequency photons have temperatures that are above that seen in the Kurukz array - this would explain the errors like

model: Taking corrective action because parameter 50063.019154 outside bound 50000.000000 -> scaling factor 1.040655

sounds like there are annuli in the disk above 50000K, and this is above the highest temperature in the matrix of files (this is true - they only go up to 50000K) In this latter case there is something going wrong with the extrapolation of the kurukz spectra to higher temperature.

Lots to look at..

Higginbottom commented 6 years ago

Hmmm - simple look 1: disk.diag shows no annuli hotter than 7e4...

smangham commented 6 years ago

The fact it appeared when Knox removed an fprintf makes it sound like it might be an array bounds error. If we’re off by 1 and accidentally reading from a piece of memory outside the annulus array, that would explain how the array itself might seem fine but we error when producing photons. Calling fprintf might write a low value to the piece of memory adjacent to the array, so once it’s removed we try reading random memory and usually end up with something that’s too high?

On 15 May 2018, at 13:48, Higginbottom notifications@github.com<mailto:notifications@github.com> wrote:

Hmmm - simple look 1: disk.diag shows no annuli hotter than 7e4...

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/agnwinds/python/issues/377#issuecomment-389154162, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AHt-_FIQaIHQMT743ehr1GFq9iticqHgks5tys6dgaJpZM4T-o_-.

Higginbottom commented 6 years ago

He removed an fprintf, but I think the more important thing he likely removed was an open()....

kslong commented 6 years ago

@Higginbottom is correct about two many files being opened. One can prevent the segfault by adding an fclose after the do loop that writes the file out. So I think we are back to problems 2 and 3.

Higginbottom commented 6 years ago

oh durrr, 7e4 is much higher than 50000!!! In fact a lot of the disk is hotter than the maximum temperature of in the kurikz models. So we need to check how the extrapolation is working - this is also why we get lots of warnings about only one model being used - it is essentially extrapolating from the hottest model all the time....

kslong commented 6 years ago

I have confirmed that one_continuum is not doing the correct thing. It's using the wrong temperature. That part is easy to fix (but I have to be away for a bit)

If you get a temperature higher than the models, what is supposed to happen is that the highest T model in the grid gets adjusted by the ratio of BB.

Higginbottom commented 6 years ago

I've just been looking at the bit of the code where the 'taking corrective measures' error was coming from. I was actually confused at how few errors of this kind there were. Turns out there is a protective if statement around the error that only throws an error 20 times even if the code has to do the extrapolation lots of times. Grrrrrr.

Higginbottom commented 6 years ago

Hmmmm, can't see how one_continuum is using the wrong temperature....

Higginbottom commented 6 years ago

So, the cdf_gen errors are are caused because the extrapolated Kuruz spectrum is all zeros in the last band for the given model. The question is back to wether the extrapolation is working OK.

kslong commented 6 years ago

It's not requesting a new model when T and g change.

Higginbottom commented 6 years ago

Yeah, I see what you mean. But only in the last band.... odd....

Higginbottom commented 6 years ago

I'm really baffled - one_continuum is called with spectype - as its initial variable. This is supposed to point to a suitable model, but as far as I can see it is always set to zero when it is called....

kslong commented 6 years ago

No that is not what spectype refers to. Spectype is a series of models, in this case kurucz. I am about to upload a new version to dev, which will fix this … just leaving the issue of what to do when the temps are out of range.

From: Higginbottom notifications@github.com Reply-To: agnwinds/python reply@reply.github.com Date: Tuesday, May 15, 2018 at 11:42 AM To: agnwinds/python python@noreply.github.com Cc: Knox Long long@stsci.edu, Author author@noreply.github.com Subject: Re: [agnwinds/python] The Standard SV model does not work with Kurucz models (#377)

I'm really baffled - one_continuum is called with spectype - as its initial variable. This is supposed to point to a suitable model, but as far as I can see it is always set to zero when it is called....

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/agnwinds/python/issues/377#issuecomment-389234171, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ACaeVeOotc2n0QMkQxoyRv3OLVd-4nnxks5tywWFgaJpZM4T-o_-.

kslong commented 6 years ago

At this point, I believe questions 1 and 2 are addressed, but 3 still remains.

Higginbottom commented 6 years ago

jolly good - wasted a day trying to understand this. Guess I should have left it to you... :-(

kslong commented 6 years ago

It was not a waste. You got the first part. Still need to figure out about what to do about the cdf errors. It looks like the code was supposed to handle this situation, but obviously it is not.

From: Higginbottom notifications@github.com Reply-To: agnwinds/python reply@reply.github.com Date: Tuesday, May 15, 2018 at 11:50 AM To: agnwinds/python python@noreply.github.com Cc: Knox Long long@stsci.edu, Author author@noreply.github.com Subject: Re: [agnwinds/python] The Standard SV model does not work with Kurucz models (#377)

jolly good - wasted a day trying to understand this. Guess I should have left it to you... :-(

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/agnwinds/python/issues/377#issuecomment-389236572, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ACaeVbk-X-cT52pHfq0auti6BlIdiWetks5tywdFgaJpZM4T-o_-.

kslong commented 6 years ago

With the corrections made for questions 1 and 2, the errors associated with

9675 -- cdf_gen_from_array: all y's were zero or xmin xmax out of range of array x-- returning uniform distribution %d

have also disappeared. So I will close this now. I plan to add a test of these routines to our standard regression tests.

sirocco-rt / sirocco

The Standard SV model does not work with Kurucz models #377