PAHFIT / pahfit

Model Decomposition for Near- to Mid-Infrared Spectroscopy of Astronomical Sources
https://pahfit.readthedocs.io/
18 stars 26 forks source link

flux density shift to rest frame is not correct #294

Open jdtsmith opened 4 months ago

jdtsmith commented 4 months ago

https://github.com/PAHFIT/pahfit/blob/45d14975eababec1221af88fa5dad2e74661a726/pahfit/model.py#L330

Because of our "mixed" f_nu vs. lambda units, shifting to the rest frame requires dividing the observed flux density by (1+z), not multiplying by it.

The way to see this is to imagine a top-hat feature with some constant flux f_nu, width dlam, and power P=f_nu dlam. The width corresponds to frequency width dnu = c/lam^2 dlam. When de-redshifting back to the rest frame, both lam and dlam decrease by a factor of (1+z), so dnu correspondingly increases by this factor. Shifting to the rest frame shifts to higher frequency and higher dnu.

To preserve the power measured in our hypothetical top-hat feature, I think we need to divide f_nu by (1+z), not multiply. Mixed units strike again.

drvdputt commented 3 months ago

I looked into this again, with some equations on paper, and I don't think the mixed units are relevant for the line you indicated above. I can't find those notes, so I'm typing it out here for preservation.

We should not try to preserve (or make sense of ) the area spanned by f_nu dlam. That is not a meaningful power quantity in my opinion. To be more explicit, in my power(drude/gaussian) code, power is defined as the "real" power, i.e.

$\int F\nu d\nu(\lambda)$.

The fact that we express the intensity as a function of the wavelength, $F\nu(\lambda)$, is a bit confusing, but it does not prevent us from performing the above integral. We just need to be careful to convert $F\nu$ to $F\lambda$ before doing any numerical integration over the wavelength grid, i.e. always either of $F\lambda(\lambda)d\lambda = F_\nu(\nu) d\nu$.

That being said, here is how I obtained the factor (1+z) in my equation. The quantity that we are converting in the code mentioned above, is simply $F\nu$, i.e. a quantity in MJy / sr on the y-axis, and it should not matter what the x-axis of that quantity is, so to speak. The observed specific intensity spectrum $I\nu^o(\nu)$ then has the following relationship with respect to the rest-frame spectrum $I_\nu^r(\nu)$.

$I_\nu^o(\nu^o) = \frac{dE^o(\nu^o)}{dA d\Omega dt^o d\nu^o}$

with $x= z + 1$, compared to the rest frame $r$

Filling this all in yields

$I\nu^o(\nu^o) = \frac{x^{-1} dE^r(\nu^r)}{dA d\Omega x dt^r x^{-1} d\nu^r} = x^{-1} I\nu^r$

In the above, the equation remains the same regardless of putting $\nu$ versus $\lambda$ between the parentheses. It is simply a different way of addressing the same quantity ($I\nu$ on the y-axis in the spectrum. However, swapping the $\nu$ or $\lambda$ in the denominator does yield a different result (powers of $1+z$), showing that $F\nu$ and $F_\lambda$ don't transform the same with redshift.

jdtsmith commented 3 months ago

We should not try to preserve (or make sense of ) the area spanned by f_nu dlam. That is not a meaningful power quantity.

Agree. And our analytic power formulations all respect this. They are based on $\int f\nu d\nu$ (update: or more aptly $\int f_\lambda d\lambda$) and have been since the dawn of PAHFIT. But notice: what is the functional form of $f_\nu(\lambda)$? It is (for example), a Gaussian in $\lambda$, i.e. with wavelength-based FWHM. That's not even a true gaussian in $\nu$ space; consider the two half-power points.

So it is not the "plotting convenience" of the x-axis using $\lambda$ instead of $\nu$ that mixes the axes, it is this feature of our model: specifying the shape (width, etc.) of our functions in $\lambda$ space, and their amplitude in $\nu$ space. It is for this reason that our powers, derived from fwhm $w$ (in microns) and amplitude $A\nu$ (in some $f\nu$ unit), have an additional factor of $\lambda_0^2$ in the denominator:

$$P = \frac{A_\nu c w}{2\lambda_0^2} (\pi, \sqrt{\pi/\ln 2})$$

for (Drude, Gaussian). Notice, importantly, that two identical features with the same amplitudes $A_\nu$ and widths $w$ appearing at different central wavelengths $\lambda_0$ nevertheless have different powers. You can think of that as the result of the non-linear conversion $A\nu d\nu = A\lambda d\lambda$. I agree with the rest of your derivation, but it does not consider the transformed shape (width) of the function, together with its amplitude.

Executive Summary

Feature powers are $\propto w/\lambda_0^2$, and both $w$ and $\lambda0$ are reduced by $(1+z)$ when going to the rest frame, so $f\nu$ needs also to be reduced by this factor.

Background

Again imagine for simplicity that some combination of dust features in the Universe produced a source with an exact top-hat spectrum and no continuum:

$$ f\nu = C\nu^{(o)} : \lambda_1^{(o)} \rightarrow \lambda_2^{(o)}; 0: \mathrm{otherwise} $$

Gathering the integrated power (in W/m^2/sr, say) would be quite easy in the observed frame (you wouldn't need PAHFIT!):

$$P^{(o)} = \int f\nu^{(o)}(\nu) d\nu = C\nu^{(o)} c \left(\frac{1}{\lambda_1^{(o)}} - \frac{1}{\lambda_2^{(o)}}\right)$$

C.f. the expression above for Gaussian power, which includes an implicit $(\lambda_u-\lambda_l)/\lambda_0^2$. Now we'd like to perform the same operation in the rest frame, so we shift the wavelength down:

$$\lambda_1^{(r)} = \lambda_1^{(o)}/(1+z); \lambda_2^{(r)} = \lambda_2^{(o)}/(1+z)$$

and now we can integrate the power again, this time in the rest frame, as:

$$P^{(r)} = \int f\nu^{(r)}(\nu) d\nu = C\nu^{(r)} c \left(\frac{1}{\lambda_1^{(r)}} - \frac{1}{\lambda2^{(r)}}\right) = C\nu^{(r)} c (1+z) \left(\frac{1}{\lambda_1^{(o)}} - \frac{1}{\lambda_2^{(o)}}\right)$$

By inspection, you can see that to recover the same power no matter whether the integral is performed in the observed-frame or rest-frame, we must set $C\nu^{(r)} = C\nu^{(o)}/(1+z)$. Notice that, had we instead not had mixed units, but specified our tophat with a $C\lambda^{(o)}$, we would have recovered the much more intuitive case $C\lambda^{(r)} = C_\lambda ^{(o)} (1+z)$.

Actual PAHFIT implementation

Now, you may object, this integrals above are not what we are actually doing. And you have a point. We perform the fit in the rest frame, but what matters is how we interpret the fitted values (here, for simplicity, $C_\nu^{(r)}$) after the fact. Since we are (aiming for) power-based fits, presumably each Fitter will need to evaluate power in the rest frame, and have those be correct and valid, and identical to what we would have computed in the observed frame (e.g. a simple gaussian there). Fitter's won't know (or have to care) about redshift. We can break into two cases:

Lines

The FWHM of our unresolved lines is set by an instrument model, naturally in the observed frame. As we discussed in the PR, in the rest frame, these widths are narrowed (by Model): $w^{(r)} = w^{(o)}/(1+z)$. Crucially our line wavelength centers are also reduced by this factor: $\lambda_0^{(r)} = \lambda_0^{(o)}/(1+z)$. Since the Gaussian power (see above) is $\propto w/\lambda0^2$, we accumulate a factor of $1/(1+z) \times (1+z)^2 = (1+z)$. So to preserve power, we need our $f\nu$ spectrum's amplitude to be reduced by this factor, same as above.

Dust Features

This is basically the same case, except for the fact that $w$ for dust (Drude) features is already in the rest frame, so Model does not need to do anything, just pass those value into Fitter. But note, in the observed frame, the feature was "broader": $w^{(o)} = w^{(r)} (1+z)$. So implicitly the same thing is occurring as above — $w$ and $\lambda_0$ are each reduced by factors of (1+z), such that, with no other change, powers would in fact be overestimated by the factor $(1+z)$.

BTW, the fact that dust features have intrinsic widths in the rest frame is the main reason we want to fit in that frame.

So both of these case (and the simple thought experiment above) indicate that we should scale our $f_\nu$-flavored fluxes down by the factor $(1+z)$.

drvdputt commented 3 months ago

Thanks for the write-up JD. I did notice, while implementing the power-features, that the definition of the FWHM as a wavelength quantity makes things less intuitive. I will have to re-read my code, and see if what I did for PowerDrude and PowerGaussian is consistent with the above.

What you wrote down explains how the power of one of our features transforms with redshift. And to obtain your preferred transformation for $I_\nu$, you assume (or require) that the integrated power is preserved.

However, I don't fully understand why you would want to preserve the integrated power. Doesn't the integrated power of a spectrum scale with redshift, physically speaking? It sounds like you are trying to set up some sort of "convenience property" of the data we are fitting: the convenience that fitting the redshift-corrected data would result in the same integrated power value. But which integrated power do we want then? The one in the observed frame, or the one in the rest frame?

jdtsmith commented 3 months ago

In terms of power changing with wavelength, the easiest way to think about it is that all our functions are $f\lambda$ flavored, we just happen to work with $f\nu$ flavored spectra, and so we convert "just in time". Amplitude $A\nu$ becomes $A\lambda = A\nu\ d\nu/d\lambda = A\nu c/\lambda0^2$, and Gaussians/Drude's then have the expected power of $\sim w \times A\lambda$.

you assume (or require) that the integrated power is preserved

It simply must be. Imagine two JWST users examining the same spectrum of a z=1 galaxy with a bright [ArII] line (7µm rest). They see it at 14µm.

Suppose User 1 (observed frame fitter) and User 2 (rest frame fitter) got different values for the integrated intensity of the [ArII] line, and published those values? Which one is correct? Both have done something sensible. There is one and only one true integrated intensity of [ArII] for this galaxy (at the location of the Earth anyway). The only possible resolution is that they should get the same value.

Power certainly scales with redshift, when you consider physically moving a given source with redshift, to a different distance. But that's not what we're doing here. What we're doing is imagining a fictitious source in the rest frame that has just the perfect properties (by our arrangement) to ensure we get the same integrated intensity for the same features as we would have measured them in the observed frame. So rather than an assumption, constant power is an a priori edict. The point of our scaling $f_\nu$ is to achieve it.

drvdputt commented 3 months ago

Ok, then it seems like we had different purposes in mind for the "redshift" utility. My thought process was that the following two cases should yield the same result

  1. The user provides the data as observed, and provides an appropriate redshift value.
  2. The user performs their own redshift correction on the data, and then uses PAHFIT with redshift = 0.

So then the question would be: how does a typical high-redshift observer correct their data. Do they adjust the value of $I_\nu$ in the "physical" way?

We can certainly implement it the way you envision it, and I see how it would be useful. But I don't know which of our two ideas would be the most expected behavior, or should I say: what would provide the "least astonishment" for someone who is experienced with redshifted spectra.

jdtsmith commented 3 months ago

So then the question would be: how does a typical high-redshift observer correct their data. Do they adjust the value of in the "physical" way?

I worried about this too, but it falls in the category of "protecting users from their own mistakes". There may in fact be optical observers who do $\lambda_r = \lambda_o/(1+z)$ and $f_r = f_o/(1+z)$, to fit in the rest frame, not realizing that they are dealing with mixed units. But yet other (optical) astronomers might have the foresight to first correct $f\lambda = f\nu c/\lambda_0^2$. It will be impossible to deal with all the possible wrong cases, so the best approach will probably be to document the correct approach carefully. I myself I believe have "done it wrong" before. So we document, and encourage use of the redshift parameter directly.

what would provide the "least astonishment" for someone who is experienced with redshifted spectra.

I fear that the issue of mixed units will be indeed somewhat astonishing to people when they finally understand it. But we can't read what's in their mind. Maybe the best way to describe it is we implement features in an $f\lambda$ space, but measure amplitudes in an $f\nu$ space.

But I can guarantee one thing: if an observer fits the line in the observed frame, does a proper integral to get an integrated intensity, they will 100% expect that same value to obtain if the fit is instead done in the rest frame. @sarduval plans to do a short experiment testing that, just to be completely confident of our choice here.