jobovy / apogee

Tools for dealing with APOGEE data
BSD 3-Clause "New" or "Revised" License

sparsify : wrong sign #30

Closed by tingyuansen 8 years ago

tingyuansen commented 8 years ago

Hi Jo,

This is a minor error because the convolution matrix is ~ symmetric for a small wavelength range, but I think:

in apogee/spec/lsf.py

def sparsify(lsf): ... offsets.append(offset) ...

should be

    offsets.append(-offset)

I tested a few cases.

jobovy commented 8 years ago

So what you are saying is that I flipped the LSF around 0 offset? I think I thought this through when I coded it up and convinced myself that it was correct. Looking at the code again, I still think it's correct.

For the basic grid, which has the LSF at 43 points [-21,-20,...,20,21], the first pass through the loop goes as follows.
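
(A sketch of that first iteration, with nx = 43 and the original offsets.append(offset):)

ii = 0
offset = 43//2 - 0         # = 21
offsets.append(offset)     # stores +21 (this is the sign in question)
# offset >= 0, so the stored diagonal is lsf[21:,0], which sparse.diags
# places on the +21 diagonal, i.e. at entries M[i,i+21]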

Does this make sense? What kinds of tests did you run?

tingyuansen commented 8 years ago

Hey Jo,

I thought about it, but I think the scenario you depicted is only true if the kernel matrix is completely symmetric. In our case, since the kernel is wavelength-dependent, even if you are taking the right entry that is n steps away, the kernel width might be wrong. I might be missing something here, but here is the test that I made (here I flipped the offset sign to get what is, in my opinion, the correct answer):

## imports needed to run this snippet
import numpy as np
from scipy import sparse
from scipy.stats import norm

## sparsify matrix (with the flipped offset sign)
def sparsify(lsf):
    nx= lsf.shape[1]
    diagonals= []
    offsets= []
    for ii in range(nx):
        offset= nx//2-ii
        offsets.append(-offset)
        if offset < 0:
            diagonals.append(lsf[:offset,ii])
        else:
            diagonals.append(lsf[offset:,ii])
    return sparse.diags(diagonals,offsets)

## make wavelength grid
wavelength_run = np.arange(15000,17000,3)
R_res = 1000

## pad wavelength grid so that each bin has 2*R_range+1 neighbours
R_range = 1
wavelength_tmp = np.concatenate([np.zeros(R_range),wavelength_run,np.zeros(R_range)])

## build the wavelength-dependent Gaussian kernel, one row per wavelength bin
conv_matrix = np.zeros((wavelength_run.shape[0],2*R_range+1))
for w1 in range(wavelength_run.shape[0]):
    conv_matrix[w1,:] = norm.pdf(wavelength_tmp[w1:w1+2*R_range+1]-wavelength_tmp[w1+R_range],\
                                 scale=wavelength_tmp[w1+R_range]/R_res/2.355)
conv_sparse = sparsify(conv_matrix)

print(conv_sparse.toarray())
print(conv_matrix)

Am I missing something here? I also checked that using the minus sign gives exactly the same results as the most naive convolution method (just running through each wavelength bin one by one), whereas the original one gives a small offset.
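
Roughly, that naive check looks like this (a sketch with a hypothetical test spectrum, reusing the arrays from the snippet above; each row of conv_matrix is used here as weights for gathering flux from the neighbouring bins):

spec = np.random.rand(wavelength_run.shape[0])     # hypothetical high-res test spectrum
spec_pad = np.concatenate([np.zeros(R_range),spec,np.zeros(R_range)])
naive = np.zeros_like(spec)
for w1 in range(spec.shape[0]):
    # gather flux from the neighbouring bins, weighted by row w1 of the kernel
    naive[w1] = np.dot(conv_matrix[w1,:],spec_pad[w1:w1+2*R_range+1])
# this agrees with conv_sparse.dot(spec) when conv_sparse is built with the flipped sign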

Btw, as I am trying to read through your code, I am a bit confused by how sigma_vmacro is treated; there is this term:

sigvm= vmacro/3./10.**5./numpy.log(10.)*hires/dowav/2./numpy.sqrt(2.*numpy.log(2.))

I am not sure about the numpy.log(10.)*hires/dowav. I think this might be related to the wavelength spacing being log-linear instead of linear, but it still seems a bit weird to me...

jobovy commented 8 years ago

I think this is because we have a different understanding of the LSF. At the end of your procedure you get

print(conv_matrix[0:2])
[[ 0.          0.06263394  0.05605799]
 [ 0.05604927  0.06262141  0.05604927]]

the second row here specifies how the light in true pixel 1 is spread over pixels 0, 1, 2 by the LSF: 0.05604927 of true pixel 1's flux goes into pixel 0 ('true' means what would happen if the LSF were infinitely narrow). [Note that each row should add up to 1 here, but it doesn't.]
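
(If you did want each row of the test conv_matrix above to add up to 1, something like the following would do it:)

conv_matrix/= conv_matrix.sum(axis=1)[:,None]    # normalize each row of the kernel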

When I print your sparse matrix representation, I get

print(conv_sparse.toarray()[0,:5])
[ 0.06263394  0.05605799  0.          0.          0.        ]

Now, I believe that the second element here should be 0.05604927, because when you do LSF x unconvolved-hires-spectrum you multiply this row by the high-res spectrum and so the second contribution is (fraction-of-pixel1-that-lands-in-pixel0)x(true-flux-in-pixel1). My routine makes this happen:

import apogee.spec.lsf as aplsf
conv_sparse_jo= aplsf.sparsify(conv_matrix)
print(conv_sparse_jo.toarray()[0,:5])
[ 0.06263394  0.05604927  0.          0.          0.        ]

I think you are saying that the LSF centered on a pixel (each row in conv_matrix) specifies how much light comes from adjacent pixels, while I'm saying that it is how much light is spread over adjacent pixels.
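
To put numbers on that difference (taking spec to be some unconvolved high-res spectrum): under the "spread over" reading, pixel 0 of the convolved spectrum is conv_matrix[0,1]*spec[0] + conv_matrix[1,0]*spec[1] = 0.06263394*spec[0] + 0.05604927*spec[1], which is what my sparse matrix gives; under the "comes from" reading it would be conv_matrix[0,1]*spec[0] + conv_matrix[0,2]*spec[1] = 0.06263394*spec[0] + 0.05605799*spec[1], which is what yours gives.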

Yes, the numpy.log(10.)*hires/dowav factor comes from the fact that the spacing is linear in log10(lambda), so it is just the Jacobian.

tingyuansen commented 8 years ago

Oh, I think you are right. Indeed, going from high res to low res, I should think of each row as how much light is spread over adjacent pixels rather than where it comes from. Ouch, sorry!

One more question regarding the Jacobian: if I understand correctly, when we do sum_lambda f(lambda) Kernel(lambda), you have an extra factor, i.e. you compute sum_lambda f(lambda) Kernel(lambda) * exp(delta ln lambda). But I am thinking that exp(delta ln lambda) is not delta lambda, which is what we need to weight by for the Jacobian, isn't it?

I am probably missing something obvious here =( .

Cheers, YS

jobovy commented 8 years ago

Just think of it as transforming Gaussian distributions, assuming that the transformation is close to linear, in which case you go from \sigma_x1 to \sigma_x2 = \sigma_x1 |d x2 / d x1|.

So you go from a Gaussian in velocity with sig_macro = vmacro/2./numpy.sqrt(2.*numpy.log(2.)) to an approximate Gaussian in delta log_10(lambda). First you go to a Gaussian in delta lambda, using delta lambda / lambda = v / c --> sig_lam = sig_macro x lam / c (lam ~ constant as long as v is small). Then you go from x1 = delta lambda to x2 = delta log_10(lambda): |dx2/dx1| = d(delta log_10(lambda))/d(delta lambda) = 1/(lam x ln(10)). So the end result is sig_dlog10lam = sig_macro x lam / c / (lam x ln(10)) = sig_macro / (c x ln(10)), which (combined with the hires/dowav factor discussed above) is the expression in the code. Note that the units of this sig_dlog10lam are correct.
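
For what it's worth, a quick numerical cross-check of that chain against the one-liner in the code (just a sketch: the vmacro, dowav, and hires values below are made up for illustration, with dowav taken as the log_10(lambda) spacing of the output grid and hires as its oversampling factor):

import numpy
vmacro = 6.        # hypothetical macroturbulence FWHM in km/s
dowav = 6.e-6      # illustrative log10(lambda) pixel spacing
hires = 3.         # illustrative oversampling factor
# the one-liner from the code
sigvm_code = vmacro/3./10.**5./numpy.log(10.)*hires/dowav/2./numpy.sqrt(2.*numpy.log(2.))
# the same thing built up step by step, following the reasoning above
sig_macro = vmacro/2./numpy.sqrt(2.*numpy.log(2.))     # FWHM -> sigma, still in km/s
sig_dlog10lam = sig_macro/(3.*10.**5)/numpy.log(10.)   # sigma in delta log10(lambda)
sigvm_steps = sig_dlog10lam*hires/dowav                # convert to high-resolution pixels
print(sigvm_code, sigvm_steps)                         # the two numbers agree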

tingyuansen commented 8 years ago

This is neat! Thank you!