DriftingPig / Drones


Is tractor model fitting biased? #1

Closed. DriftingPig closed this issue 5 years ago.

DriftingPig commented 5 years ago

The Obiwan simulation shows a bias in the final magnitude measurement. In this issue I am trying to identify where this bias comes from and whether there is a good way to remove it. It is important to have a model that is unbiased to within ~0.05 mag, because the target selection varies a lot over such a small difference. [plot attached]

DriftingPig commented 5 years ago

Tests I am planning to do:

optimize_loop in oneblob does not seem to be stable when the fit is already at the best position; running optimize_loop multiple times fixes this. If that is true in general, then injecting galaxies onto an empty canvas, rendered with the PSF at that spot, should give an unbiased result. Test whether this is actually the case (a sketch of the check follows).
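
A minimal sketch of that check, assuming the standalone tractor API (Tractor, optimize_loop, freezeParam and getParams are real tractor calls; the inputs tim and src are placeholders for the image and the injected source):

    import numpy as np
    from tractor import Tractor

    def fit_with_repeats(tim, src, n_repeats=3):
        """Fit `src` against the image `tim`, calling optimize_loop several
        times and recording the source parameters after each pass."""
        tr = Tractor([tim], [src])
        tr.freezeParam('images')          # keep the image calibration fixed
        params_per_pass = []
        for _ in range(n_repeats):
            tr.optimize_loop()            # one full optimization pass
            params_per_pass.append(list(src.getParams()))
        return np.array(params_per_pass)  # rows should agree if the fit is stable

If successive rows keep changing, a single optimize_loop pass has not actually converged.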

DriftingPig commented 5 years ago

Code changed: kenobi.py, in def get_tractor_image():

debug_hui

    # set the image to 0 (blank canvas)
    tim.data = np.zeros(tim.data.shape)

class BuildStamp():

changed entirely to render a Tractor model instead of a GalSim stamp (sketched below)
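
Roughly what the rewritten BuildStamp does, sketched with the standalone tractor API (RaDecPos, NanoMaggies, EllipseE, ExpGalaxy and getModelImage are real tractor names; the function and argument names below are mine):

    from tractor import Tractor, RaDecPos, NanoMaggies
    from tractor.galaxy import ExpGalaxy
    from tractor.ellipses import EllipseE

    def tractor_model_stamp(tim, ra, dec, zflux, rhalf, e1, e2):
        """Render one injected EXP galaxy as a Tractor model, using the PSF,
        WCS and photometric calibration already attached to `tim`."""
        src = ExpGalaxy(RaDecPos(ra, dec),
                        NanoMaggies(z=zflux),        # flux in nanomaggies
                        EllipseE(rhalf, e1, e2))     # half-light radius and shape
        tr = Tractor([tim], [src])
        return tr.getModelImage(0)                   # model on tim's pixel grid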

name_for_run=chunk23_test1 RANDOMS_FROM_FITS=...ngc_randoms_per_brick.... python $obiwan_code/py/obiwan/kenobi2.py...

DriftingPig commented 5 years ago

test brick used: 1273p255

DriftingPig commented 5 years ago

http://legacysurvey.org/viewer/?brick=1273p255

DriftingPig commented 5 years ago

1st run: empty canvas, Tractor-model galaxies. (Should I include multiple optimize_loop fits?)

DriftingPig commented 5 years ago

[plot attached]

DriftingPig commented 5 years ago

Most points are systematically lower than expected.

DriftingPig commented 5 years ago

2nd run: empty canvas, Tractor-model galaxies, optimize_loop run 3 times.

DriftingPig commented 5 years ago

[image attached] One example of how the sources look.

DriftingPig commented 5 years ago

The median z-band magnitude difference is -0.0078. I think this is small enough compared with 0.045, which is the number that matters. Or maybe this is just a good brick?
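
For reference, a minimal sketch of how these magnitude differences are computed, assuming fluxes in nanomaggies (the Legacy Surveys convention, m = 22.5 - 2.5 log10(flux)); the catalogue column names in the trailing comment are guesses:

    import numpy as np

    def mag_diff(flux_in, flux_out):
        """Output-minus-input magnitude difference for fluxes in nanomaggies."""
        return -2.5 * np.log10(flux_out / flux_in)

    # e.g. the quoted z-band number would be something like
    #   np.median(mag_diff(sim['zflux'], trac['flux_z']))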

DriftingPig commented 5 years ago

next step: make optimize_loop run 3 times.

DriftingPig commented 5 years ago

I tried adding up the fluxes in each source; the flux is slightly lower. [plot attached]

DriftingPig commented 5 years ago

Low-flux targets have higher variation.

DriftingPig commented 5 years ago

The median flux difference is -0.03; the magnitude difference is -0.0077867507934588076.

DriftingPig commented 5 years ago

This is pretty small, lol.

DriftingPig commented 5 years ago

[image attached] This is interesting: this source ends up missing because fitting a source far enough away still affects it a little.

DriftingPig commented 5 years ago

flux difference here is 0.03.

DriftingPig commented 5 years ago

Adding more loop iterations is not making things better. [plot attached]

DriftingPig commented 5 years ago

Some sources are misclassified as PSF. These sources have a systematically higher estimated flux. [plot attached]

DriftingPig commented 5 years ago

[image attached] Another example.

DriftingPig commented 5 years ago

I think it would be a good check to look at the relationship between e1, e2 and the sources classified as PSF.
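
A quick way to do that check, as a sketch over a matched input/output catalogue (the column names e1, e2 and type are assumptions about the matched table):

    import numpy as np

    def ellipticity_by_type(matched):
        """Compare injected |e| for sources fitted as PSF vs. as galaxies."""
        e = np.hypot(matched['e1'], matched['e2'])
        types = np.char.strip(matched['type'].astype(str))
        is_psf = types == 'PSF'
        print('median |e|, fitted as PSF    :', np.median(e[is_psf]))
        print('median |e|, fitted as galaxy :', np.median(e[~is_psf]))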

DriftingPig commented 5 years ago

[image attached] The same happens for sources where the model flux is much higher than the data.

DriftingPig commented 5 years ago

> Some sources are misclassified as PSF. These sources have a systematically higher estimated flux.

Correction: the flux is lower here.

DriftingPig commented 5 years ago

The fact that the variance is much higher in the Obiwan estimates probably comes from the misclassification of galaxies?

DriftingPig commented 5 years ago

Next step: use the truth image instead

DriftingPig commented 5 years ago

Up to this point the bias exists but is not significant. So most of the bias must happen when the real image is present and really, really noisy.

DriftingPig commented 5 years ago

[plot attached] Adding the real image makes things much worse. Now the magnitude difference is 0.074 and the flux difference is -0.286. This is a big enough error to matter.

DriftingPig commented 5 years ago

(Note: rs0 = zeroed image; rs1 = zeroed image with multiple optimize_loop passes; rs2 = real image.)

DriftingPig commented 5 years ago

Let's see how the galaxies look in the real images.

DriftingPig commented 5 years ago

[image attached] Contamination.

DriftingPig commented 5 years ago

[image attached] Super noisy.

DriftingPig commented 5 years ago

It is really hard to tell where the bias comes from with such noisy data. However:

  1. A possible reason could be that a lot of EXP/DEV galaxies are misclassified as PSF. This makes the image noisier, and the earlier test shows that this misclassification also results in a higher variance in the flux measurement.
  2. The Tractor model image is slightly better than the GalSim image, so we should use the Tractor model image instead.

With these two results in hand, I can start a new production run and hopefully get a less biased sample.

DriftingPig commented 5 years ago

There is a clear trend in the magnitude difference for all sources. [plot attached] However, this trend seems to vanish after the ELG selection. [plot attached]

DriftingPig commented 5 years ago

Mean difference for ELGs: g: 0.0783443905058, r: 0.0897248586019, z: 0.00934192112514.

DriftingPig commented 5 years ago

Summing the g flux over the image, real vs. model, within a radius of 7 pixels, and keeping only g fluxes roughly within the ELG cut: the mean here is -0.023866339. If I cut at -0.5, it is -0.019934291. [plot attached]
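
A sketch of this aperture comparison in plain numpy (the pixel coordinates xs, ys of the injected sources are assumed to be known):

    import numpy as np

    def aperture_sum(img, x0, y0, radius=7.0):
        """Sum the pixels of `img` inside a circle of `radius` pixels."""
        yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
        mask = (xx - x0) ** 2 + (yy - y0) ** 2 <= radius ** 2
        return img[mask].sum()

    def aperture_diffs(real_img, model_img, xs, ys, radius=7.0):
        """Model-minus-real aperture flux for each injected source."""
        return np.array([aperture_sum(model_img, x, y, radius) -
                         aperture_sum(real_img, x, y, radius)
                         for x, y in zip(xs, ys)])

The cuts quoted here and in the following comments are then just, e.g., diffs[diffs > -0.2].mean().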

DriftingPig commented 5 years ago

Cutting at -0.2 gives -0.0090571353, which is small enough. Differences beyond 0.2 are probably bad fits, since the model is not systematically 0.2 larger than the real image everywhere.

DriftingPig commented 5 years ago

Those are probably caused by bright sources nearby?

DriftingPig commented 5 years ago

If I only look at ELGs, the difference is -0.065653620146449945; cutting at -0.2 yields -0.026987415766191441. Still large. Not sure why... [plot attached]

DriftingPig commented 5 years ago

(Take a look at the flux distribution of the full chunk23 output.)

DriftingPig commented 5 years ago

I checked several sources whose flux difference is greater than 0.2; they are all contaminated. [image attached]

DriftingPig commented 5 years ago

For real ELGs, if I make a cut at +/-0.2, the flux difference goes to -0.05. Still bad...? [plot attached]

DriftingPig commented 5 years ago

I think it might be necessary to check the variance introduced by the contamination around each source. I feel certain the mask is not good enough. I'm not sure if it's possible, but we should add the variance contributed by neighbouring sources to the fitted faint source.

Another thing: I should try the Tractor model instead of the GalSim model. The Tractor model behaves well enough in the test brick, e.g. the error is tolerable after cutting off flux differences smaller than -0.2. Let's see if the extra bias comes from the GalSim model.

DriftingPig commented 5 years ago

Next step: do a production run to finish ~50 bricks with the Tractor model, using the debug node.

DriftingPig commented 5 years ago

For the simulated catalogue, cutting at +/-0.2 gives 0.07. So maybe the fault lies with GalSim?

DriftingPig commented 5 years ago

bricklists in: /global/cscratch1/sd/huikong/obiwan_Aug/repos_for_docker/obiwan_code/py/obiwan/Drones/obiwan_analysis/preprocess/brickstat/chunk23_test1/UnfinishedBricks.txt

DriftingPig commented 5 years ago

[new run] elg_ngc_new_run, chunk23_0719_per_brick, mpi4py_run

DriftingPig commented 5 years ago

This is a plot of input-output vs. output g flux. It shows that when the g flux is small, there is a (roughly) linear cut that removes targets with low flux. [plot attached]

DriftingPig commented 5 years ago

> This is a plot of input-output vs. output g flux. It shows that when the g flux is small, there is a (roughly) linear cut that removes targets with low flux.

This shows that sources with flux lower than ~0.65 go missing, presumably because they are below the detection threshold.
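
A sketch of how that detection-threshold statement can be checked: match the injected positions to the output Tractor catalogue and compute the recovered fraction per input g-flux bin (the matching radius and column names are assumptions):

    import numpy as np
    import astropy.units as u
    from astropy.coordinates import SkyCoord

    def recovered_fraction(sim, trac, bins, match_radius_arcsec=1.0):
        """Fraction of injected sources with an output match, per g-flux bin."""
        c_sim = SkyCoord(sim['ra'] * u.deg, sim['dec'] * u.deg)
        c_out = SkyCoord(trac['ra'] * u.deg, trac['dec'] * u.deg)
        idx, d2d, _ = c_sim.match_to_catalog_sky(c_out)
        found = d2d < match_radius_arcsec * u.arcsec
        edges = np.asarray(bins)
        frac = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            sel = (sim['gflux'] >= lo) & (sim['gflux'] < hi)
            frac.append(found[sel].mean() if sel.any() else np.nan)
        return np.array(frac)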

DriftingPig commented 5 years ago

And if I compute the mean flux difference above 0.75, the magnitude difference goes to 0.03, which is much better.

DriftingPig commented 5 years ago

So there should be a lower limit on the injected g, r, z fluxes in the code; below this limit sources fall under the detection threshold.
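
A sketch of that proposed change: drop randoms whose injected flux falls below a per-band floor before injection. The ~0.65 nanomaggies g-band value is read off the plot above; the r- and z-band floors here are placeholders, as are the column names:

    import numpy as np

    # assumed per-band floors in nanomaggies; only the g value comes from the plot
    FLUX_FLOOR = {'g': 0.65, 'r': 0.65, 'z': 0.65}

    def apply_flux_floor(randoms):
        """Keep only randoms whose g, r and z fluxes all sit above the floor."""
        keep = np.ones(len(randoms['gflux']), dtype=bool)
        for band in 'grz':
            keep &= randoms[band + 'flux'] >= FLUX_FLOOR[band]
        return {key: val[keep] for key, val in randoms.items()}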