desihub / desisim

DESI simulations
BSD 3-Clause "New" or "Revised" License
16 stars 22 forks source link

Create version/option in quickquasars to generate Lyman alpha mocks for eBOSS #437

Open andreufont opened 5 years ago

andreufont commented 5 years ago

We would like to use quickquasars to generate Lyman alpha mocks for eBOSS. That would be useful for eBOSS, but it would also increase the number of eyes and hands on quickquasars and the London mocks.

@belaa already has a version of specsim that works on eBOSS specs, we just need to work on the rest of quickquasars (the easy part). Today I only started to think about it, but I plan to work on this tomorrow (wifi permitting).

The first thing we need to decided is: do we want --eBOSS to be an option of quickquasars, or do we want to have a separate script quickquasars-eBOSS.py ?

The advantage of the first would be that we don't have duplicated code, but it could potentially make our beloved DESI script a bit messier...

I believe the mess would be quite localized though:

Ok, it might be more work than I thought. We probably want to have a different script.

@julienguy @moustakas @sbailey @dkirkby @ngbusca

ngbusca commented 5 years ago

I vote for having a --eboss (all lowercase) option, rather than duplicating code. There's not reason for it to become messier just because of the extra option. What would require more thought is how to get the eboss+boss sky density, but I don't think we absolutely need this from the get-go as long as we have the right mean density.

For the geometric part of the footprint we will need external data (e.g. the list of boss+eboss tiles), (would we add that to a desi repository?) @kdawson1000 might know where to get those input files.

kdawson1000 commented 5 years ago

Nicolas, There are several ways that we can track the eBOSS footprint. I will try to find you in person so we can settle on the easiest way.

julienguy commented 5 years ago

I would vote for

The reason is it would be problematic to force new version of desisim (which is the main DESI simulation repo) each time we make an improvement in the eboss sim. But I agree a single quickquasars code used for both projects would be better.

ngbusca commented 5 years ago

We worked quite a bit yesterday with @belaa it seems that the changes are quite minimal:

andreufont commented 5 years ago

I agree it would be better to have a minimal change in desisim.

However, I agree with Julien that we should also have a Lyman alpha repo where we can play with freely (for instance, for mini-test that Julian is working on).

andreufont commented 5 years ago

I started a branch to do that, but if someone else is already working on a different branch I can move the couple of lines there.

https://github.com/desihub/desisim/tree/eboss-quickquasars

ngbusca commented 5 years ago

After taking a closer look at the desisim code, I've encountered a few more blocking points to the point that I think we should actually decouple from desisim altogether and interface directly with specsim. @belaa could you point us to an example on how given a noiseless, high resolution eboss spectrum we can obtain a noisy version of it using specsim? @dkirkby mentioned that this could be even done as a function of plate/mjd, is that the case?

belaa commented 5 years ago

@ngbusca Here is a notebook on my eboss_config branch that goes through how to simulate eBOSS spectra. As far as I know, I think you have to provide the fluxes for each source you're using either in the config file or as an argument to Simulator.simulate(), so I don't think it's as easy as just giving a plate/mjd - but maybe @dkirkby can confirm this.

dkirkby commented 5 years ago

@belaa is correct that the true spectrum needs to be specified for each simulated fiber.

@ngbusca: what properties for a given plate/mjd do you want to reproduce? (sky, read noise, wlen dispersion, targets, ...)

ngbusca commented 5 years ago

@dkirkby @belaa to zeroth order we would like to get the right s/n, sky, etc. on coadds (spPlates) on average (I think we already have that, right?). But a more realistic way to do it could be: a. for each simulated quasar get its RA,DEC and calculate the plate (or plates) were it would have been observed b. assign to it random fibre (such that the fiber isn't already assigned to another quasar) c. use the info available for this plate/mjd/fiberid (exptime, focal XY, moon, atmospheric depth, ...) to add read noise, sky, etc.

I'm still not sure how a. would interplay with simulating quasar reobservations (ideas?).

Would c. add a lot of overhead? would c. add correlated sky residuals? The zeroth order might be enough by the way.

andreufont commented 5 years ago

Do we really need this, Nico? It sounds quite complicated, specially when taking into account reobservations (and how these targest were selected).

For now, I would work under the assumption that all spectra in DR14Q are random draws from the same SNR distribution, and that the only difference between eBOSS and BOSS_minus_eBOSS footprint is the density of high-z quasars. But it would be great to see these distributions of SNR in BOSS, in eBOSS, for the reobserved quasars...

ngbusca commented 5 years ago

@andreufont I totally agree, that's my order zeroth above and we should definitely try it out and check the SNR, etc. before complicating things. I was just asking looking ahead, certainly not requesting anything new from specsim!

dkirkby commented 5 years ago

Here are some notes as I try to understand the layers of quickquasars. Corrections welcome.

The nested calling sequence is:

desisim.scripts.quickquasars.simulate_one_healpix
  -> desisim.scripts.quickspectra.sim_spectra
    -> desisim.simexp.simulate_spectra
      -> specsim.simulator.Simulator.simulate

At the lowest level, in desisim.simexp.simulate_spectra, specsim is used as follows:

    desi = desisim.specsim.get_simulator(config, num_fibers=nspec,
        camera_output=psfconvolve)
    ...
    desi.atmosphere.seeing_fwhm_ref = obsconditions['SEEING'] * u.arcsec
    desi.observation.exposure_time = obsconditions['EXPTIME'] * u.s
    desi.atmosphere.airmass = obsconditions['AIRMASS']
    desi.atmosphere.moon.moon_phase = np.arccos(2*obsconditions['MOONFRAC']-1)/np.pi
    desi.atmosphere.moon.moon_zenith = (90 - obsconditions['MOONALT']) * u.deg
    desi.atmosphere.moon.separation_angle = obsconditions['MOONSEP'] * u.deg
    ...
    desi.observation.exposure_start = astropy.time.Time(obsconditions['MJD'], format='mjd')
    ...
    desi.instrument.fiberloss_method = 'fastsim'
    ...
    desi.simulate(source_fluxes=flux, focal_positions=xy, source_types=source_types,
                  source_fraction=source_fraction,
                  source_half_light_radius=source_half_light_radius,
                  source_minor_major_axis_ratio=source_minor_major_axis_ratio,
                  source_position_angle=source_position_angle)

However, the following top-level call, from desisim.scripts.quickquasars.simulate_one_healpix, already contains most of the info needed by desi.simulate:

    sim_spectra(qso_wave,qso_flux, args.program, obsconditions=obsconditions,spectra_filename=ofilename,
                sourcetype="qso", skyerr=args.skyerr,ra=metadata["RA"],dec=metadata["DEC"],targetid=targetid,
                meta=specmeta,seed=seed,fibermap_columns=fibermap_columns,use_poisson=False)

What output files do you need for the eboss mode? Are you only simulating quasars, or also contaminants?

ngbusca commented 5 years ago

I tried pushing as far as I could, and eventually I got an error claiming that the requested X,Y position of the fiber was outside the FOV of the telescope. I think this is related to positioning the fibers in the focal plane according to Mayall but trying to simulate them with SDSS, which is why I suggested to give up using desisim altogether...

dkirkby commented 5 years ago

Many of the steps being performed now are not relevant for quasar sims (e.g. setting moon conditions and galaxy size & shape). That's not necessarily a problem, unless it brings in a lot of hardcoded desi assumptions (like the field of view).

andreufont commented 5 years ago

I think it's fair to say that we (Julien?) Copy pasted code to get things running, but if you can identify code that is not relevant feel free to remove it :-)

andreufont commented 5 years ago

Nicolas, There are several ways that we can track the eBOSS footprint. I will try to find you in person so we can settle on the easiest way.

@kdawson1000 - Any progress on this? A part from the issues with the noise simulator, we also need to address the footprint discussion. Could you provide a function / file that defines the footprint for BOSS and for eBOSS?

kdawson1000 commented 5 years ago

I thought Nicolas had a solution to this by simply using the DR16Q catalog.

andreufont commented 5 years ago

Yes, I forgot about this. @ngbusca , let me know if you would like to discuss this (on skype?) or whether you already have a good plan for how to proceed with the eBOSS mocks.

londumas commented 5 years ago

@andreufont, Could I be of any help to have this moving?

belaa commented 5 years ago

An update on my end: I got quickquasars to run on the eboss specsim config, although I'm not sure how to interpret the output. I get three FITS files:

spectra-16-333.fits truth-16-333.fits zbest-16-333.fits

I had to tweak some things in the specsim config file to get it to run which I'm trying to understand, but I was able to run it without any errors.

londumas commented 5 years ago

@belaa, I be happy to have a look at these files. Could you tell me where you ran quickquasar and where are the output files? Could you give me reading permissions?

belaa commented 5 years ago

@londumas I've been running it on my computer but have moved the output files here:

/global/cscratch1/sd/belaa/desi/quickquasar_output

I think you should have reading permissions but let me know if you can't access it for some reason.

londumas commented 5 years ago

@belaa, I don't have permissions. could you run fix_permissions.sh?

belaa commented 5 years ago

@londumas Ok I ran it and it should work now

londumas commented 5 years ago

@belaa, Could you make sure you ran on the first level of the directories? i.e on /global/cscratch1/sd/belaa/? I still don't have permissions.

belaa commented 5 years ago

@londumas Hi Helion, I'm sorry, does it work now?

londumas commented 5 years ago

@belaa, Yes it does. I will have a look at it.

londumas commented 5 years ago

@belaa, everything looks good for the moment. I have plotted a spectrum, it looks good with only two spectrographs instead of three in DESI. However, the flux doesn't go down to 3600 A. Do you know why? The same as for the upper value, it doesn't go above 10,000 A

Could you run on way more (~200,000) so I can look at the footprint, at the noise level, at the magnitude distribution...

andreufont commented 5 years ago

Hi @londumas - there are two things that need to be done:

londumas commented 5 years ago

@andreufont, is it this one https://github.com/desihub/desisim/tree/eboss-quickquasars? If yes, it seems that it will take me a bit of time to get into it since I never had to work with this part of the code. Would you have the time to work on it?

andreufont commented 5 years ago

Hi @londumas, yes, I can take care of the code. But it would be helpful if you could generate a density map of the DR14Q catalog in HEALpix pixels, that I can use to set the footprint

londumas commented 5 years ago

@andreufont, this is very easy to do. What nside? Do you have an example for desi, so I can get the format right?

ngbusca commented 5 years ago

@andreufont @londumas is the idea to use the density map as a PDF for the simulated quasar angular positions? it an issue that we might be imprinting some of the clustering in the real-data sample?

andreufont commented 5 years ago

@ngbusca @londumas - It is not clear what the best strategy is, to be honest.

For DESI mocks, in quickquasars we use Nside=256, as shown in lines 665 / 666 of quickquasars: footprint_healpix_nside=256 # same resolution as original map so we don't loose anything footprint_healpix_weight = load_pixweight(footprint_healpix_nside, pixmap=pixmap)

But that is for the footprint, it does not capture information about the density of quasars.

I agree with Nico that if we use large values of Nside (small pixels) we might imprint some clustering in the mocks, but remember that most of the fluctuations are just shot noise... and in any case it would only affect the quasar autocorrelation.

We could avoid that if we used a smaller value of Nside (larger pixels), but then the foorprint would not be very well resolved. What is probably fine?

May be we can use a high-resolution HEALPix map as a mask, that we apply on top of the low-resolution density map?

An alternative option is to not use quasar density information, and just use the footprint "mask" as we do in DESI mocks. We can then set by hand a density of 16 quasars / sq.deg. in BOSS, and 24 in BOSS+eBOSS (or whatever numbers we have in DR14Q now).

Let's just start with whatever is easier, so that we have something before the Paris meeting.

londumas commented 5 years ago

@andreufont, Yes I agree, it will only imprint correlations on the auto-correlation of quasars. Could you tell me what type of formt you need? Is fits file ok? Something like healpix index, density.

andreufont commented 5 years ago

Hi @londumas , I think the easiest thing would be if you added a function in desisim/py/desisim/eboss.py that reads whatever file you generate, and returns the density of quasars in DR14Q as a function of (ra,dec). You can find a placeholder there, that you would just need to substitute. I believe the rest of the code would already work.

https://github.com/desihub/desisim/blob/eboss-quickquasars/py/desisim/eboss.py

andreufont commented 5 years ago

Hi @londumas - I've added a text file in desisim/py/desisim with the BOSS/eBOSS footprint, and it seems to work fine. You can run quickquasars with --eboss and it downsamples the skewers to match the density of DR14Q.

It is a temporary solution, someone that knows better how desihub works should come up with a more stable solution before asking to merge.

I also don't have the DESI code setup at NERSC (only the master branches), but it would be great if someone could try to run it on all london/v4.0 mocks to generate a first eBOSS mock. While we wait for the proper eBOSS version of specsim (or simspec?) we could just run with --exptime 1000 and it should give SNR in the right ballpark.

londumas commented 5 years ago

@andreufont, Thank you very much. This is very helpful. Did you run it somewhere on Nersc? Is the inhomogenieties in the footprint of eBOSS also taken care of, i.e. the high density in Sequels...

andreufont commented 5 years ago

Hi @londumas - This is the footprint file that I added. I only use different colors to distinguish three regimes, I always struggle with imshow :-(

dens_dr14q

I'm transferring london/v4.0 to my desktop, but it will take several hours... Meanwhile I have tried with several hundred HEALPix pixels and it seems to work.

I can not try at NERSC since at this point I can only use the master branches at NERSC.

andreufont commented 5 years ago

I could try to remove the DR7 quasars, and I could try to smooth the density fluctuations... but I'd rather let someone else do this.

londumas commented 5 years ago

@andreufont, I will take care of this.

andreufont commented 5 years ago

@londumas - if you go to NERSC, and chekcout the eboss-quickquasars branch of desisim, you should be able to generate to use the desisim/etc/quickquasar_slurm_script.sh to generate eBOSS mocks with only the following changes:

outdir=/project/projectdirs/desi/mocks/lya_forest/london/v4.0/eboss-0.0/

command="srun -N 1 -n 1 -c $nthreads quickquasars -i $tfiles --nproc $nthreads --outdir $outdir/spectra-16 --zbest --mags --eboss --seed $seed"

If you wanted we could have a quick skype/slack chat.

londumas commented 5 years ago

@andreufont, Perfect, thank you very much. Let me try this first today when I find the time. Let's call if I fail to do it.

londumas commented 5 years ago

@andreufont, I managed to run quickquasar. Some transmission files did not work. I wonder if it is not because of the error message at the end: BASS-MzLS-BASS-MzLS not a valid photo system. @julienguy do you know anything about that? The test script at then end gives the same error message.

Data: eboss_data

Mocks: eboss_mocks

INFO:quickquasars.py:256:simulate_one_healpix: Select QSOs in BOSS+eBOSS footprint 2438 -> 0
WARNING:quickquasars.py:258:simulate_one_healpix: No intersection with BOSS+eBOSS footprint
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "<me>/Programs/imcgreer/simqso/simqso/sqphoto.py", line 316, in load_photo_map
    photSys = supported_photo_systems[photSysName][survey]
KeyError: 'BASS-MzLS'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/global/common/software/desi/edison/desiconda/20180709-1.2.6-spec/conda/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/global/common/software/desi/edison/desiconda/20180709-1.2.6-spec/conda/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "<me>/Programs/desi/code/desisim/py/desisim/scripts/quickquasars.py", line 629, in _func
    return simulate_one_healpix(**arg)
  File "<me>/Programs/desi/code/desisim/py/desisim/scripts/quickquasars.py", line 421, in simulate_one_healpix
    noresample=True, seed=seed, south=issouth)
  File "<me>/Programs/desi/code/desisim/py/desisim/templates.py", line 2725, in make_templates
    nocolorcuts=nocolorcuts, noresample=noresample, south=south)
  File "<me>/Programs/desi/code/desisim/py/desisim/templates.py", line 2493, in _make_simqso_templates
    qsometa.loadPhotoMap([('BASS-MzLS', 'BASS-MzLS'), ('WISE', 'AllWISE')])
  File "<me>/Programs/imcgreer/simqso/simqso/sqgrids.py", line 1000, in loadPhotoMap
    self.photoMap = sqphoto.load_photo_map(photoSys)
  File "<me>/Programs/imcgreer/simqso/simqso/sqphoto.py", line 319, in load_photo_map
    (photSysName,survey))
ValueError: BASS-MzLS-BASS-MzLS not a valid photo system
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<me>/Programs/desi//code//desisim/bin/quickquasars", line 9, in <module>
    sys.exit(main())
  File "<me>/Programs/desi/code/desisim/py/desisim/scripts/quickquasars.py", line 726, in main
    pool.map(_func, func_args)
  File "/global/common/software/desi/edison/desiconda/20180709-1.2.6-spec/conda/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/global/common/software/desi/edison/desiconda/20180709-1.2.6-spec/conda/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
ValueError: BASS-MzLS-BASS-MzLS not a valid photo system
srun: error: nid00033: task 0: Exited with exit code 1
srun: Terminating job step 11349230.20
#!/bin/bash

downsampling=0.40
idir=/project/projectdirs/desi/mocks/lya_forest/london/v4.0/
outdir=/project/projectdirs/desi/mocks/lya_forest/london/v4.0/eboss-0.0/test-0-0/

if [ ! -d $outdir ] ; then
    mkdir -p $outdir
fi
if [ ! -d $outdir/logs ] ; then
    mkdir -p $outdir/logs
fi
if [ ! -d $outdir/spectra-16 ] ; then
    mkdir -p $outdir/spectra-16
fi

#file=/project/projectdirs/desi/mocks/lya_forest/london/v4.0/0/0/transmission-16-0.fits
file=/project/projectdirs/desi/mocks/lya_forest/london/v4.0//1/137/transmission-16-137.fits

quickquasars -i $file --outdir $outdir/spectra-16 --zbest --mags --eboss --seed 123
londumas commented 5 years ago

@julienguy and @andreufont, sory for the noise. I simply had to update SIMQSO to master. And it works now.

moustakas commented 5 years ago

Tag v1.2.3 of simqso would also have worked: https://github.com/imcgreer/simqso/releases/tag/v1.2.3

londumas commented 5 years ago

@andreufont, I have rerun quickquasar on the branch eboss-quickquasars: /project/projectdirs/desi/mocks/lya_forest/london/v4.0/eboss-0.0. For the moment, everything looks good: footprint, SNR, magnitude, ... I will run do_deltas.py tomorrow and the correlation functions.

Here are things we could work on:

Redshift distribution redshift_distribution

Footprint data footprint_data

Footprint mocks footprint_mocks

belaa commented 5 years ago

@londumas Sorry for the lag - my computer was out of commission for a couple days. I have looked into the code and think I may know why it is showing a limited spectral range, but need to investigate this further. As for running this on more realizations, I'm afraid I am still unfamiliar with the details of how quickquasars works - is each quasar specified by a transmission file? And if so, can you point me to where these are located?

andreufont commented 5 years ago

Hi @londumas - I'm glad it worked, and very curious to see how the correlation/BAO looks like!

A couple of comments on your comments:

H) downsampling in redshift according to the DR14Q n(z) A) there are two options. We could run SDSS-only CoLoRe boxes, but it would be better if we could recycle the DESI boxes. I agree with you that for this the easiest would be to find some sort of function p(z) that we could use to downsample the quasars in a redshift-dependent way to match the distribution in SDSS.

H) Work better on the footprint, the high density of SEQUELS doesn't appear in the mocks A) Well, here it gets tricky. The code assumes that the selection function can be split into a function of redshift, n(z), and an angular selection (density as a function of angular position). However, in SDSS this is not true, since the n(z) changes a lot from BOSS to eBOSS, with the latter having a lot more low-z quasars. This means that the plots of the footprint are highly dependent on the minimum redshift cut used. I used zmin=2.15 to define what I consider the footprint, since this cut is probably the sensible decision for Lyman alpha auto-correlation. With this cut, the SEQUELS area is not that different, I believe... What zmin did you use? In any case, feel free to add an alternative footprint file!

H) The maximum lambda_obs is lower in mocks than in data. In eBOSS the pixels are binned in log lambda, in these mocks they are binned in lambda. A) I would decouple these two tasks. My branch (and the PR I'll submit soon) only tackles the density of quasars and the footprint, while instrument-specific settings should be addressed in a different branch. This is what @belaa is working on, to simulate the eBOSS instrument response. Once she has that part of the code running it should be easy to combine.