AstroPile / FlatironMeeting2024

AstroPile meet-up at the Flatiron Institute
https://astropile.github.io/FlatironMeeting2024/
MIT License

[Data] Adding Chandra Source Catalog data to Astropile #14

Open juramaga opened 5 months ago

juramaga commented 5 months ago

Adding Chandra Source Catalog data to Astropile

This project explores how X-ray data can fit into the Astropile paradigm.

Contacts: Rafael Martinez Galarza

Participants:

Goals and deliverable

Incorporating source-based X-ray images, spectra, and light curves

Resources needed

Need experts on data formats, in particular the Hugging Face dataset format.

Detailed description

Data will come from the Chandra Source Catalog (https://cxc.cfa.harvard.edu/csc/). The catalog contains tables, but also source-based data products such as image cutouts, spectra, and light curves. In a way, these are different modalities of the same objects. Alternatively, one could take the event files (from which all other modalities are derived) and put those in Astropile instead.

juramaga commented 5 months ago

Here is a plan for the hack:

1) Download all the FITS files of cutout images for Chandra Source Catalog sources above a certain significance level (possibly also within a certain off-axis angle). Will probably start with a small set of very significant sources.
2) Download the PSF files associated with those images.
3) Potentially also download the spectra, light curves, and event files.
4) Process the images so they all have the same size, either 224px or 96px.
5) Associate each object (image, spectrum, light curve) with basic keywords for the Astropile dictionary, mostly coordinates.
6) Adapt the data to the Astropile format. I need help here.
7) What else is needed?
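For step 4, one option that avoids resampling the pixels is to center-crop large cutouts and pad small ones. The sketch below is only an illustration under that assumption: `fit_to_size` is a hypothetical helper, it works on plain nested lists so it has no dependencies, and whether cropping or interpolation is the right choice for CSC cutouts is still open.

```python
def fit_to_size(image, size, fill=0.0):
    """Center-crop or pad a 2D image (a list of rows) to size x size."""
    n_rows = len(image)
    n_cols = len(image[0]) if n_rows else 0
    # Offset of the output window relative to the input image. A negative
    # offset means the input is smaller than the target and gets padded.
    row0 = (n_rows - size) // 2
    col0 = (n_cols - size) // 2
    out = []
    for r in range(row0, row0 + size):
        row = []
        for c in range(col0, col0 + size):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            row.append(image[r][c] if inside else fill)
        out.append(row)
    return out


# A 4x4 test image whose pixel value encodes its position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
cropped = fit_to_size(image, 2)   # keeps the central 2x2 pixels
padded = fit_to_size(image, 6)    # adds a one-pixel border of fill values
```

In practice the same logic would be applied to the FITS image array with a target size of 224 or 96.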

juramaga commented 4 months ago

Examples of accessing Chandra Catalog data:

https://github.com/juramaga/CSC2_tutorials/blob/main/CSC21_HEAD20_demo.ipynb

juramaga commented 4 months ago

Also, here are typical images

[Image: Screen Shot 2024-03-26 at 9 52 14 AM]

juramaga commented 4 months ago

Wednesday progress:

juramaga commented 4 months ago

The data format for Chandra X-ray spectra:

{
    "provenance": {"catalog": "CSC", "release": "2.1"},
    "object_id": "2CXO0000+000000",
    "observation_id": 111111,
    "ra": 124.,
    "dec": 0.,
    "flux": array(500),
    "flux_err": array(500),
    "energy_lo": array(500),
    "energy_hi": array(500),
    "energy_mid": array(500),
    "extra": {}
}
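Since grouped spectra will generally have different numbers of bins, filling fixed-length arrays of 500 needs some padding rule. Here is a minimal sketch of one way to build such a record. The helper names, the zero padding, the `n_valid_bins` entry in `extra`, and taking the bin midpoint as `energy_mid` are all assumptions for illustration, not part of any agreed Astropile format.

```python
N_BINS = 500  # fixed array length used in the proposed format


def pad_to_length(values, length=N_BINS, fill=0.0):
    """Right-pad a sequence with `fill` up to `length` entries."""
    values = list(values)
    if len(values) > length:
        raise ValueError(f"spectrum has {len(values)} bins, more than {length}")
    return values + [fill] * (length - len(values))


def make_spectrum_record(object_id, observation_id, ra, dec,
                         energy_lo, energy_hi, flux, flux_err):
    """Assemble one spectrum record with fixed-length arrays."""
    n = len(flux)
    if not (len(energy_lo) == len(energy_hi) == len(flux_err) == n):
        raise ValueError("all spectral arrays must have the same length")
    # Assumption: the bin midpoint stands in for the bin energy.
    energy_mid = [0.5 * (lo + hi) for lo, hi in zip(energy_lo, energy_hi)]
    return {
        "provenance": {"catalog": "CSC", "release": "2.1"},
        "object_id": object_id,
        "observation_id": observation_id,
        "ra": ra,
        "dec": dec,
        "flux": pad_to_length(flux),
        "flux_err": pad_to_length(flux_err),
        "energy_lo": pad_to_length(energy_lo),
        "energy_hi": pad_to_length(energy_hi),
        "energy_mid": pad_to_length(energy_mid),
        # Number of real (unpadded) bins, so readers can mask the padding.
        "extra": {"n_valid_bins": n},
    }


record = make_spectrum_record(
    "2CXO0000+000000", 111111, 124.0, 0.0,
    energy_lo=[0.5, 1.0], energy_hi=[1.0, 2.0],
    flux=[0.1, 0.2], flux_err=[0.01, 0.02],
)
```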

juramaga commented 4 months ago

Code to query the catalog of X-ray sources:


import pyvo as vo

tap = vo.dal.TAPService('http://cda.cfa.harvard.edu/csc2tap') # For CSC 2.0

qry = """
SELECT m.name, m.ra, m.dec, o.obsid, o.obi, o.region_id, o.src_cnts_aper_b,
    o.flux_significance_b, o.flux_aper_b, o.theta, o.flux_bb_aper_b,
    o.gti_mjd_obs, o.hard_hm,o.hard_hs, o.hard_ms, o.var_prob_b, 
    o.var_index_b 
FROM csc2.master_source m, csc2.master_stack_assoc a, csc2.observation_source o, 
    csc2.stack_observation_assoc b, csc2.stack_source s 
WHERE ((a.match_type = 'u') AND (o.flux_bb_aper_b IS NOT NULL) 
    AND (o.src_cnts_aper_b > 50) AND (o.flux_significance_b > 5) 
    AND (o.theta < 5)) AND (m.name = a.name) 
    AND (s.detect_stack_id = a.detect_stack_id and s.region_id = a.region_id) 
    AND (s.detect_stack_id = b.detect_stack_id and s.region_id = b.region_id) 
    AND (o.obsid = b.obsid and o.obi = b.obi and o.region_id = b.region_id)
ORDER BY name ASC
"""
cat = tap.search(qry)

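Since the plan is to start with a small set of very significant sources and relax the cuts later, it may help to make the thresholds adjustable. A rough sketch, where `build_csc_query` is a hypothetical helper that only builds the ADQL text of the query above and does not contact the TAP service:

```python
def build_csc_query(min_counts=50, min_significance=5, max_theta=5):
    """Return the ADQL text of the query above with adjustable cuts."""
    # float() rejects non-numeric input before it is spliced into the query.
    counts = format(float(min_counts), "g")
    signif = format(float(min_significance), "g")
    theta = format(float(max_theta), "g")
    return f"""
SELECT m.name, m.ra, m.dec, o.obsid, o.obi, o.region_id, o.src_cnts_aper_b,
    o.flux_significance_b, o.flux_aper_b, o.theta, o.flux_bb_aper_b,
    o.gti_mjd_obs, o.hard_hm, o.hard_hs, o.hard_ms, o.var_prob_b,
    o.var_index_b
FROM csc2.master_source m, csc2.master_stack_assoc a, csc2.observation_source o,
    csc2.stack_observation_assoc b, csc2.stack_source s
WHERE ((a.match_type = 'u') AND (o.flux_bb_aper_b IS NOT NULL)
    AND (o.src_cnts_aper_b > {counts}) AND (o.flux_significance_b > {signif})
    AND (o.theta < {theta})) AND (m.name = a.name)
    AND (s.detect_stack_id = a.detect_stack_id and s.region_id = a.region_id)
    AND (s.detect_stack_id = b.detect_stack_id and s.region_id = b.region_id)
    AND (o.obsid = b.obsid and o.obi = b.obi and o.region_id = b.region_id)
ORDER BY name ASC
"""


# Tighter off-axis cut for a first small sample; the other cuts keep
# the defaults used in the original query.
qry_small = build_csc_query(max_theta=3)
```

The resulting string can be passed to `tap.search(...)` exactly like `qry`.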
juramaga commented 4 months ago

Code to process X-ray spectra:

import glob
import os

from sherpa.astro import ui


def processing_fn(args):

    # The argument is the path to the directory where the spectral files live
    PATH = args

    targetids = []    # Target ID
    ener_bin_lo = []  # Low end of the energy bin
    ener_bin_hi = []  # High end of the energy bin
    ener_bin_mid = [] # Mid point of the energy bin
    fluxes = []       # Counts/sec/keV
    errors = []       # Error in the counts/sec/keV value

    # We now use Sherpa to extract the spectrum.
    # os.path.join makes this work whether or not PATH ends with a slash.
    for file in glob.glob(os.path.join(PATH, '*pha*')):
        ui.load_pha(file)               # Load file
        ui.ignore('0.:0.5,8.0:')        # Keep only the 0.5-8.0 keV range
        ui.subtract()                   # Subtract background
        ui.group_counts(5)              # Bin counts in energy axis
        pdata = ui.get_data_plot()      # Get the object with the spectral bins
        ener_bin_lo.append(pdata.xlo)
        ener_bin_hi.append(pdata.xhi)
        ener_bin_mid.append(pdata.x)
        fluxes.append(pdata.y)
        errors.append(pdata.yerr)
        # The first 24 characters of the file name identify the source
        targetids.append(os.path.basename(file.strip())[0:24])

    # Return the results
    return {'TARGETID': targetids,
            'spectrum_ene_lo': ener_bin_lo,
            'spectrum_ene_hi': ener_bin_hi,
            'spectrum_ene': ener_bin_mid,
            'spectrum_flux': fluxes,
            'spectrum_flux_err': errors}

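The 24-character target ID taken from the file name can be split into the identifiers used in the catalog query. The sketch below assumes the prefix follows the pattern `acisf<obsid>_<obi>N<version>_r<region_id>`, which matches file names such as `acisf05783_000N028_r0112` but is my reading of the naming scheme rather than something checked against the CSC documentation. `parse_target_id` and the example path are hypothetical.

```python
import os
import re

# Assumed layout of the first 24 characters of a CSC spectral file name.
_TARGET_ID = re.compile(
    r"^acisf(?P<obsid>\d{5})_(?P<obi>\d{3})N(?P<version>\d{3})_r(?P<region_id>\d{4})"
)


def parse_target_id(path):
    """Split a file name into obsid, obi, version, and region_id."""
    name = os.path.basename(path)
    match = _TARGET_ID.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return {
        "target_id": match.group(0),
        "obsid": int(match.group("obsid")),
        "obi": int(match.group("obi")),
        "version": int(match.group("version")),
        "region_id": int(match.group("region_id")),
    }


# Hypothetical path; only the file-name prefix matters here.
ids = parse_target_id("/data/csc/acisf05783_000N028_r0112_pha3.fits")
```

With `obsid`, `obi`, and `region_id` as integers, each spectrum can be matched to the corresponding row returned by the catalog query.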
juramaga commented 4 months ago

Examples of X-ray spectra:

[Image: acisf05783_000N028_r0112_spectrum]
[Image: acisf08942_000N029_r0024_spectrum]
[Image: acisf26141_000N020_r0006_spectrum]

juramaga commented 4 months ago

As a baseline, I'd like to work on representation learning for event files in order to find transients, along the lines of:

https://github.com/villrv/ppae/blob/main/chandra_visualization_samelifetime.ipynb

juramaga commented 4 months ago

I have been able to create an HDF5 file with the Chandra X-ray spectra. It has the following fields:

# Identifiers
name                      Name of the source
obsid                     Chandra observation ID of the field containing the source
obi                       Interval of the observation
region_id                 ID of the source detection within the observation
ra                        Right ascension of the source [deg]
dec                       Declination of the source [deg]

# Spectra
spectrum_ene              Mean energy of the spectral bin [keV]
spectrum_ene_lo           Lower energy end of the spectral bin [keV]
spectrum_ene_hi           Higher energy end of the spectral bin [keV]
spectrum_flux             Normalized spectral value [counts/s/keV]
spectrum_flux_err         Error in the normalized spectral value [counts/s/keV]

# Catalog values
flux_aper_b               X-ray broad band aperture photometry flux [erg/s/cm^2]
flux_bb_aper_b            X-ray broad band spectral flux from a blackbody fit [erg/s/cm^2]
flux_significance_b       S/N ratio of the detection
gti_mjd_obs               Time of the observation [MJD]
hard_hm                   Hard-to-medium hardness ratio
hard_hs                   Hard-to-soft hardness ratio
hard_ms                   Medium-to-soft hardness ratio
src_cnts_aper_b           Net counts in the aperture
theta                     Off-axis angle (from the pointing position)
var_index_b               Variability index in the broad band
var_prob_b                Variability probability in the broad band
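To show how the identifier fields tie the two halves of the file together, here is a minimal sketch of attaching catalog values to spectra through the key `(obsid, obi, region_id)`. `attach_catalog_values` is a hypothetical helper working on plain dictionaries, and the numbers in the example are made up purely to exercise the code.

```python
def attach_catalog_values(spectra, catalog_rows,
                          keys=("obsid", "obi", "region_id")):
    """Merge each spectrum with the catalog row sharing its identifiers."""
    # Index the catalog rows by their identifier tuple for constant-time lookup.
    index = {tuple(row[k] for k in keys): row for row in catalog_rows}
    merged = []
    for spec in spectra:
        row = index.get(tuple(spec[k] for k in keys))
        if row is None:
            continue  # no catalog entry for this spectrum; skip it
        # Spectrum fields are applied last so they win on any name clash.
        merged.append({**row, **spec})
    return merged


# Made-up values, for illustration only.
catalog_rows = [
    {"obsid": 5783, "obi": 0, "region_id": 112, "theta": 1.2, "hard_hs": 0.3},
    {"obsid": 8942, "obi": 0, "region_id": 24, "theta": 3.4, "hard_hs": -0.1},
]
spectra = [
    {"obsid": 5783, "obi": 0, "region_id": 112, "spectrum_flux": [0.1, 0.2]},
    {"obsid": 9999, "obi": 0, "region_id": 1, "spectrum_flux": [0.5]},
]
entries = attach_catalog_values(spectra, catalog_rows)
```

Spectra without a matching catalog row are dropped here; keeping them with empty catalog fields would be an equally reasonable choice.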