2.2.1 Chandra --> added to the text
3.1.2 I rephrased the "3 main levels" to be more flexible, including some of the text given here. The objective is to find (if possible) general data processing steps for all HE data. The idea is that an event-bundle corresponds to this package, which includes all products needed for data analysis. This is just a proposal for now, and we may need to be more precise about the possible content (a topic for a HEIG!)
3.1.3 Ok
3.1.4 Indeed, here we state that we have Stable and Good time intervals; how to generalize this for all HE data is yet to be discussed. This can lead to different multiplicities in 6.3 Fig. 1.
3.1.6 Yes, the text is softened to widen the possibilities for now. Note that the proposed definition of event-list was a bit less restrictive.
3.4 Noted, the text is not yet complete here. It would be interesting to mention a few examples of software, like CIAO, SAS, gammapy... work in progress
6.1 this is interesting, some text is included, and the idea is to present the event-bundle as a package with all the necessary products to analyze the event-list
6.2.1 This section could indeed develop the idea of different instrument or data modes. The Chandra examples can have similar concepts, and names could be defined (but not yet in this document).
6.2.2 Yes, t_min and t_max may be too general, and GTIs may not help (complex to handle). Maybe T-MOC could be proposed here (for data discovery purposes).
6.2.3 this part is definitely to be further discussed and ordered, I am not sure it answers some use cases, maybe this is going a bit too far.
6.3 Indeed, the multiplicities and content of an event are to be updated according to the new text here (I added notes until the figure is updated). Hopefully, we can still manage to have a global structure for most of the HE/VHE facilities. Maybe energy could be an intensity, or something more general (observable, signal...), to be further discussed! And the idea is for this to be a topic for a HEIG ;-)
Update for section 2.2.1:
2.2.1 Chandra
Part of NASA’s fleet of "Great Observatories", the Chandra X-ray Observatory (CXO) was launched in 1999 to observe the soft X-ray universe in the 0.1 to 10 keV energy band. Chandra is a guest observer, pointed-observation mission and obtains roughly 800 observations per year using the Advanced CCD Imaging Spectrometer (ACIS) and High Resolution Camera (HRC) instruments. Chandra provides high angular resolution with a sub-arcsecond on-axis point spread function (PSF), a field of view up to several hundred square arcminutes, and a low instrumental background. The Chandra PSF varies with X-ray energy and significantly with off-axis angle, increasing to R50 ~25 arcsec at the edge of the field of view. A pair of transmission gratings can be inserted into the X-ray beam to provide dispersed spectra with E/DeltaE ~1000 for bright sources. The Chandra spacecraft normally dithers in a Lissajous pattern on the sky while taking data. When constructing X-ray images, this motion must be removed from the time-resolved X-ray event lists, using the motion of optical guide stars tracked by the Aspect camera.
The Chandra X-ray Center (CXC) processes the spacecraft data through a set of Standard Data Processing Level 0 through Level 2 pipelines. These pipelines perform numerous steps including decommutating the telemetry data, applying instrument calibrations (e.g., detector geometric, time-dependent gain, and CCD charge transfer efficiency [CTI] corrections, bad and hot pixel flagging), computing and applying the time-resolved Aspect solution to de-dither the motion of the telescope, identifying good time intervals (GTIs), and finally filtering out bad times and X-ray events with bad status. All data products are archived in the Chandra Data Archive (CDA) in FITS format following HEASARC OGIP standards. The CDA manages the proprietary data period (currently 6 months, after which the data become public) and provides dedicated interactive and IVOA-compliant interfaces to locate and download datasets.
The CXC also provides the Chandra Source Catalog, which in the latest release (2.1) includes data for ~407K unique X-ray sources on the sky and more than 2.1 million individual detections and photometric upper limits. For each X-ray source and detection, the catalog provides a detailed set of more than 100 tabulated positional, spatial, photometric, spectral, and temporal properties. An extensive selection of individual observation, stacked-observation, detection region, and master source FITS data products (e.g., RMFs, ARFs, PSFs, spectra, light curves, aperture photometry MPDFs) is also provided; these are directly usable for further detailed scientific analysis.
Finally, the CXC distributes the CIAO data analysis package to allow users to recalibrate and analyze their data. A key aspect of CIAO is to provide users the ability to create instrument responses (RMFs, ARFs, PSFs, instrument and exposure maps, etc.) for their observations using their choice of spectral models and weightings. The Sherpa modeling and fitting package supports N-dimensional model fitting and optimization in Python, and supports advanced Bayesian Markov chain Monte Carlo analyses.
Comments on other sections (reading through the document sequentially):
3.1.2
Note that definition of data levels can vary significantly from facility to facility (and may not neatly map to separate ObsCore calib_levels).
For example, for Chandra standard data processing:
L0 data are decommutated telemetry split by spacecraft subsystem and recorded as FITS binary tables. For the instruments, these event lists have a time assigned and detector (image pixel) coordinates but no other calibrations are applied. Good time intervals are based solely on lost (A/G) data.
L1 data have the Aspect solution applied so have several additional coordinate systems, including a mapping to world coordinates. Other calibrations that are applied include detector gain, charge transfer inefficiency (CTI) correction, bad and hot pixel identification, event grade determination, and pulse-invariant pulse height determination. Updated good time intervals and event status are determined but not applied to the event list.
L2 data are similar to L1 data but filters are applied to remove bad time intervals and events with bad status.
None of these levels have instrument responses (RMF and ARF) applied, as that is too dependent on assumptions about the source spectrum and weighting as the source dithers across the detector chip(s). We don't construct images or light curves as standard products (although this is done as part of L3 catalog processing, where many additional data product types are created on a detection or source basis).
For observations that use the transmission gratings, grating data products (assign spectral order and wavelength to each event) are created in an intermediate L1.5.
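To make the L1 versus L2 distinction concrete, here is a minimal sketch of the filtering step that separates them: keep only events inside a good time interval and with clean status. This is an illustration, not the CXC pipeline. The function name, the half-open [start, stop) interval convention, and the assumption that a status of zero means "no flag set" are all hypothetical choices made for the example.

```python
import numpy as np


def select_good_events(times, status, gti_start, gti_stop):
    """Return a boolean mask of events to keep (hypothetical helper).

    Assumes GTIs are sorted, non-overlapping, half-open [start, stop),
    and that status == 0 means no bad-status flag is set.
    """
    times = np.asarray(times, dtype=float)
    status = np.asarray(status)
    gti_start = np.asarray(gti_start, dtype=float)
    gti_stop = np.asarray(gti_stop, dtype=float)

    if gti_start.size == 0:
        # No good time at all: reject every event.
        return np.zeros(times.shape, dtype=bool)

    # Index of the last GTI starting at or before each event (-1 if none).
    idx = np.searchsorted(gti_start, times, side="right") - 1
    in_gti = (idx >= 0) & (times < gti_stop[np.clip(idx, 0, None)])

    return in_gti & (status == 0)


# Example: five events, two GTIs, one event with a bad status flag.
mask = select_good_events(
    times=[1.0, 5.0, 12.0, 20.0, 25.0],
    status=[0, 0, 0, 1, 0],
    gti_start=[0.0, 10.0],
    gti_stop=[6.0, 22.0],
)
```

In this picture an L1-like event list carries the `status` column and the updated GTIs alongside all events, while an L2-like list is what remains after the mask is applied.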
3.1.3
"The characterization and estimation of this background is particularly important ...". I would say "... may be particularly important ...".
While we do from time to time have intervals of flaring background due to the orbital radiation environment, the detector background on Chandra is typically very low (and Chandra resolves the cosmological background due to AGN). So the background may not be particularly significant for many science cases for Chandra observations.
3.1.4 (added after reading the entire document)
I think there is an inherent (but perhaps unintentional) implication here that STIs and GTIs are just different names for the same concept. However (see my comments under 6.3) I think they are different and represent different concepts.
3.1.6
While distributing event data and associated IRFs together may be efficient for some projects, I'm not convinced that it is a good choice for all. For Chandra, we leave it to the end user to compute responses using tools provided with the CIAO data analysis package. This is done because the responses depend on the assumed source spectral model and also on the dither history of the source on the detector during the time interval that the observer is interested in (to properly weight the responses - which vary with position on the ACIS CCDs - as a function of source position on the detector).
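As a rough illustration of why a pre-computed response is hard to ship with the event list, the sketch below averages position-dependent responses using the time the source spends at each detector position. This is not the CIAO algorithm; the function name, the array shapes, and the idea of reducing the dither history to a short list of dwell times are assumptions made for the example.

```python
import numpy as np


def dither_weighted_response(responses, dwell_times):
    """Time-weighted mean of per-position responses (hypothetical helper).

    responses   : shape (n_positions, n_energy_bins), e.g. effective area
                  per energy bin at each detector position.
    dwell_times : shape (n_positions,), time spent at each position
                  within the interval the observer selected.
    """
    responses = np.asarray(responses, dtype=float)
    weights = np.asarray(dwell_times, dtype=float)
    weights = weights / weights.sum()  # normalize to fractions of time
    return weights @ responses


# Two detector positions, two energy bins; the source spends 1/4 of the
# selected time at the first position and 3/4 at the second.
arf = dither_weighted_response(
    responses=[[100.0, 200.0], [300.0, 400.0]],
    dwell_times=[1.0, 3.0],
)
```

Changing the time filter changes `dwell_times`, and therefore the result, which is the reason the responses are left for the end user to compute.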
"... at least time, sky position, energy ..."
I think this is too restrictive. Depending on level, Chandra event lists may not have estimated physical parameters for "sky position" or "energy". They will typically have a position for each event in some coordinate system, but not necessarily a "sky position", and a measure of pulse height, but not necessarily a physical "energy".
3.4
"CiaO" --> "CIAO" (Chandra Interactive Analysis of Observations). CIAO is not specific to Chandra, and includes a number of general purpose tools that can be (and have been) used with data from other facilities (XMM and Suzaku are a couple I recall). However, CIAO is needed for Chandra data since it includes tools specifically for recalibrating Chandra instrument data and computing Chandra responses.
6.1
Chandra data products are split into 3 "packages": primary, secondary, and supporting.
Primary products include around half a dozen different types of products necessary to analyze Chandra data (for example, L2 event list, PHA spectrum, Aspect solution, bad pixel map, spacecraft ephemeris, V&V Report).
Secondary products add in another 15 or so data products that are needed to recalibrate the data with updated calibrations (this is typically because several "time-dependent" calibrations can only be updated to their final values every ~3-6 months based on calibration data statistics).
Supporting products include well over a hundred additional products, including L0, that are generally not used by end-users.
The Chandra Data Archive by default distributes primary and secondary packages (as tar files), because that is what most end users want for their analysis.
I'm unclear whether such packages would meet the precise definition of an event-bundle? We have typically discussed an event-bundle in terms of an event-list plus IRFs, but as previously discussed this does not really make sense for Chandra. However, for Chandra, the primary (or primary and secondary) packages do make sense as a bundle because they provide the ancillary data products needed to compute responses or recalibrate the data. If we restrict the "event-bundle" to mean an event-list plus responses, then I think we need a concept for the analog of the Chandra packages that include other products needed for data analysis.
6.2.1
The suggested expansion of the obs_collection concept would be very beneficial and address several issues raised in the past by Arnold Rots, e.g., observation type or class (e.g., calibration). Different instrument or data modes for the same instrument may significantly alter the data coming from an instrument. For example for the Chandra ACIS instrument, TIMED readout mode produces event lists with 2-dimensional image coordinates, whereas CONTINUOUS-CLOCKING readout mode produces event lists with 1-dimensional image coordinates and the 2nd dimension is effectively collapsed.
The definition of s_ra, s_dec as the "center of the observation" is ambiguous. For Chandra: (1) The Chandra Source Catalog creates region event lists in a field cutout surrounding each detected source. The center of the observation/telescope pointing coordinates would not be included within the field of the vast majority of these event lists, so these would not be useful for identifying such data products using ObsCore. (2) Similarly, because the spatial extent of the Chandra PSF varies dramatically across the field, this definition does not work well for far off-axis observations (like many calibration observations) where the telescope pointing coordinates would be outside the field of the observation.
Would all high energy astrophysics facilities use the same energy scale for em_min, em_max which range from keV to TeV? Should we standardize on eV as a base unit?
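For reference, ObsCore defines em_min and em_max as wavelengths in metres, so whichever energy unit a facility uses natively has to be normalized first. The sketch below shows one way that could look, going through eV as the common base unit. The helper names and the unit table are hypothetical; the value of h*c is the CODATA 2018 figure, rounded.

```python
# h*c in eV * m (CODATA 2018, rounded); used for lambda = h*c / E.
HC_EV_M = 1.239841984e-6

# Hypothetical lookup table of multipliers to eV.
TO_EV = {"eV": 1.0, "keV": 1e3, "MeV": 1e6, "GeV": 1e9, "TeV": 1e12}


def to_ev(value, unit):
    """Convert an energy in the given unit to eV."""
    return value * TO_EV[unit]


def energy_band_to_em(e_min, e_max, unit):
    """Return (em_min, em_max) in metres for an energy band.

    The bounds swap: the highest energy gives the shortest wavelength.
    """
    e_lo = to_ev(e_min, unit)
    e_hi = to_ev(e_max, unit)
    return HC_EV_M / e_hi, HC_EV_M / e_lo


# Chandra's nominal 0.1 to 10 keV band expressed as wavelengths.
em_min, em_max = energy_band_to_em(0.1, 10.0, "keV")
```

With a table like this, a TeV facility and a keV facility end up on the same scale regardless of how their native products are labelled.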
6.2.2
Particularly for many Chandra Source Catalog stacked (co-added) observation data products, the current definition of t_min, t_max may not be very useful for identifying data products, as the stack start-to-stop times may span an interval of many years, even though the actual data coverage is only a few Ms.
On the other hand, I'm not sure how useful GTIs would be here, at least in the Chandra case. Typically there are a few or a few dozen GTIs in a single observation (and most bad time gaps are small), but there can be many thousands. This is particularly the case for observations taken in an instrument mode that results in telemetry saturation (there are sometimes reasons one would want to do this).
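The gap between the start-to-stop span and the actual coverage can be put in numbers with a small sketch like the following. The function name and the returned keys are hypothetical, and the GTIs are assumed to be non-overlapping (start, stop) pairs in the same time unit.

```python
def coverage_summary(gtis):
    """Compare the t_min..t_max span with the summed GTI exposure.

    gtis: non-empty list of non-overlapping (start, stop) pairs.
    """
    gtis = sorted(gtis)
    t_min = gtis[0][0]
    t_max = max(stop for _, stop in gtis)
    exposure = sum(stop - start for start, stop in gtis)
    span = t_max - t_min
    return {
        "t_min": t_min,
        "t_max": t_max,
        "span": span,
        "exposure": exposure,
        "fill_fraction": exposure / span,
    }


# Three short intervals spread over a long baseline, as in a stack.
summary = coverage_summary([(0.0, 50.0), (1000.0, 1050.0), (100000.0, 100100.0)])
```

A very small `fill_fraction` is the signature of the stacked-observation case described above: t_min and t_max alone say little about when data were actually taken, while listing every GTI may be too heavy for discovery.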
6.2.3
Facilities that follow the HEASARC OGIP standard will have separate RMFs and ARFs instead of a combined response, so we have to include these categories.
How do we connect the IRFs to the information used to create them (e.g., spectral models, PSFs, weights; presumably these would be linked to InstrumentResponseFunction in Figure 1 in the same way that PSF is)? It doesn't seem to me that irf_description is flexible enough to capture all the details in a one line explanation.
6.3
In Fig. 1 it appears that there is a one-to-one relationship between StableTimeInterval and InstrumentResponseFunction. However, for Chandra the responses are usually integrated over the total observation (i.e., over all the GTIs), or over the observation run through a time filter, because the responses don't change so quickly (and usually we integrate over the entire PSF as well).
Maybe part of the issue here is that in 3.1.4 we talk about STIs and GTIs in the same way and I'm incorrectly equating the two. I think there is a difference between them in that the implication is that something changes between STIs, whereas the gaps between GTIs typically exist because data are missing or some bad effect is happening in the gaps, but there is no implication of changed responses (or anything) between GTIs. So perhaps instead we need to carefully differentiate STIs and GTIs?
As noted above, I would expect there to be a way to connect (e.g.,) spectral models, weights, ... to the InstrumentResponseFunction in Fig. 1.
Also in Fig. 1, as discussed earlier physical "energy" is too specific for an event.