Need to propagate NANT and NPOL properly in the header of stats files

vivekvenkris commented 1 month ago

Currently the header of an example stats file contains

HEADER       DADA
HDR_VERSION  1.0
HDR_SIZE     4096
DADA_VERSION 1.0

FILE_SIZE    624999997440
FILE_NUMBER  0

UTC_START    1708082168.958020
MJD_START    60356.47024305579093

SOURCE       J1644-4559
RA           16:44:49.27
DEC          -45:59:09.7
TELESCOPE    MeerKAT
INSTRUMENT   CBF-Feng
RECEIVER     L-band
FREQ         929562500.000000
BW           13375000.000000
TSAMP        156796.411215
STOKES       I

NBIT         8
NDIM         1
NPOL         1
NCHAN        64
NBEAM        1
ORDER        FPA

CHAN0_IDX    320
OBS_NCHAN    4096
OBS_FREQUENCY 1284000000.000000
OBS_BW       856000000.000000
NSAMP        1
NDMS         1
DMS          0
COHERENT_DM  0.000000
STOKES_MODE  I
OBS_OFFSET   0
OBS_OVERLAP  0

Some parameters, like NBEAM, NDMS, DMS and STOKES_MODE, are irrelevant for this file. NPOL is incorrectly set. NDIM is also incorrect (if we assume there are 4 values per FPA). NANT is missing.

ewanbarr commented 1 month ago

Can we be specific about what the values of these things should be:

What is NPOL for stokes data?
What is NDIM for stokes data?
NDIM = 4 for FPA data doesn't really make sense

The DSPSR documentation states: NDIM: dimension of each time sample (1=real; 2=complex) [default: 1]

i.e. there is no guarantee that the inner dimension of FPA data is 4 floats. It is a structure, hence NDIM should = 1.

Is NANT the _header.nantennas?

ewanbarr commented 1 month ago

Currently there is a generic output writer that takes a DescribedVector of any type and writes a header. If substantially different behaviours are needed for each type, this can be achieved either by specialising the create_stream method:

template <typename VectorType>
FileOutputStream&
MultiFileWriter<VectorType>::create_stream(VectorType const& stream_data,
                                           std::size_t stream_idx)

So that it reads:

template <typename T>
FileOutputStream&
MultiFileWriter<FPAStatsH<T>>::create_stream(FPAStatsH<T> const& stream_data,
                                           std::size_t stream_idx)

Alternatively we can use type_traits:

if constexpr (std::is_same_v<VectorType, FPAStatsH<typename VectorType::value_type>>)
{
// Do FPAStatsH stuff.
}
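
As a minimal, self-contained sketch of the type_traits route (C++17): the stand-in types `FPAStatsH` and `OtherVector` below are hypothetical reductions that only carry a `value_type`, and `header_branch` is an illustrative name, not part of skyweaver. It returns a label so that the compile-time choice is easy to see.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for the real containers; only value_type matters here.
template <typename T> struct FPAStatsH { using value_type = T; };
template <typename T> struct OtherVector { using value_type = T; };

// Returns a label naming the branch taken, so the dispatch is observable.
template <typename VectorType>
std::string header_branch(VectorType const&)
{
    if constexpr (std::is_same_v<VectorType,
                                 FPAStatsH<typename VectorType::value_type>>) {
        return "fpa-stats"; // FPAStatsH-specific header keys would be set here
    } else {
        return "generic";   // default header keys only
    }
}

// Usage: the branch is selected at compile time from the argument type.
inline void header_branch_example()
{
    assert(header_branch(FPAStatsH<float>{}) == "fpa-stats");
    assert(header_branch(OtherVector<float>{}) == "generic");
}
```

The discarded branch is never instantiated, so FPAStatsH-only member calls can sit inside the first branch without breaking other vector types.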

vivekvenkris commented 1 month ago

Can we be specific about what the values of these things should be:
1. What is NPOL for stokes data?

Should be 4 according to usual convention

2. What is NDIM for stokes data?

One, as they are intensities.

3. NDIM = 4 for FPA data doesn't really make sense
The DSPSR documentation states: NDIM: dimension of each time sample (1=real; 2=complex) [default: 1]

i.e. there is no guarantee that the inner dimension of FPA data is 4 floats. It is a structure, hence NDIM should = 1.

Yes, to do this properly we need NDIM 1 plus a new header parameter like DTYPE MOMENTS. The equivalent would be DTYPE INTENSITY and DTYPE VOLTAGE for other files?
4. Is NANT the _header.nantennas?

It should be the number of antennas in the file: _header.nantennas if not already padded, SKYWEAVER_NANTENNAS if padded.

vivekvenkris commented 1 month ago

NBIT 8 is also incorrect for FPA since they are floats.

ewanbarr commented 1 month ago

NBIT 8 is also incorrect for FPA since they are floats.

This is just due to stale values from the default header template we use:

std::string default_dada_header = R"(
HEADER       DADA
HDR_VERSION  1.0
HDR_SIZE     4096
DADA_VERSION 1.0

FILE_SIZE    100000000000
FILE_NUMBER  0

UTC_START    1708082229.000020336 
MJD_START    60356.47024305579093

SOURCE       J1644-4559
RA           16:44:49.27
DEC          -45:59:09.7
TELESCOPE    MeerKAT
INSTRUMENT   CBF-Feng
RECEIVER     L-band
FREQ         1284000000.000000
BW           856000000.000000
TSAMP        4.7850467290
STOKES       I

NBIT         8
NDIM         1
NPOL         1
NCHAN        64
NBEAM       800
ORDER        TFB

CHAN0_IDX 2688
)";

Does NBIT 32 == float or int or unsigned int?
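
To illustrate the ambiguity, here is a small sketch (C++17) showing that a bit width alone does not separate `float`, `int32_t` and `uint32_t`. The helpers `nbit_of` and `numeric_kind`, and the labels `FLOAT`, `INT` and `UINT`, are hypothetical and not an established DADA convention; they only show one way a writer could derive both values from the element type.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical helper: NBIT as the width of one element in bits.
template <typename T>
constexpr std::size_t nbit_of()
{
    return sizeof(T) * CHAR_BIT;
}

// Hypothetical helper: a label separating types that share a width.
// The strings are illustrative, not an established DADA convention.
template <typename T>
std::string numeric_kind()
{
    if constexpr (std::is_floating_point_v<T>) {
        return "FLOAT";
    } else if constexpr (std::is_signed_v<T>) {
        return "INT";
    } else {
        return "UINT";
    }
}

inline void nbit_example()
{
    // Same width, so NBIT cannot tell these apart...
    assert(nbit_of<std::int32_t>() == nbit_of<std::uint32_t>());
    // ...but the kind label can.
    assert(numeric_kind<std::int32_t>() != numeric_kind<std::uint32_t>());
}
```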

ewanbarr commented 1 month ago

From your answer I think it makes sense to have a private create_stream_impl method on MultiFileWriter that can be specialised by type.
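
A rough sketch of what that could look like, using private overloads rather than true specialisation (a single member function of a class template cannot be partially specialised in C++, so covering every `FPAStatsH<T>` needs overloading or a similar dispatch). `WriterSketch`, `OtherVector` and the returned labels are hypothetical; the real method would open a file stream rather than return a string.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the real containers.
template <typename T> struct FPAStatsH {};
template <typename T> struct OtherVector {};

// Hypothetical, reduced writer: returns a label instead of opening a file.
template <typename VectorType>
class WriterSketch
{
  public:
    std::string create_stream(VectorType const& data)
    {
        return create_stream_impl(data);
    }

  private:
    // Fallback used for any type without a dedicated overload.
    template <typename U>
    static std::string create_stream_impl(U const&)
    {
        return "generic";
    }

    // More specialised overload, preferred for every FPAStatsH<T>.
    template <typename T>
    static std::string create_stream_impl(FPAStatsH<T> const&)
    {
        return "fpa-stats";
    }
};

inline void writer_sketch_example()
{
    WriterSketch<FPAStatsH<float>> stats_writer;
    WriterSketch<OtherVector<float>> other_writer;
    assert(stats_writer.create_stream(FPAStatsH<float>{}) == "fpa-stats");
    assert(other_writer.create_stream(OtherVector<float>{}) == "generic");
}
```

The public interface stays the same for all types; only the private implementation differs, and adding a new type means adding one more overload.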

vivekvenkris commented 1 month ago

I think it works well, I have changed a few things on the MultiFileWriter side to support skycleaving - so let's discuss those changes first before implementing.

I will push these changes today - just doing last minute checks

ewanbarr commented 4 weeks ago

I had a bash at this anyway and have a solution that may or may not be temporary. Rather than have specialisations of create_stream, I just added a helper struct for getting some parameters of each type:

template <typename T, typename Enable = void> struct HeaderFormatter;

template <typename T>
struct HeaderFormatter<T, typename std::enable_if_t <
    std::is_same<T, FPAStatsH<typename T::value_type>>::value || 
    std::is_same<T, FPAStatsD<typename T::value_type>>::value>>
{
    void operator()(T const& stream_data, 
                    ObservationHeader const& obs_header,
                    PipelineConfig const& config,
                    Header& output_header){
        output_header.set<std::size_t>("NCHAN", stream_data.nchannels());
        output_header.set<std::size_t>("NSAMP", stream_data.nsamples());
        output_header.set<std::string>("STOKES_MODE", "I");
        output_header.set<std::size_t>("NPOL", stream_data.npol());
        output_header.set<std::size_t>("NDIM", 1);
        output_header.set<std::size_t>("NBIT", 32);
        output_header.set<std::size_t>("NANT", stream_data.nantennas());
        output_header.set<std::string>("DTYPE", "MOMENTS");       
    }
};

Other parameters are considered default and are put in the header for all types.
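
For anyone wanting to try the pattern in isolation, here is a self-contained sketch of the same formatter with stand-in types. `Header`, `ObservationHeader`, `PipelineConfig`, `FPAStatsH` and `FPAStatsD` below are hypothetical reductions, not the real skyweaver classes, and the dimensions they return are arbitrary placeholder values.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the real Header class: stores values as strings.
struct Header {
    std::map<std::string, std::string> values;
    template <typename T>
    void set(std::string const& key, T const& value)
    {
        std::ostringstream os;
        os << value;
        values[key] = os.str();
    }
};

// Hypothetical empty stand-ins; the formatter below does not read them.
struct ObservationHeader {};
struct PipelineConfig {};

// Hypothetical stats containers returning arbitrary placeholder dimensions.
template <typename T>
struct FPAStatsH {
    using value_type = T;
    std::size_t nchannels() const { return 64; }
    std::size_t nsamples() const { return 1; }
    std::size_t npol() const { return 2; }
    std::size_t nantennas() const { return 57; }
};
template <typename T> struct FPAStatsD : FPAStatsH<T> {};

// Primary template is only declared: types without a formatter fail to compile.
template <typename T, typename Enable = void> struct HeaderFormatter;

// Enabled for both the host and device stats containers.
template <typename T>
struct HeaderFormatter<T, std::enable_if_t<
    std::is_same<T, FPAStatsH<typename T::value_type>>::value ||
    std::is_same<T, FPAStatsD<typename T::value_type>>::value>>
{
    void operator()(T const& stream_data,
                    ObservationHeader const& /*obs_header*/,
                    PipelineConfig const& /*config*/,
                    Header& output_header)
    {
        output_header.set<std::size_t>("NCHAN", stream_data.nchannels());
        output_header.set<std::size_t>("NSAMP", stream_data.nsamples());
        output_header.set<std::string>("STOKES_MODE", "I");
        output_header.set<std::size_t>("NPOL", stream_data.npol());
        output_header.set<std::size_t>("NDIM", 1);
        output_header.set<std::size_t>("NBIT", 32);
        output_header.set<std::size_t>("NANT", stream_data.nantennas());
        output_header.set<std::string>("DTYPE", "MOMENTS");
    }
};

// Usage: format a header for a host-side stats container and check two keys.
inline void header_formatter_example()
{
    Header h;
    HeaderFormatter<FPAStatsH<float>>{}(FPAStatsH<float>{}, ObservationHeader{},
                                        PipelineConfig{}, h);
    assert(h.values.at("DTYPE") == "MOMENTS");
    assert(h.values.at("NBIT") == "32");
}
```

Because the primary template has no definition, asking for a `HeaderFormatter` of an unsupported type is a compile-time error rather than a silently default header.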

vivekvenkris commented 3 weeks ago

Hi @ewanbarr, is this committed somewhere?

erc-compact / skyweaver

Need to propagate NANT and NPOL properly in the header of stats files #12