lmorabit / lofar-vlbi

GNU General Public License v3.0

ddf->Delay-Calibration indexing error #97

Open vmahat opened 1 year ago

vmahat commented 1 year ago

The Delay-Calibration part of the pipeline crashes at the beginning of the ndppp_applycal substep of the apply_ddf step.

2023-06-22 14:36:32 DEBUG   genericpipeline.executable_args: Adding node_logging_information
2023-06-22 14:36:32 DEBUG   genericpipeline.executable_args: Writing data map file: /beegfs/general/mahatmav/lofar/long_baseline/3C351//Delay-Calibration_3C351_ep1_$
2023-06-22 14:36:32 INFO    genericpipeline.executable_args: recipe executable_args completed
2023-06-22 14:36:32 INFO    genericpipeline: Beginning step ndppp_applycal
2023-06-22 14:36:32 INFO    genericpipeline: Running task: dppp
2023-06-22 14:36:32 INFO    genericpipeline.executable_args: recipe executable_args started
2023-06-22 14:36:32 INFO    genericpipeline.executable_args: Starting /opt/lofar/lofar/bin/NDPPP run
2023-06-22 14:36:32 DEBUG   genericpipeline.executable_args: Pipeline start time: 2023-06-22T13:07:42
2023-06-22 14:36:32 ERROR   genericpipeline.executable_args: Exception caught: list index out of range
Traceback (most recent call last):
  File "/opt/lofar/lofar/lib64/python2.7/site-packages/lofarpipe/cuisine/WSRTrecipe.py", line 132, in run
    status = self.go()
  File "/opt/lofar/lofar/lib64/python2.7/site-packages/lofarpipe/recipes/master/executable_args.py", line 363, in go
    parsetdict_copy[k] = value[i]
IndexError: list index out of range
2023-06-22 14:36:32 WARNING genericpipeline: dppp reports failure (using executable_args recipe)
2023-06-22 14:36:32 ERROR   genericpipeline: *******************************************
2023-06-22 14:36:32 ERROR   genericpipeline: Failed pipeline run: Delay-Calibration_3C351_ep1_ddf
2023-06-22 14:36:32 ERROR   genericpipeline: Detailed exception information:
2023-06-22 14:36:32 ERROR   genericpipeline: <class 'lofarpipe.support.lofarexceptions.PipelineRecipeFailed'>
2023-06-22 14:36:32 ERROR   genericpipeline: dppp failed

This happens because Delay-Calibration has produced 22 msdpppconcat files, but there are only 21 .pre-cal_ddf.h5 files to apply. The central frequencies of four msdpppconcat files also differ from those of the corresponding .pre-cal_ddf.h5 files (between 130 MHz and 138 MHz); see below.

L541949_SB000_uv_128830FF9t_127MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_129MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_130MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_132MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_134MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_136MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_138MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_140MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_142MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_144MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_146MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_148MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_150MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_152MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_154MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_156MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_158MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_160MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_162MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_164MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_166MHz.msdpppconcat    
L541949_SB000_uv_128830FF9t_168MHz.msdpppconcat    
L541949_SB000_uv_128830FFAt_127MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_129MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_131MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_133MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_135MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_137MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_138MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_140MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_142MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_144MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_146MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_148MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_150MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_152MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_154MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_156MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_158MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_160MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_162MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_164MHz.pre-cal_ddf.h5  
L541949_SB000_uv_128830FFAt_166MHz.pre-cal_ddf.h5

Running make_mslists.py (the script that produces "big-mslist.txt", the list of MS files used as input for ddf) causes ddf to produce the solutions shown here. The last subband had a flagged fraction of 95% and therefore has no ddf h5parm, but Delay-Calibration still produces that subband, as shown, which leads to the index error.

Note that this is non-LoTSS data, with 48 kHz subband width and 16 channels per subband.
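
The mismatch described above can be checked before the apply_ddf step with a small script. The following is a minimal sketch, not part of lofar-vlbi: the helper names `freq_labels` and `compare_labels` are hypothetical, and it assumes the central frequency appears in each file name as `_<N>MHz.`, as in the listings above. The example input rebuilds the two listings from their frequency labels.

```python
import re

# Assumption: the central frequency is encoded in the name as "_<N>MHz."
FREQ_RE = re.compile(r"_(\d+)MHz\.")


def freq_labels(names):
    """Return the sorted integer MHz labels found in the file names."""
    labels = []
    for name in names:
        match = FREQ_RE.search(name)
        if match:
            labels.append(int(match.group(1)))
    return sorted(labels)


def compare_labels(ms_names, h5_names):
    """Report file counts and the labels present on one side only."""
    ms = freq_labels(ms_names)
    h5 = freq_labels(h5_names)
    return {
        "n_ms": len(ms),
        "n_h5": len(h5),
        "ms_only": sorted(set(ms) - set(h5)),
        "h5_only": sorted(set(h5) - set(ms)),
    }


# Frequency labels copied from the two listings above.
ms_labels = [127, 129, 130, 132, 134, 136] + list(range(138, 169, 2))
h5_labels = [127, 129, 131, 133, 135, 137] + list(range(138, 167, 2))
ms_names = ["L541949_SB000_uv_128830FF9t_%dMHz.msdpppconcat" % f for f in ms_labels]
h5_names = ["L541949_SB000_uv_128830FFAt_%dMHz.pre-cal_ddf.h5" % f for f in h5_labels]

report = compare_labels(ms_names, h5_names)
print(report)
```

A non-empty `ms_only` or `h5_only` list, or differing counts, would flag the problem before NDPPP is started.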

lmorabit commented 1 year ago

Hi Vijay -- okay, this is helpful. Do the *.msdpppconcat frequencies match pre-facet-target?

vmahat commented 1 year ago

Hmm, not entirely, it seems. The number of files matches (22), but the central frequencies are inconsistent between 130 MHz and 138 MHz, as above.

Pre-Facet-Target:

L541949_SB000_uv_128830FFAt_127MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_127MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_129MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_129MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_131MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_131MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_133MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_133MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_135MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_135MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_137MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_137MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_138MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_138MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_140MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_140MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_142MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_142MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_144MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_144MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_146MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_146MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_148MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_148MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_150MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_150MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_152MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_152MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_154MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_154MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_156MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_156MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_158MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_158MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_160MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_160MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_162MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_162MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_164MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_164MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_166MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_166MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_168MHz.ms_concat_target_0         
L541949_SB000_uv_128830FFAt_168MHz.ms_concat_target_0_CONCAT  
L541949_SB000_uv_128830FFAt_168MHz.msdpppconcat               
L541949_SB000_uv_128830FFAt_168MHz.msdpppconcat.h5            
L541949_SB000_uv_128830FFAt_168MHz.msdpppconcat.h5imp_gsmcal 

tikk3r commented 1 year ago

> There is also an inconsistent central frequency between 4 msdpppconcat and .pre-cal_ddf.h5 files (between 130 MHz and 138 MHz), see below.

The msdpppconcat files you are talking about come from the concatenation step in Delay-Calibration, right? This inconsistency might be due to a different reference SB being used for the concatenation there and in prefactor. Are they the same between your prefactor and lofar-vlbi runs? Delay-Calibration.parset uses a hardcoded value:

https://github.com/lmorabit/lofar-vlbi/blob/8478a28002a1e6bdb8ad9c65dc7d4b3792a0b35a/Delay-Calibration.parset#L62-L63

While prefactor's default is set to None:

## concatenating the target data
! num_SBs_per_group         =  10                                                                 ## make concatenated measurement-sets with that many subbands, choose a high number if running LBA
! reference_stationSB       =  None                                                               ## station-subband number to use as reference for grouping, "None" -> use lowest frequency input data as reference

You could try setting reference_stationSB = None in your Delay-Calibration parset, or rerunning prefactor/LINC with the same reference SB. The former is probably easiest, as the latter would also require rerunning ddf-pipeline.
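
To illustrate why the reference SB matters, here is a minimal sketch of block-wise grouping. It is a hypothetical illustration, not prefactor's or lofar-vlbi's actual grouping code: the function `group_subbands`, the subband numbers, and the fixed reference value 0 are all made up. It only assumes that group boundaries are aligned to the reference subband and that None means "use the lowest input subband", as the parset comment quoted above says.

```python
def group_subbands(sb_numbers, num_per_group, reference_sb=None):
    """Group subband numbers into blocks of num_per_group.

    Hypothetical illustration: block boundaries are aligned to
    reference_sb; None falls back to the lowest input subband.
    """
    if reference_sb is None:
        reference_sb = min(sb_numbers)
    groups = {}
    for sb in sorted(sb_numbers):
        index = (sb - reference_sb) // num_per_group  # floor division
        groups.setdefault(index, []).append(sb)
    return [groups[i] for i in sorted(groups)]


subbands = list(range(3, 33))  # made-up subband numbers 3..32

# Reference taken from the data (lowest subband, here 3).
by_lowest = group_subbands(subbands, 10)
# Made-up fixed reference, standing in for a hardcoded parset value.
by_fixed = group_subbands(subbands, 10, reference_sb=0)

print([(g[0], g[-1]) for g in by_lowest])
print([(g[0], g[-1]) for g in by_fixed])
```

With the same input subbands, the two references give different group boundaries, so the concatenated files cover different frequency ranges and get different central-frequency labels.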

vmahat commented 1 year ago

Okay, I will re-run Delay-Calibration (since it's quicker) -- we'll see what happens by Monday. Though if this inconsistency between parsets was the cause, then I'm surprised this hasn't happened before.

vmahat commented 1 year ago

The Delay-Calibration has now finished without errors -- thanks Frits. I'll close the ticket, but perhaps we ought to change this in the Delay-Calibration parset.

lmorabit commented 1 year ago

Reopening this issue because the behaviour between pre-facet-target and delay-calibration should be the same. We had to set this because there were issues of inconsistent concatenation if it wasn't set, so this needs further investigation on how to ensure it's always the same.