Open soumyajitmandal opened 7 years ago
I'd prefer it if you wouldn't copy & paste parts of (generic-)pipeline logfiles here (they are bleeping hard to read, and half of them is usually missing), but attach the entire logfile to the post.
Indeed the smoothing fails, which is then the reason why most of the data gets flagged. (That's what NDPPP does if it encounters a NaN as a calibration value it is asked to apply to data.)
@rvweeren (or @darafferty): does the smooth_amps_spline.py script implicitly assume that the data was taken during only one day? E.g. by assuming that the amplitudes can be modeled over the full time range by a low-order polynomial?
@soumyajitmandal: You could try setting spline_smooth2D to False in the Factor parset.
I used spline_smooth2D = False but the error still exists.
parmdbplot.py *4g.merge_amp_parmdbs1 looks fine.
But smooth_amps.py did not work out.
I ran this outside Factor:
factor/scripts/smooth_amps.py 4g.merge_amp_parmdbs1 smooth_amp1_test
and parmdbplot.py *smooth_amp1_test shows the amplitudes and phases are blank. The output message was:
RuntimeWarning: invalid value encountered in greater
high_ind = numpy.where(amp > 5.0)
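The warning comes from NaN values hitting the `>` comparison. A minimal sketch (with fabricated amplitudes, not the actual parmdb values) of why NaN entries slip past numpy.where and how masking makes the threshold NaN proof:

```python
import numpy as np

# Fabricated amplitudes standing in for the parmdb values
amp = np.array([1.0, np.nan, 6.0, np.nan])

# NaN compares False against everything, so NaN entries are never
# selected as outliers (and older NumPy emits the RuntimeWarning here)
high_ind = np.where(amp > 5.0)

# A NaN-proof variant: mask invalid values before thresholding
amp_ma = np.ma.masked_invalid(amp)
high_ind_safe = np.where(amp_ma > 5.0)

print(high_ind[0].tolist(), high_ind_safe[0].tolist())  # [2] [2]
```

Either way only the real outlier (index 2) is found; the difference is that the masked version does the comparison without touching the NaNs.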
I would like to point out that a few months ago I tried Factor on two different nights with 40 subbands, and we fixed the problem when we had two different antennas, one being flagged (so essentially two different time spans): #76
So I repeated the same task on the older run, and it did NOT fail, also giving norm_factor = 1.0011506. The *smooth_amp1_test produced in this case is fine.
The only differences between the two runs are: the full set of subbands, and 3 nights of data.
The previous run was done with a different version of NDPPP than the recent one, so the two parmdbs1 were created with different LOFAR software versions. I think previously, if the interpolation did not find a value, it used to put in zeros, but now it's putting in NaNs instead. Could this be the issue?
I used spline_smooth2D = False but the error still exists.
But the error message you quoted is from smooth_amps_spline.py, i.e. the spline smoothing script. So you should at least get a different error message.
I thought putting spline_smooth2D = False turns off the use of smooth_amps_phases_spline.py, not smooth_amps.py, right? So the error I got from the last run was from smooth_amps.py.
spline_smooth2D = False turns off spline smoothing over the frequency axis. It still does a spline smooth across the time axis. (I think it always uses smooth_amps_phases_spline.py, and smooth_amps.py is not used anymore, if I am correct.)
@rvweeren: Ah, O.K.
When smoothing along the time axis, does the smooth_amps_spline.py script implicitly assume that the data was taken during only one day? (E.g. by assuming that the amplitudes can be modeled over the full time range by a low-order polynomial?)
I checked the smooth_amps_spline.py script, and it creates a time axis:
times = numpy.copy(sorted(parms[key_names[0]]['times']))
Maybe it fails because of that, and it cannot handle a very large gap (although, looking at the code, the spline does not directly use that time axis in the spline fit). The easiest way to debug this is to take the parmdb, run smooth_amps_spline.py manually on it, and check where it fails in the script.
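For what it's worth, a gap like the one between observing nights is easy to spot on a sorted time axis. A hedged sketch (the array and the threshold are made up for illustration; this is not Factor's actual logic):

```python
import numpy as np

# Fabricated sorted time axis: two nights separated by a huge gap
times = np.array([0.0, 10.0, 20.0, 100000.0, 100010.0])

# A gap is a time step much larger than the typical sampling interval
delta = np.diff(times)
gaps_ind = np.where(delta > 5.0 * np.median(delta))[0]
print(gaps_ind.tolist())  # [2]: the gap sits between samples 2 and 3
```

Splitting the solutions at such indices and smoothing each chunk separately is one way to keep a time-axis fit from being dominated by the inter-night gap.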
Hmm, update/correction: apparently it does use smooth_amps.py if spline_smooth2d = False:
if self.parset['calibration_specific']['spline_smooth2d']:
    smooth_amps_task = 'smooth_amps_spline'
else:
    smooth_amps_task = 'smooth_amps'
(from facet_ops.py)
It might be that the script is failing because there are NaN input values. Otherwise I cannot see why high_ind = numpy.where(amp > 5.0) could give an error message. I guess you need to open the scripts, do some debugging, and figure out precisely where it goes wrong, in smooth_amps_spline.py and smooth_amps.py. Give it a try and see how far you get; the scripts are not very complicated (if you still remain stuck, provide the parmdb).
Yeah, indeed the error was with smooth_amps.py this time. So I ran smooth_amps.py on the merged parmdb:
smooth_amps.py 4g.merge_amp_parmdbs1 smooth_amp1_test
I did a print on 'ampl' after this line:
ampl_tot_copy = numpy.copy(ampl)
The values were NaNs, whereas in my successful run a few months earlier, doing the same thing gave me zeros.
You need to get back further; ampl_tot_copy = numpy.copy(ampl) is already too deep into the script. The questions for you to answer are: (1) does the input parmdb contain NaNs, and (2) is that the reason why it fails (because smooth_amps.py is not NaN proof)?
Check channel_parms_real and channel_parms_imag on lines 125/126.
Yes, channel_parms_real and channel_parms_imag also have NaN values. So the input parmdb has NaN values, whereas earlier it used to have zeros.
OK, so it looks like smooth_amps.py is simply not NaN proof.
Hmm, okay. Is it a good idea to put zeros in place of the NaNs, or is that not a good solution?
You should try to edit smooth_amps.py so that it is NaN proof (with minimal other changes). Without having looked at it in detail, I think that should not be very difficult to do.
(I probably do not have time to look at it myself over the next two weeks; after that I might have time to help with it and also check smooth_amps_spline.py, because in the end it is preferable to use spline_smooth2d, as it is more capable of detecting amplitude outliers.)
@soumyajitmandal: Can you put a parmDB with NaNs somewhere where I can find it, to test the code?
@all: What should we do with the flagged data? Replacing the amplitudes with the median value is straightforward, but what should we do with the phases? Setting them to zero would be the simplest. Finding a useful median for phases is not only more complicated; I also don't know whether it is a good idea.
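For reference, one common way to average phases despite the 2*pi wrap (not necessarily what Factor should do here) is a circular mean via unit vectors. A sketch with made-up phases clustered near the wrap:

```python
import numpy as np

phases = np.array([3.0, -3.0, 3.1])  # radians, clustered near +/- pi

# Naive mean is misleading across the wrap
naive = phases.mean()

# Circular mean: average the unit vectors, then take the angle
circular = np.angle(np.mean(np.exp(1j * phases)))

print(naive, circular)
```

Here the naive mean lands near 1.0 rad, far from all three inputs, while the circular mean stays near +/- pi where the phases actually cluster. Whether a "median-like" phase is meaningful at all is exactly the question raised above.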
I did a test by putting zeros instead of NaNs, but the normalisation gets messed up in that process. Might using a masked array be useful instead?
channel_parms_real = numpy.ma.masked_invalid(channel_parms_real)
But in this way, the median function might not work.
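Actually, NumPy has a masked-aware median, so the masked-array route should work. A quick sketch with fabricated values:

```python
import numpy as np

vals = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

# Plain median is not NaN proof: any NaN poisons the result
plain = np.median(vals)

# The masked-array median ignores the invalid entries
masked = np.ma.masked_invalid(vals)
ma_med = np.ma.median(masked)

# Equivalent: drop the masked entries first with .compressed()
comp_med = np.median(masked.compressed())

print(plain, float(ma_med), comp_med)  # nan 3.0 3.0
```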
Attached is the parmdb. lockman_amp_parmdbs1.zip
Including the masked array in a different part seems to be working so far in my case. I put spline_smooth2D = False, which means it is using smooth_amps.py. I changed (line 146):
amp = numpy.ma.masked_invalid(numpy.copy(numpy.sqrt(real**2 + imag**2))).compressed()
Previously, while it was trying to create image31, it was failing, since everything was flagged for the NaN entries and no norm_factor was found. Now images up to image42 have been created, and the parmdbs look fine as well.
Well, having had a look at the parmDB you attached here, I think it would be important to find out why you have so many NaNs in the parmDB. Did you flag large parts of the data? (And why would NDPPP create parmDB entries with NaNs in that case, instead of not creating the entries at all?) Or are there parts of the data where NDPPP couldn't get a solution even though there was data?
Since there is a time gap between the different nights, I think that's what is producing the NaNs. In general, when I processed the different nights' data separately, I did not see the NaN issue.
Well, the smoothing is done on single time series (i.e. separately per antenna, polarization, and channel), and several of these time series are fully flagged.
Btw., here is a version of the script that will not only work with NaNs but also doesn't produce the RuntimeWarnings: smooth_amps.py.txt
I will try this version, thanks a lot. I was trying the temporary fix that I wrote in my previous comment (which I think you also put in the modified text file) and got an error. I reproduced the error outside the pipeline using convert_solutions_to_gain.py:
convert_solutions_to_gain.py .pre-cal_chunk12_126407AFCt_4g.merge_phase_parmdbs .pre-cal_chunk12_126407AFCt_4g.smooth_amp2 gain_test
Traceback (most recent call last):
File "/net/para14/data1/mandal/software/factor_normalize/factor/scripts/convert_solutions_to_gain.py", line 165, in
Has anyone seen this earlier?
I found and fixed a problem in convert_solutions_to_gain.py with parmdbs with large gaps. @soumyajitmandal, can you try your run again? (The fix is only available on the latest master, so if you want to use this with an earlier version of Factor, you'll need to copy the new version into your Factor installation by hand.)
Thanks David! I tried this code last week after we chatted at the busy week, so I do have the parmdbs created with the latest version. I will just do a git pull and rerun it.
Hi David,
it seems like convert_solutions_to_gain.py works now. Probably we need the same kind of fix in reset_amps.py? That's where it's failing now. I tried to run it outside Factor and got a similar error:
reset_amps.py L340794_SB000_uv.dppp.pre-cal_126400A74t_121MHz.pre-cal_chunk12_126407AFCt_4g.convert_merged_selfcal_parmdbs test_parm
Traceback (most recent call last):
File "./reset_amps.py", line 77, in
I think this problem might be fixed by commit 4dc0157. To test it, update Factor, reset the state for the pipeline so that it repeats the convert_merged_selfcal_parmdbs
step, then rerun.
hmm this time I think I encountered a different error:
log4cplus:ERROR No appenders could be found for logger (CEP.ParmDB.EXCEPTION).
log4cplus:ERROR Please initialize the log4cplus system properly.
Traceback (most recent call last):
File "./reset_amps.py", line 77, in
I plotted the convert_merged_selfcal_parmdbs and the test_parmdb (reset_amps.py convert_merged_selfcal_parmdbs test_parmdb). The amplitude part of the test_parmdb is zero everywhere. Also, the test_parmdb* contains only 14 stations (CS301HBA0 being one of them).
I committed a fix to reset_amps.py (similar to the one for convert_solutions_to_gain.py) that might fix the above problem. Can you try it again? As before, you will need to reset the state for the pipeline so that it repeats the reset_amps step, then rerun.
I just fixed a bug that affected both convert_solutions_to_gain.py and reset_amps.py, so unfortunately you'll need to update and reset to the convert_merged_selfcal_parmdbs step.
Ah okay. I will do it and let you know. Thank you.
I tried it. It fails at the same place. I plotted the parmdbs, and I think the error is a bit clearer now: for one of the observations, one of the stations (CS301) was flagged.
parmdbplot.py *.convert_merged_selfcal_parmdbs
CS101: CS101.pdf
CS301: CS301.pdf
The time stamps are different for the two different antennas. I have seen something similar in the smooth_amps stage quite a while ago (#76). Something similar here?
OK, I added a check for missing stations to reset_amps.py, so update and give it another try.
Hi David, after the last update of yesterday on CEP3 I have this error
2017-05-09 11:23:18 ERROR facetselfcal_facet_patch_574: Failed pipeline run: facet_patch_574
2017-05-09 11:23:18 ERROR facetselfcal_facet_patch_574: Detailed exception information:
2017-05-09 11:23:18 ERROR facetselfcal_facet_patch_574: <class 'lofarpipe.support.lofarexceptions.PipelineRecipeFailed'>
2017-05-09 11:23:18 ERROR facetselfcal_facet_patch_574: convert_solutions_to_gain failed
I think it is related to the new commit.
The error seems to be:
2017-05-09 11:23:16 ERROR node.lof004.python_plugin: local variable 'gaps_ind' referenced before assignment
Thanks -- the 'gaps_ind' problem should be fixed now (and I updated it on CEP3).
Okay, now I think it has passed the reset_amps.py stage. By the way, is there a plotting issue with parmdbplot.py? I was checking parmdbplot.py *4g.create_preapply_parmdb, and the 'polar' plot shows the amplitude to be zero everywhere. The real and imaginary parts look fine.
Anyway, now it's failing at the plot-solutions step. But it's due to the size issue. Here it is:
plot_selfcal_solutions.py -p L340794_SB000_uv.dppp.pre-cal_126400A74t_121MHz.pre-cal_chunk12_126407AFCt_4g.merge_selfcal_parmdbs hola
/software/rhel7/lib64/python2.7/site-packages/numpy/ma/core.py:852: RuntimeWarning: invalid value encountered in greater_equal
return umath.absolute(a) * self.tolerance >= umath.absolute(b)
./plot_selfcal_solutions.py:43: RuntimeWarning: invalid value encountered in less
out[out < -np.pi] += 2.0 * np.pi
./plot_selfcal_solutions.py:44: RuntimeWarning: invalid value encountered in greater
out[out > np.pi] -= 2.0 * np.pi
Traceback (most recent call last):
File "./plot_selfcal_solutions.py", line 662, in
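The two plot_selfcal_solutions.py warnings come from NaNs hitting the phase-wrap comparisons at lines 43/44. A minimal NaN-tolerant sketch of such a wrap (assuming the script wraps phases into [-pi, pi]; the function name is mine, not the script's):

```python
import numpy as np

def wrap_phase(out):
    """Wrap phases into [-pi, pi]; NaNs pass through unchanged."""
    out = np.copy(out)
    # Suppress the 'invalid value encountered in less/greater' warnings
    # that NaN entries trigger in the comparisons
    with np.errstate(invalid='ignore'):
        out[out < -np.pi] += 2.0 * np.pi
        out[out > np.pi] -= 2.0 * np.pi
    return out

wrapped = wrap_phase(np.array([4.0, -4.0, np.nan]))
```

Since NaN compares False with everything, the NaN entries are simply left in place; the errstate context only silences the warnings.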
I've modified the plot_selfcal_solutions.py script to work with multiple nights. It should now produce one plot per night (instead of all the nights in a single plot, which was unreadable).
Okay. It will probably work, but it says: global name 'fp' is not defined
Oops -- copy and paste error. Try it now.
The plotting step passed, and the plots look quite fine. Now it's preparing the imaging chunk dataset. I will keep you updated. :)
Good news: the code seems bug-free for multiple nights now! :)
Some issues (not related to the code, though, I think). The image looks bad after combining the three different nights, while the individual images look quite similar. I attach a calibrator image from one of the nights and the same calibrator from multiple nights:
1 night:
3 nights:
Looks like amplitude and TEC had problems (?). Phases are fine.
Amplitude (night1):
Amplitude(night2):
TECscalarphase (night1):
TECscalarphase (night2):
How are the amplitudes applied when we merge different observations? I mean, do they get averaged? Also, do you think it's worth trying to flag the noisy TEC solutions?
Factor doesn't do anything special with data from multiple nights -- they're handled just like data from a single night. So, it will smooth the amplitudes and normalize them in the same way (so a single normalization is done across all three nights). No averaging is done.
It's probably good to flag the periods during night 1 when the solutions are noisy (between hours 6-7 and after hour 8). I'm not sure whether they're the cause of the poor results, though.
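To illustrate the "single normalization across all nights" point above: one factor is derived from the pooled amplitudes and applied to everything, so no per-night averaging takes place. This is a guess at the idea with fabricated data; the NaN-safe median is my choice for the sketch, not necessarily Factor's actual formula:

```python
import numpy as np

# Fabricated per-night amplitude solutions (one flagged entry as NaN)
night1 = np.array([0.9, 1.0, 1.1])
night2 = np.array([1.0, 1.2, np.nan])
night3 = np.array([0.8, 1.0, 1.0])

# One normalization factor over the pooled data from all nights
pooled = np.concatenate([night1, night2, night3])
norm_factor = 1.0 / np.nanmedian(pooled)

# The same factor is applied to every night; nothing is averaged
normalized = [night * norm_factor for night in (night1, night2, night3)]
print(float(np.nanmedian(np.concatenate(normalized))))  # 1.0
```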
It's very odd that the background noise looks essentially at the same level even though you have 3 times the data. I'd perhaps expect the artefacts to stay similar, but the noise should really go down a fair bit.
I guess the "sources" in the 3-day image that pop up near your bright source are not real? Did you check the masks throughout the calibration of the 3-day one?
@darafferty I am also imaging them separately now (i.e. each facet has gone through Factor; I used a single model, added that to every night's dataset, and used gaincal/applycal. The imaging is running now; by tomorrow I hope to see the result).
@darafferty @twshimwell Let's wait till the facet is finished. I checked the full image in the facetselfcal directory (which only has 1/6th of the bandwidth; is there an option to include the whole band? Factor does not have that setting anymore, it's 6 by default). The facetimage run (which is going now) should have the whole BW. I am expecting the background noise might be a bit better than what we are seeing now.
@soumyajitmandal I was wondering: did you make sure that the same "InitSubtract" model was subtracted from all three nights? E.g. by running Initial-Subtract on all three nights together, or by subtracting the model of one night from the other two.
If different models were subtracted from the different nights, then Factor will screw up. (It will just assume that the same model was subtracted from all data and thus treat two of the nights wrong.)
Each night should be init-subtracted separately right?
Let me explain what I did: I processed the three different nights independently up to the init-subtract step. So after the init-subtract step, for each night I have (24, 24, 23; in total 71) low2-model.merge skymodels. In my Factor run, the msfiles directory contains all 71 *.ms files and these low2-model.merge files.
Each night should be init-subtracted separately right?
No! (Feel free to add a few more exclamation marks, blinking effects or so.)
Factor will use one model for all files in one frequency band. If you actually give it files from the same frequency band from which different models have been subtracted, then it will screw up. The way you started it, it will randomly(*) choose one of the three skymodels and use that for adding back the sources to the data from all three nights. So the two nights for which another model was subtracted will be treated wrong.
My suggestion to fix that is to choose the model of one of the three nights, and subtract that from the other two nights.
(*) Well, not actually randomly, but it will be undefined behavior.
Is this the behaviour even if the models are explicitly specified in the Factor parset?
Hi everyone,
I am combining 3 nights of data (full subbands) with Factor. It was going fine until the first amplitude calibration. Probably interpolating them across the different time ranges flagged all the data. Here is the message: