lmorabit / lofar-vlbi

Pipeline finished with ERRORS/WARNINGS #91

Closed cyriltasse closed 1 year ago

cyriltasse commented 1 year ago

Hi,

I'm not sure how to debug this: the pipeline Delay-Calibration.parset finished with the following:

Reading task definition file(s): /opt/lofar/lofar/share/pipeline/tasks.cfg
2023-04-14 12:27:08 WARNING genericpipeline.executable_args: /opt/lofar/lofar/lib64/python2.7/site-packages/lofarpipe/support/utilities.pyc : Using default subprocess module!
2023-04-14 12:27:08 WARNING genericpipeline.executable_args: Debug: registered context Global=0
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: /data/cyril.tasse/DEV_VLBI/lofar_facet_selfcal/facetselfcal.py:69: UserWarning: 
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: This call to matplotlib.use() has no effect because the backend has already
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: or matplotlib.backends is imported for the first time.
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: 
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: The backend was *originally* set to 'TkAgg' by the following code:
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:   File "/data/cyril.tasse/DEV_VLBI/lofar_facet_selfcal/facetselfcal.py", line 41, in <module>
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:     import bdsf
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:   File "/opt/lofar/pyenv-py2/lib64/python2.7/site-packages/bdsf-1.10.1-py2.7-linux-x86_64.egg/bdsf/__init__.py", line 14, in <module>
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:     import matplotlib.pyplot as pl
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:   File "/opt/lofar/pyenv-py2/lib64/python2.7/site-packages/matplotlib/pyplot.py", line 71, in <module>
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:     from matplotlib.backends import pylab_setup
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:   File "/opt/lofar/pyenv-py2/lib64/python2.7/site-packages/matplotlib/backends/__init__.py", line 17, in <module>
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:     line for line in traceback.format_stack()
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: 
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: 
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:   matplotlib.use('Agg')
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: Traceback (most recent call last):
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:   File "/data/cyril.tasse/DEV_VLBI/lofar_facet_selfcal/facetselfcal.py", line 6463, in <module>
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:     main()
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:   File "/data/cyril.tasse/DEV_VLBI/lofar_facet_selfcal/facetselfcal.py", line 6050, in main
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:     raise KeyError('Encountered invalid option {:s} in config file {:s}.'.format(arg, os.path.abspath('facetselfcal_config.txt')))
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: KeyError: 'Encountered invalid option no_beamcor in config file /data/levangelista/LOFAR_VLBI_PIPELINE/RUN_DIR/Delay-Calibration/delay_solve/facetselfcal_config.txt.'
2023-04-14 12:27:17 WARNING genericpipeline.executable_args: 
2023-04-14 12:27:17 ERROR   genericpipeline: *******************************************
2023-04-14 12:27:17 ERROR   genericpipeline: Failed pipeline run: Delay-Calibration
2023-04-14 12:27:17 ERROR   genericpipeline: Detailed exception information:
2023-04-14 12:27:17 ERROR   genericpipeline: <type 'exceptions.KeyError'>
2023-04-14 12:27:17 ERROR   genericpipeline: 'mapfile'
2023-04-14 12:27:17 ERROR   genericpipeline: *******************************************
2023-04-14 12:27:17 ERROR   genericpipeline: LOFAR Pipeline finished unsuccesfully.
2023-04-14 12:27:17 WARNING genericpipeline: recipe genericpipeline completed with errors

I don't see any real errors, just warnings. I've fixed the no_beamcor option, and I'm not sure if there's an easy solution to the Agg/TkAgg one. Where should I go from there?

A side question: with the generic pipeline, how do I start over from a given pipeline step?

lmorabit commented 1 year ago

When you say you've fixed the no_beamcor, what do you mean?

@jwpetley just pushed a fix for this problem.

Restarting the generic pipeline from a specific step requires editing the statefile. There is a prefactor tool for this, or if you're old school you can edit it directly, but that requires being familiar with the statefile. If it failed on the delay_solve step then you'll probably also have to delete the delay_solve directory in the working directory, if it got as far as creating that. This goes for any step that fails or that you want to re-do: if the output already exists, the pipeline will fail (it won't fail if the step is marked successful in the statefile).
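For the directory clean-up part of a restart, a minimal Python sketch could look like the following. It only covers removing the output directory of the step to be re-run; editing the statefile is not shown. The helper name `remove_step_output` and the throwaway demo directory are assumptions made for illustration and are not part of the pipeline.

```python
import os
import shutil
import tempfile


def remove_step_output(working_dir, step_name):
    """Delete working_dir/step_name if it exists; return True if it was removed."""
    step_dir = os.path.join(working_dir, step_name)
    if os.path.isdir(step_dir):
        shutil.rmtree(step_dir)
        return True
    return False


# Demonstration on a throwaway directory (hypothetical layout, not a real run).
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "delay_solve"))

removed_first = remove_step_output(demo_root, "delay_solve")  # directory existed
removed_again = remove_step_output(demo_root, "delay_solve")  # nothing left to remove
print(removed_first, removed_again)

shutil.rmtree(demo_root)  # clean up the demo directory itself
```

In a real run, `working_dir` would be the pipeline's working directory and `step_name` the failed step, for example `delay_solve`.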

If the Agg/TkAgg problem is coming from facetselfcal.py then it's best to open an issue there: https://github.com/rvweeren/lofar_facet_selfcal

jwpetley commented 1 year ago

They removed the no_beamcor argument from the facetselfcal.py script entirely. There is now instead an argument beamcor which can be yes, no or auto. I pushed the change so that the lofar-vlbi pipeline is still compatible with the latest version of the selfcal script. It would crash as above otherwise. I think @cyriltasse has seen and done the same fix.
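To catch this kind of mismatch before launching a run, one could check a config file for option names the script no longer accepts. The sketch below assumes the config consists of `key = value` lines with optional `#` comments. That format, the helper name `find_unknown_options`, and the example option set are illustrative assumptions, not taken from facetselfcal.py.

```python
def find_unknown_options(config_text, known_options):
    """Return option names found in config_text that are not in known_options."""
    unknown = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not key = value
        key = line.split("=", 1)[0].strip()
        if key not in known_options:
            unknown.append(key)
    return unknown


# Hypothetical option set and config content, for illustration only.
known = {"beamcor", "imsize"}
config = """
imsize = 1600
no_beamcor = True
"""

print(find_unknown_options(config, known))
```

With the old `no_beamcor` entry still present, the check reports it as unknown, which mirrors the KeyError raised in the log above.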

I'm unsure what is going on with the Agg issue. I haven't seen that before. I did a full run of facetselfcal.py yesterday using the lofar-vlbi pipeline without issue.
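On the Agg/TkAgg message itself: the warning text in the log says that matplotlib.use() has no effect once pyplot has been imported, and the stack shows `import bdsf` pulling in pyplot before the `matplotlib.use('Agg')` call. A minimal sketch of the usual ordering is below. Setting the `MPLBACKEND` environment variable is an extra precaution that depends on the installed matplotlib version honouring it, so treat that part as an assumption. This is not a patch to facetselfcal.py.

```python
import os

# Choose a non-interactive backend before anything imports matplotlib.pyplot.
os.environ["MPLBACKEND"] = "Agg"

try:
    import matplotlib
    matplotlib.use("Agg")  # must run before pyplot is imported anywhere
    import matplotlib.pyplot as plt  # imports that pull in pyplot (e.g. bdsf) go after this
    backend = matplotlib.get_backend()
except ImportError:
    backend = None  # matplotlib is not installed in this environment

print(backend)
```

In the log above the message is only a warning and the traceback that follows it is a separate failure, so this ordering is cosmetic rather than a fix for the crash.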

cyriltasse commented 1 year ago

> They removed the no_beamcor argument from the facetselfcal.py script entirely. There is now instead an argument beamcor which can be yes, no or auto. I pushed the change so that the lofar-vlbi pipeline is still compatible with the latest version of the selfcal script. It would crash as above otherwise. I think @cyriltasse has seen and done the same fix.

Yes, I had seen your fix @jwpetley: I changed the code in [...]/Delay-Calibration/delay_solve/ and also pulled the latest change in lofar-vlbi/master. What's weird is that the pipeline crashes, but the no_beamcor error is reported as a warning... Actually there are no ERROR-level messages, apart from the final

2023-04-14 12:27:17 ERROR   genericpipeline: *******************************************
2023-04-14 12:27:17 ERROR   genericpipeline: Failed pipeline run: Delay-Calibration
2023-04-14 12:27:17 ERROR   genericpipeline: Detailed exception information:
2023-04-14 12:27:17 ERROR   genericpipeline: <type 'exceptions.KeyError'>
2023-04-14 12:27:17 ERROR   genericpipeline: 'mapfile'
2023-04-14 12:27:17 ERROR   genericpipeline: *******************************************
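Judging from the log in this thread, the traceback of the external script appears to be relayed at WARNING level, so the ERROR lines only carry the pipeline's own summary. The `KeyError: 'mapfile'` is then probably a downstream symptom rather than the root cause. A small sketch for pulling the first relayed exception out of such a log is shown below. The function name and the regular expression are assumptions based on the line format seen here, and the sample is an excerpt of the lines quoted above.

```python
import re

# timestamp, level, logger name, then the relayed message
LINE_RE = re.compile(r"^\S+ \S+ (WARNING|ERROR)\s+([\w.]+): ?(.*)$")


def first_traceback_exception(log_text):
    """Return the exception line that ends the first relayed traceback, or None."""
    in_traceback = False
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue
        message = match.group(3)
        if message.startswith("Traceback (most recent call last):"):
            in_traceback = True
        elif in_traceback and message and not message.startswith(" "):
            # First non-indented line after the stack frames is the exception itself.
            return message
    return None


sample = """\
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: Traceback (most recent call last):
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:   File "/data/cyril.tasse/DEV_VLBI/lofar_facet_selfcal/facetselfcal.py", line 6463, in <module>
2023-04-14 12:27:16 WARNING genericpipeline.executable_args:     main()
2023-04-14 12:27:16 WARNING genericpipeline.executable_args: KeyError: 'Encountered invalid option no_beamcor in config file /data/levangelista/LOFAR_VLBI_PIPELINE/RUN_DIR/Delay-Calibration/delay_solve/facetselfcal_config.txt.'
2023-04-14 12:27:17 ERROR   genericpipeline: <type 'exceptions.KeyError'>
2023-04-14 12:27:17 ERROR   genericpipeline: 'mapfile'
"""

root_cause = first_traceback_exception(sample)
print(root_cause)
```

Running it on the full log file instead of the excerpt would point at the same no_beamcor line, which is the one to fix first.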

> restarting the generic pipeline from a specific step requires editing the statefile - there is a prefactor tool or if you're old school then you can edit it directly, but that requires being familiar with the statefile. If it's failed on the delay_solve step then you'll probably also have to delete the delay_solve directory in the working directory, if it got as far as creating that; this goes for any step that fails or you want to re-do -- if the output already exists, the pipeline will fail (it won't fail if the step is successful in the statefile).

Ok - we're getting close to the weekend, so I'll restart the delay calibration from scratch. Next time I'll try to dig a bit into how things are wired, and restart from the last failed step.

Thanks all for all the info. I'll close this for now and reopen it if it crashes after that no_beamcor fix.