Starlink / starlink

Starlink Software Collection
162 stars 53 forks source link

History text lost on conversion to FITS and back causing PROVSHOW to segfault #32

Open grahambell opened 10 years ago

grahambell commented 10 years ago

ORAC-DR writes SCUBA-2 JSA tiles with history information:

prov-problem-in $ hislist s20140301_00007_850_healpix001244.sdf
...
2: 2014 Apr 29 00:08:40.682 - SETTITLE        (KAPPA 2.1-12)

   Parameters: NDF=@s20140301_00007_850_fmos_1244 TITLE='CRL618'
      MSG_FILTER=! QUIET=!
   Software: /net/ipu/export/data/gbell/star/bin/kappa/ndfpack_mon

And jsawrapdr converts it to FITS including the history:

prov-problem-in $ fitshead jcmts20140301_00007_850_healpix001244_obs_000.fits
...
HISTORY 2: 2014 Apr 29 00:08:40.682 - SETTITLE        (KAPPA 2.1-12)
HISTORY User: gbell   Host: ikaika  Width: 72
HISTORY Dataset: /export/data/visitors/gbell/scratch19/aIN5sFKqj5/s20140301_0000
HISTORY 7_85...
HISTORY Parameters: NDF=@s20140301_00007_850_fmos_1244 TITLE='CRL618'
HISTORY    MSG_FILTER=! QUIET=!
HISTORY Software: /net/ipu/export/data/gbell/star/bin/kappa/ndfpack_mon

Then jsawrapdr converts it back to SDF and the history is missing:

prov-problem-out $ hislist s20140301_00007_850_healpix001238.sdf
...
2: 2014 Apr 28 23:51:05.076 - SETTITLE        (KAPPA 2.1-12)

(and PROVSHOW works on that file).

However when ORAC-DR processes that file, I get a file, also without history:

prov-problem-out $ hislist gs850um_healpix001244.sdf
...
2: 2014 Apr 28 23:51:05.218 - SETTITLE        (KAPPA 2.1-12)

But now PROVSHOW segfaults:

prov-problem-out $ provshow gs850um_healpix001244.sdf
Segmentation fault (core dumped)

(Which prevents jsawrapdr from post-processing that file.)

I was able to put an if statement around the line which I identified with gdb as causing the segfault, so I have the following, which works:

ndg_provenance.c:
3169          if (hrec->text) {
3170             astMapPut0C( kmrec, TEXT_NAME, hrec->text, NULL );
3171          }
3172          else {
3173             astMapPut0C( kmrec, TEXT_NAME, "", NULL );
3174          }

But I'm not sure whether HistRec.text is supposed to be allowed to be NULL or not. If it is then I could commit this change, but if it's supposed to be guaranteed to not be NULL then this isn't the right fix.

timj commented 10 years ago

My inclination is that this is not meant to happen in the sense that a roundtrip is meant to get you back where you started. ndf2fits->fits2ndf is meant to be lossless (and @MalcolmCurrie works hard for that to be true). It still sounds like your patch is a good safety net.

Note that NDF does not start history recording if there is no history structure so to enable history recording ORAC-DR (presumably picard) would have to enable history on the file (and it shouldn't do that on the input file as we probably shouldn't be modifying files given to the pipeline).

MalcolmCurrie commented 10 years ago

Since I have other jobs to complete first this week, I asked Graham to post this as an issue. Some change has derailed the history propagation back from FITS to NDF. It used to work, and I certainly want it going again.

dsberry commented 10 years ago

@grahambell Can you send me, or tell me where to find, the gs850um_healpix001244.sdf file that causes provshow to seg fault?

dsberry commented 10 years ago

Commit 9769826f3 should fix the segfault in provshow. But there is still the issue of why fits2ndf creates blank history text.

MalcolmCurrie commented 10 years ago

This is ironic. FITSIO has changed the way it handles long history headers. It now writes continuation lines in 70-character blocks. In hindsight it would have been better to have dealt with this in COF_WHISR (now CVG_WHISR), making my own paragraphs for all not just some of the history components like Parameters, Arguments, and Group. We're falling foul of long names for the Dataset, Software, and Command elements.

I propose to modify CVG_WHISR to make paragraghs for these elements rather than leave it to FITSIO, which might change, This won't affect old NDF2FITS-created HISTORY, because it's truncated. This change has the advantage of restoring all the information. At present the cycle can be lossy.

I can then make COF_CHISR recognise the continuation lines for these additional elements and recover the full names for those of you who like very long paths.

MalcolmCurrie commented 10 years ago

A second factor was the buffer size for the text records. The SMURF configuration list and group parameters when running MAKEMAP can required several thousand characters. For now I've upped the buffer size, but some dynamic allocation would be better.

MalcolmCurrie commented 10 years ago

Partial fixes are in af56e5ad923f5e609d46ac9ff78df5cb51357cdf and 4acde70891d1b43df9a0b544be7c1ecf1f8ac483 (not the original 32f0666f8ed08 as stated in a commit). There is a change needed to NDF2FITS too, so not closing this issue yet.

MalcolmCurrie commented 10 years ago

I've made a change to NDF2FITS too (0b4b02c1ab61397476b9208de114e3775fa18c5d). @grahambell please reprocess from the NDF (not the FITS) to check that the round trips are now correct again.

timj commented 10 years ago

So can this ticket be closed?

MalcolmCurrie commented 10 years ago

I just wanted confirmation from Graham, but I believe it can be closed. The history records are not quite identical in terms of indentation, but should have the same content.

grahambell commented 10 years ago

David's fix prevented the segfault I was encountering, so the main issue is fixed. The history looked mostly OK as far as I remember -- I'm not sure if the colons in Perl module names were still being a problem or not.