aodn / content

Tracks AODN Portal content and configuration issues

SRS-SST harvester - inewressources issues #83

Closed lbesnard closed 9 years ago

lbesnard commented 9 years ago

@rogerproctor noticed some problems while aggregating SRS data with gogoduck and sent an email to @danfruehauf and @kereid about it. Gogoduck tried to aggregate data (which exists on ncWMS) for 23 March 2015. The file exists here: http://thredds.aodn.org.au/thredds/catalog/srs/sst/ghrsst/L3S-1d/ngt/2015/catalog.html and should have been harvested by the srs_sst harvester.

Looking at dbprod and running the following pgsql queries showed that the file was missing from both the view and the table:

SELECT * FROM srs_sst.srs_sst_l3s_1d_ngt_gridded_url WHERE time > '2015-03-10 00:00:00';
SELECT * FROM srs_sst.indexed_file t WHERE url::text LIKE '%20150323152000-ABOM-L3S_GHRSST-SSTskin-AVHRR_D-1d_night-v02.0-fv01.0.nc%';
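The second query matches on the file name alone, which works because the name carries the product's date and time. A minimal Python sketch of reading that information back out of a name, assuming the layout seen in the file names quoted in this issue (a 14-digit YYYYMMDDhhmmss prefix, then centre and processing level). `parse_ghrsst_name` is a hypothetical helper for illustration and is not part of the harvester:

```python
import re
from datetime import datetime

# Assumed layout, inferred only from the file names quoted in this issue:
# <YYYYMMDDhhmmss>-<centre>-<level>_GHRSST-<rest>.nc
NAME_RE = re.compile(
    r"^(?P<stamp>\d{14})-(?P<centre>[^-]+)-(?P<level>[^_]+)_GHRSST-(?P<rest>.+)\.nc$"
)


def parse_ghrsst_name(name):
    """Return the timestamp, centre and level encoded in a file name."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError("unexpected file name layout: " + name)
    return {
        "time": datetime.strptime(match.group("stamp"), "%Y%m%d%H%M%S"),
        "centre": match.group("centre"),
        "level": match.group("level"),
    }


info = parse_ghrsst_name(
    "20150323152000-ABOM-L3S_GHRSST-SSTskin-AVHRR_D-1d_night-v02.0-fv01.0.nc"
)
print(info["time"].date(), info["centre"], info["level"])
```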

The next step was to look at the harvester's log file, which showed SEVERE messages for many files. According to the newer log files, those files were not re-harvested.

sed -n '101000,101030p' < /mnt/ebs/talend/jobs/srs_sst-srs_sst/log/console.2015-03-17-15-00-01.log

* iNewResources new/readded resource file_id=100664
* iNewResources new/readded resource file_id=100665
* iNewResources new/readded resource file_id=100666
Mar 18, 2015 5:24:55 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 74542110
* iNewResources new/readded resource file_id=100667
Mar 18, 2015 5:24:55 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 80411175
* iNewResources new/readded resource file_id=100668
Mar 18, 2015 5:24:56 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 85250704
* iNewResources new/readded resource file_id=100669
Mar 18, 2015 5:24:56 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 84991000
* iNewResources new/readded resource file_id=100670
Mar 18, 2015 5:24:56 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 86038557
* iNewResources new/readded resource file_id=100671
Mar 18, 2015 5:24:57 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 79761142
* iNewResources new/readded resource file_id=100672
Mar 18, 2015 5:24:57 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 74002940
* iNewResources new/readded resource file_id=100673
Mar 18, 2015 5:24:57 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 71093248
* iNewResources new/readded resource file_id=100674
Mar 18, 2015 5:24:58 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 69070142
* iNewResources new/readded resource file_id=100675
* iNewResources new/readded resource file_id=100676

I then ran this harvester on my machine and local DB, on the 2015 folder only. I didn't get any errors. The files were all harvested.

Does anyone have an idea? Are those files corrupted? @jonescc

anguss00 commented 9 years ago

That error looks to me as though we have corrupted files @lbesnard. Have you tried harvesting those exact files locally (i.e. copying them down from 10-nsp and running against them locally)?

lbesnard commented 9 years ago

I ran the harvester on my machine through an sshfs mount, so exactly on those files.

jonescc commented 9 years ago

Looking at dbprod, 20150323152000-ABOM-L3S_GHRSST-SSTskin-AVHRR_D-1d_night-v02.0-fv01.0.nc was first indexed at "2015-03-25 18:38:02.294013" i.e. a couple of hours after you raised this issue. Its modification date (the date it was copied to production?) was "2015-03-25 12:36:47".

Not sure which files correspond to the errors reported in the logs - there's not enough information provided. All the ids mentioned have been harvested successfully. It's possible the files were harvested successfully in a later run under another id (the new/readded message doesn't look like it would be output if there's an error).

The name of the file for which the error occurred should be reported, but it doesn't look like it is for iNewResources. This should be fixed.

Apart from the one that hadn't been processed, do you have the names of any other files that have been missed?

jonescc commented 9 years ago

Looking at: https://github.com/Unidata/thredds/blob/910db13d3ff90d8910e32b7421cbbfc9d3dad487/cdm/src/main/java/ucar/nc2/iosp/hdf5/H5header.java#L3737

Mar 18, 2015 5:24:57 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 71093248

is NOT an exception, so the file would have been marked as processed by the harvester, and the id reported after it should be for that file. You can check those file ids yourself to see that they have been processed correctly.
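To make that check easier, here is a minimal Python sketch that attaches each SEVERE message to the file_id reported on the next iNewResources line. It assumes the log layout shown in the excerpt above and the reading that the id printed after a message belongs to the file that triggered it. `pair_errors_with_ids` is a hypothetical helper, not something the harvester provides:

```python
import re

FILE_ID_RE = re.compile(r"iNewResources new/readded resource file_id=(\d+)")
SEVERE_RE = re.compile(r"SEVERE: bad version (\d+) at filePos (\d+)")


def pair_errors_with_ids(lines):
    """Return (file_id, file_pos) pairs, one per SEVERE message.

    Assumption: a SEVERE message belongs to the file whose id is
    reported on the next iNewResources line.
    """
    pairs = []
    pending = []  # filePos values seen since the last file_id line
    for line in lines:
        severe = SEVERE_RE.search(line)
        if severe:
            pending.append(int(severe.group(2)))
            continue
        file_id = FILE_ID_RE.search(line)
        if file_id:
            pairs.extend((int(file_id.group(1)), pos) for pos in pending)
            pending = []
    return pairs


# Excerpt copied from the console log quoted earlier in this issue.
sample = """\
* iNewResources new/readded resource file_id=100666
Mar 18, 2015 5:24:55 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 74542110
* iNewResources new/readded resource file_id=100667
Mar 18, 2015 5:24:55 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 80411175
* iNewResources new/readded resource file_id=100668
Mar 18, 2015 5:24:56 AM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 85250704
* iNewResources new/readded resource file_id=100669
"""

for file_id, file_pos in pair_errors_with_ids(sample.splitlines()):
    print(file_id, file_pos)
```

The resulting ids can then be looked up in srs_sst.indexed_file to confirm that each file was processed.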

jonescc commented 9 years ago

PS. Note next line in the referenced source

// G:/work/galibert/IMOS_ANMN-NSW_AETVZ_20131127T230000Z_PH100_FV01_PH100-1311-Workhorse-ADCP-109.5_END-20140306T010000Z_C-20140521T053527Z.nc

@ggalibert may be able to shed some light on the "SEVERE" error here which doesn't stop processing of the file.

ggalibert commented 9 years ago

Well, this file comes from the ANMN velocity timeseries; it shouldn't be harvested by the SRS-SST harvester... What happened, @lbesnard?

jonescc commented 9 years ago

@ggalibert The error happened for an srs_sst file. But I thought you might be able to explain what the error message means and why it doesn't stop processing of the file.

ggalibert commented 9 years ago

I have no idea... To me, a netCDF file is not corrupted as long as ncdump doesn't throw any error when applied to it.

lbesnard commented 9 years ago

// G:/work/galibert/IMOS_ANMN-NSW_AETVZ_20131127T230000Z_PH100_FV01_PH100-1311-Workhorse-ADCP-109.5_END-20140306T010000Z_C-20140521T053527Z.nc

As a heads up, this is just a remnant of an unused context. So it is not 'clean', but this file is neither used nor harvested by the harvester.

ggalibert commented 9 years ago

For what it's worth I've just tested this file and it is not corrupted.

jonescc commented 9 years ago

OK, so I picked one of the files from the log above where errors were being reported.

http://thredds.aodn.org.au/thredds/fileServer/srs/sst/ghrsst/L3C-3d/day/n11/1994/19940210032000-ABOM-L3C_GHRSST-SSTskin-AVHRR11_D-3d_day-v02.0-fv02.0.nc

I can open this in toolsui version 4.6 but in version 4.5 I get

Mar 27, 2015 12:11:50 PM ucar.nc2.iosp.hdf5.H5header$MessageAttribute read
SEVERE: bad version 72 at filePos 85250711

toolsui would be using the netcdf java libraries, whereas ncdump probably isn't? 4.6 is the latest release (thredds 4.6 is currently in beta) and wouldn't be used by the harvester.

So to summarise,

a) I believe the errors logged to the console did not prevent these files from being processed (there's a timeseries record for this file - 100669 - in production). I'd check that the harvest did indeed work as expected for this file; otherwise we may need to look at upgrading to the latest version of the netcdf java library.

b) the missing 23/3/15 files were due to the time lag between files being copied to production and the harvester running - perhaps the harvester can be run more regularly
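For point b), the lag can be read straight off the two dbprod timestamps quoted earlier in this thread. A minimal Python sketch, assuming both timestamps are in the same time zone; `harvest_lag` is a hypothetical helper for illustration:

```python
from datetime import datetime


def harvest_lag(modified, first_indexed):
    """Return the delay between a file landing and its first indexing.

    Both arguments are timestamp strings as printed by the database,
    assumed here to share the same time zone.
    """
    return datetime.fromisoformat(first_indexed) - datetime.fromisoformat(modified)


# Values quoted from dbprod earlier in this issue for the 23 March 2015 file.
lag = harvest_lag("2015-03-25 12:36:47", "2015-03-25 18:38:02.294013")
print("lag between copy and first index:", lag)
```

Any query run inside that window would not find the file, which matches what was seen when the issue was raised.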

lbesnard commented 9 years ago

I think we're all good to close this issue @jonescc, aren't we?