aodn / content

Tracks AODN Portal content and configuration issues

GSLA - Different grids for GSLA_NRT files prevents aggregations #427

Open jonescc opened 5 years ago

jonescc commented 5 years ago

Refer to https://github.com/aodn/issues/issues/445

ggalibert commented 5 years ago

@lbesnard could you please have a look and see if you can find out which files are inconsistent? In the year 2012 (http://thredds.aodn.org.au/thredds/catalog/IMOS/OceanCurrent/GSLA/NRT00/2012/catalog.html), LONGITUDE should be of length 641, yet there are a few files with length 640 only. Once they are identified, we could ask Madeleine to re-process them.

lbesnard commented 5 years ago

@ggalibert I ran a script on the 2012 files only.

Here are the affected files:

IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121022T000000Z_GSLA_FV02_NRT00_C-20121028T220021Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121023T000000Z_GSLA_FV02_NRT00_C-20121030T022345Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121024T000000Z_GSLA_FV02_NRT00_C-20121030T023043Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121025T000000Z_GSLA_FV02_NRT00_C-20121029T220054Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121026T000000Z_GSLA_FV02_NRT00_C-20121030T215945Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121027T000000Z_GSLA_FV02_NRT00_C-20121031T220011Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121028T000000Z_GSLA_FV02_NRT00_C-20121101T220002Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121029T000000Z_GSLA_FV02_NRT00_C-20121109T014702Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121030T000000Z_GSLA_FV02_NRT00_C-20121109T015352Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121031T000000Z_GSLA_FV02_NRT00_C-20121109T020132Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121101T000000Z_GSLA_FV02_NRT00_C-20121109T020815Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121102T000000Z_GSLA_FV02_NRT00_C-20121109T021506Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121103T000000Z_GSLA_FV02_NRT00_C-20121107T215955Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121104T000000Z_GSLA_FV02_NRT00_C-20121108T220044Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121105T000000Z_GSLA_FV02_NRT00_C-20121109T220024Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121106T000000Z_GSLA_FV02_NRT00_C-20121110T220028Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121107T000000Z_GSLA_FV02_NRT00_C-20121111T220115Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121108T000000Z_GSLA_FV02_NRT00_C-20121112T220149Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121109T000000Z_GSLA_FV02_NRT00_C-20121113T220103Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121110T000000Z_GSLA_FV02_NRT00_C-20121114T220122Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121111T000000Z_GSLA_FV02_NRT00_C-20121115T220134Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121112T000000Z_GSLA_FV02_NRT00_C-20121116T220229Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121113T000000Z_GSLA_FV02_NRT00_C-20121117T220120Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121114T000000Z_GSLA_FV02_NRT00_C-20121118T220144Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121115T000000Z_GSLA_FV02_NRT00_C-20121120T220237Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121117T000000Z_GSLA_FV02_NRT00_C-20121121T220126Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121118T000000Z_GSLA_FV02_NRT00_C-20121122T220110Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121119T000000Z_GSLA_FV02_NRT00_C-20121123T220214Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121120T000000Z_GSLA_FV02_NRT00_C-20121124T220352Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121121T000000Z_GSLA_FV02_NRT00_C-20121125T220359Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121122T000000Z_GSLA_FV02_NRT00_C-20121126T220042Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121123T000000Z_GSLA_FV02_NRT00_C-20121127T220045Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121124T000000Z_GSLA_FV02_NRT00_C-20121128T220047Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121125T000000Z_GSLA_FV02_NRT00_C-20121129T220011Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121126T000000Z_GSLA_FV02_NRT00_C-20121130T220107Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121127T000000Z_GSLA_FV02_NRT00_C-20121202T235003Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121128T000000Z_GSLA_FV02_NRT00_C-20121202T234838Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121129T000000Z_GSLA_FV02_NRT00_C-20121203T220142Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121130T000000Z_GSLA_FV02_NRT00_C-20121204T220100Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121201T000000Z_GSLA_FV02_NRT00_C-20121205T220235Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121202T000000Z_GSLA_FV02_NRT00_C-20121206T220133Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121203T234131Z_GSLA_FV02_NRT00_C-20121210T063216Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121204T235941Z_GSLA_FV02_NRT00_C-20121210T221139Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121205T000000Z_GSLA_FV02_NRT00_C-20121211T003204Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121206T000000Z_GSLA_FV02_NRT00_C-20121210T220209Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121207T000000Z_GSLA_FV02_NRT00_C-20121211T220202Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121208T000000Z_GSLA_FV02_NRT00_C-20121213T220247Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121209T000000Z_GSLA_FV02_NRT00_C-20121214T221038Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121210T000000Z_GSLA_FV02_NRT00_C-20121214T220125Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121211T000000Z_GSLA_FV02_NRT00_C-20121216T013846Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121212T000000Z_GSLA_FV02_NRT00_C-20121216T220030Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121213T000000Z_GSLA_FV02_NRT00_C-20121217T220038Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121214T000000Z_GSLA_FV02_NRT00_C-20121218T220035Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121215T000000Z_GSLA_FV02_NRT00_C-20121219T220055Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121216T000000Z_GSLA_FV02_NRT00_C-20121220T220036Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121217T000000Z_GSLA_FV02_NRT00_C-20121221T220217Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121218T000000Z_GSLA_FV02_NRT00_C-20121222T220114Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121219T000000Z_GSLA_FV02_NRT00_C-20121223T220116Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121220T000000Z_GSLA_FV02_NRT00_C-20121224T220031Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121221T000000Z_GSLA_FV02_NRT00_C-20121225T220040Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121222T000000Z_GSLA_FV02_NRT00_C-20121226T220037Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121223T000000Z_GSLA_FV02_NRT00_C-20121227T220010Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121224T000000Z_GSLA_FV02_NRT00_C-20121228T220026Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121225T000000Z_GSLA_FV02_NRT00_C-20121229T215955Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121226T000000Z_GSLA_FV02_NRT00_C-20121230T215957Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121227T000000Z_GSLA_FV02_NRT00_C-20121231T220024Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121228T000000Z_GSLA_FV02_NRT00_C-20130101T215950Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121229T000000Z_GSLA_FV02_NRT00_C-20130102T215952Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121230T000000Z_GSLA_FV02_NRT00_C-20130104T220841Z.nc.gz
IMOS/OceanCurrent/GSLA/NRT00/2012/IMOS_OceanCurrent_HV_20121231T000000Z_GSLA_FV02_NRT00_C-20130104T220031Z.nc.gz

ggalibert commented 5 years ago

Please check all years/all files for length(LONGITUDE) != 641 or length(LATITUDE) != 351, just in case, and advise Madeleine of the problem with a full list of inconsistent files.

lbesnard commented 5 years ago

For reference, I ran

s3fs imos-data imos-data/ -o public_bucket=1,umask=000

and then the following (.nc.gz files create some complications with xarray and s3fs within Python):

from pathlib import Path

import xarray as xr

# Local s3fs mount point of the imos-data bucket, under the home directory.
mypath = Path.home() / 'imos-data/IMOS/OceanCurrent/GSLA/NRT00/'

gsla_files = sorted(mypath.rglob("*.nc.gz"))

# The check assumes UCUR is indexed as (TIME, LATITUDE, LONGITUDE),
# so shape[1] should be 351 and shape[2] should be 641.
with open('/tmp/bad.txt', 'a') as out:
    for f in gsla_files:
        # engine='scipy' is used because the files are gzipped (.nc.gz).
        with xr.open_dataset(str(f), engine='scipy') as ds:
            if ds.UCUR.shape[2] != 641 or ds.UCUR.shape[1] != 351:
                out.write(f.as_posix() + "\n")

The good news is that I couldn't find any other bad data outside of the year 2012.

ocehugo commented 4 years ago

Given that the data is regular and versioned (NRT00), it should have to pass a stricter schema before being accepted into the pipeline. This way the operator is made aware of any inconsistency automatically when uploading and can fix it in place.
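A minimal sketch of what such a check could look like, assuming the only constraints are the two grid lengths quoted earlier in this thread (LONGITUDE = 641, LATITUDE = 351). The names EXPECTED_SIZES and grid_violations are hypothetical and not part of any existing pipeline code. The function takes a plain mapping of dimension name to length, so with xarray one could pass ds.sizes.

```python
from typing import List, Mapping

# Expected grid for GSLA NRT00 files, using the lengths quoted in this thread.
EXPECTED_SIZES = {"LONGITUDE": 641, "LATITUDE": 351}


def grid_violations(sizes: Mapping[str, int],
                    expected: Mapping[str, int] = EXPECTED_SIZES) -> List[str]:
    """Return one message per dimension that is missing or has the wrong length."""
    problems = []
    for name, want in expected.items():
        got = sizes.get(name)
        if got is None:
            problems.append(f"{name}: missing (expected {want})")
        elif got != want:
            problems.append(f"{name}: length {got} (expected {want})")
    return problems


# Example with a plain dict standing in for the dimension sizes of one file.
print(grid_violations({"TIME": 1, "LATITUDE": 351, "LONGITUDE": 640}))
```

An empty list means the file matches the expected grid; anything else could be reported back to the operator at upload time instead of being discovered later during aggregation.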

ggalibert commented 4 years ago

Response from Madeleine:

Now that I am updating the DM GSLA about 30 days behind real time there is no reason to keep the NRT in year files or as a data resource. They are incomplete in comparison to the DM files because of the delay in getting some of the incoming data. I think we should get rid of the NRT year files and all NRT files prior to 2019.

lbesnard commented 2 years ago

@ggalibert shall we close this issue and remove the NRT files prior to a certain date (to be decided), since the data is available in one of the DM collections?
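If it helps, here is a rough sketch of how the candidate files could be listed once a cutoff is agreed. It relies only on the file name pattern visible in the list above (the first timestamp after _HV_ is taken as the data time); the function name nrt_files_before and the cutoff value are placeholders, and nothing is deleted by this code.

```python
import re
from datetime import datetime, timezone
from typing import Iterable, List

# Data timestamp is the first date field in the file name, e.g.
# IMOS_OceanCurrent_HV_20121022T000000Z_GSLA_FV02_NRT00_C-20121028T220021Z.nc.gz
_DATA_TIME = re.compile(r"_HV_(\d{8}T\d{6})Z_GSLA_FV02_NRT00_")


def nrt_files_before(paths: Iterable[str], cutoff: datetime) -> List[str]:
    """Return the paths whose data timestamp is strictly earlier than cutoff."""
    selected = []
    for path in paths:
        match = _DATA_TIME.search(path)
        if match is None:
            continue  # not a GSLA NRT00 file name, so it is left alone
        stamp = datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")
        stamp = stamp.replace(tzinfo=timezone.utc)  # the trailing Z means UTC
        if stamp < cutoff:
            selected.append(path)
    return selected


# Placeholder cutoff only; the actual date is still to be decided.
cutoff = datetime(2019, 1, 1, tzinfo=timezone.utc)
```

The returned list could be reviewed before any removal is actually carried out.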