FIDUCEO / FCDR_HIRS

fragmentated into too many pieces #155

Open gerritholl opened 6 years ago

gerritholl commented 6 years ago

There is a bug somewhere in the code that splits the data into equator-to-equator pieces: it fragments the data into far too many of them. For example, consider NOAA-12 on 1994-01-01, where the directory contains 100 orbit files instead of the usual 14-15. See /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.7/easy/noaa12/1994/01/01. I suspect this may be a knock-on effect of geolocation issues.

gerritholl commented 6 years ago

See also #162

gerritholl commented 6 years ago

See also #170.

gerritholl commented 6 years ago

This is a consequence of clearly wrong geolocation: latitude outliers of -34.14 appear in the middle of runs of scanlines that are clearly nowhere near latitude -34.14 (in the example below, on every 20th scanline). Fixing #235 would prevent this problem from occurring, but I may need to rethink my splitting algorithm before I get around to redoing the geolocation.

In : piece["lat"].sel(scanpos=28).isel(time=slice(1120,1300))
<xarray.DataArray 'lat' (time: 180)>
array([ 76.117188,  76.40625 ,  76.695312,  76.976562,  77.257812,  77.523438,
        77.789062,  78.054688,  78.304688,  78.554688,  78.796875,  79.03125 ,
        79.25    , -34.140625,  79.695312,  79.890625,  80.078125,  80.25    ,
        80.414062,  80.570312,  80.703125,  80.828125,  80.945312,  81.039062,
        81.125   ,  81.1875  ,  81.242188,  81.273438,  81.296875,  81.296875,
        81.28125 ,  81.257812,  81.210938, -34.140625,  81.070312,  80.976562,
        80.875   ,  80.75    ,  80.617188,  80.46875 ,  80.3125  ,  80.140625,
        79.960938,  79.765625,  79.5625  ,  79.351562,  79.132812,  78.898438,
        78.664062,  78.421875,  78.171875,  77.914062,  77.648438, -34.140625,
        77.101562,  76.820312,  76.53125 ,  76.25    ,  75.953125,  75.65625 ,
        75.359375,  75.054688,  74.742188,  74.4375  ,  74.125   ,  73.804688,
        73.484375,  73.164062,  72.84375 ,  72.515625,  72.195312,  71.859375,
        71.53125 , -34.140625,  70.867188,  70.53125 ,  70.1875  ,  69.851562,
        69.515625,  69.171875,  68.828125,  68.484375,  68.140625,  67.796875,
        67.445312,  67.101562,  66.75    ,  66.40625 ,  66.054688,  65.703125,
        65.351562,  65.      ,  64.648438, -34.140625,  63.9375  ,  63.578125,
        63.226562,  62.867188,  62.515625,  62.15625 ,  61.796875,  61.4375  ,
        61.078125,  60.71875 ,  60.359375,  60.      ,  59.640625,  59.28125 ,
        58.914062,  58.554688,  58.195312,  57.828125,  57.46875 , -34.140625,
        56.742188,  56.375   ,  56.015625,  55.648438,  55.28125 ,  54.921875,
        54.554688,  54.1875  ,  53.820312,  53.453125,  53.085938,  52.726562,
        52.359375,  51.992188,  51.625   ,  51.25    ,  50.882812,  50.515625,
        50.148438, -34.140625,  49.414062,  49.046875,  48.679688,  48.3125  ,
        47.9375  ,  47.570312,  47.203125,  46.828125,  46.460938,  46.09375 ,
        45.71875 ,  45.351562,  44.976562,  44.609375,  44.234375,  43.867188,
        43.492188,  43.125   ,  42.75    , -34.140625,  42.015625,  41.648438,
        41.273438,  40.898438,  40.53125 ,  40.15625 ,  39.78125 ,  39.414062,
        39.039062,  38.664062,  38.289062,  37.921875,  37.546875,  37.171875,
        36.796875,  36.421875,  36.054688,  35.679688,  35.304688, -34.140625,
        34.570312,  34.195312,  33.820312,  33.445312,  33.070312,  32.695312])
Coordinates:
    scanpos   int64 28
  * time      (time) datetime64[ns] 1994-01-01T00:01:23.328000 ...
    scanline  (time) int64 6745 6746 6747 6748 6749 6750 6751 6752 6753 6754 ...
    lon       (time) float64 -106.7 -107.8 -108.8 -110.0 -111.1 -112.4 ...
    lat       (time) float64 76.12 76.41 76.7 76.98 77.26 77.52 77.79 78.05 ...
Attributes:
    long_name:          latitude
    units:              degrees_north
    valid_range:        [-90, 90]
    source:             copied from NOAA native L1B format
    original_name_l1b:  lat
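
A minimal sketch of how such isolated outliers could be masked before splitting. It assumes the splitter starts a new piece wherever latitude changes sign, which is an assumption on my part since the splitting code is not shown here. The helper names `flag_lat_outliers` and `count_sign_changes`, the 5-scanline window and the 5-degree threshold are all hypothetical. The sample values are ten consecutive latitudes taken from the output above.

```python
from statistics import median


def flag_lat_outliers(lat, window=5, max_dev=5.0):
    """Flag values more than max_dev degrees away from the local median.

    The window is centred on each value and truncated at the ends.
    """
    half = window // 2
    flags = []
    for i, value in enumerate(lat):
        lo = max(0, i - half)
        hi = min(len(lat), i + half + 1)
        flags.append(abs(value - median(lat[lo:hi])) > max_dev)
    return flags


def count_sign_changes(lat):
    """Count sign changes between consecutive non-zero latitudes."""
    signs = [value > 0 for value in lat if value != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))


# Ten consecutive latitudes from the session above, including one outlier.
lat = [78.304688, 78.554688, 78.796875, 79.03125, 79.25,
       -34.140625, 79.695312, 79.890625, 80.078125, 80.25]

bad = flag_lat_outliers(lat)
cleaned = [value for value, is_bad in zip(lat, bad) if not is_bad]

print(count_sign_changes(lat))      # sign changes with the outlier present
print(count_sign_changes(cleaned))  # sign changes after masking
```

With the outlier present, the naive count sees two sign changes near 79 degrees north; after masking there are none. A running median leaves genuine equator crossings alone because neighbouring scanlines differ by well under a degree there, but the window and threshold would need tuning against real data.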
gerritholl commented 6 years ago

/group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/debug/noaa14/1995/03/02/ contains 694 files.

gerritholl commented 6 years ago

/group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa11/1991/04/14/ contains 1755 files.

gerritholl commented 6 years ago

To find the directories containing the most files (for example, the top 30):

grep '/group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy' filelist_20180816 | grep -E '(/.*){12,}' | sed 's|/[^/]*$|/|' | sort | uniq -c | sort -rn | head -n 30

where filelist_20180816 is created with

find /group_workspaces/cems2/fiduceo/ -xdev > filelist_20180816
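
The same tally can be sketched in Python, which may be easier to adapt than the shell pipeline. This is only a rough equivalent: the function name `top_directories` and the sample file names are hypothetical, and `min_slashes=12` mirrors the `(/.*){12,}` pattern, which keeps lines containing at least 12 slashes.

```python
from collections import Counter
import posixpath


def top_directories(paths, prefix, n=30, min_slashes=12):
    """Count entries per parent directory and return the n largest."""
    counts = Counter()
    for path in paths:
        path = path.rstrip("\n")
        # Like the unanchored grep: keep lines containing the prefix.
        if prefix not in path:
            continue
        # Like the (/.*){12,} pattern: require at least min_slashes slashes.
        if path.count("/") < min_slashes:
            continue
        # Like the sed step: drop the last path component.
        counts[posixpath.dirname(path) + "/"] += 1
    return counts.most_common(n)


prefix = "/group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy"
# Hypothetical file names, for illustration only.
sample = [
    f"{prefix}/noaa11/1991/04/14/a.nc",
    f"{prefix}/noaa11/1991/04/14/b.nc",
    f"{prefix}/noaa11/1991/04/14/c.nc",
    f"{prefix}/noaa12/1994/01/01/a.nc",
    "/group_workspaces/cems2/fiduceo/other/file.txt",
]

top = top_directories(sample, prefix, n=2)
for directory, count in top:
    print(count, directory)
```

For the real file list, the lines of filelist_20180816 can be passed in directly, for example `top_directories(open("filelist_20180816"), prefix)`.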
gerritholl commented 6 years ago

There are 35 days with more than 15 orbit files, including 15 with 20 or more and 6 with more than 100.

It happens only for NOAA-14 and earlier satellites.

 1     1755 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa11/1991/04/14/
 2      694 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa14/1995/03/02/
 3      386 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/tirosn/1980/01/13/
 4      331 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa11/1991/04/15/
 5      164 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1980/11/24/
 6      114 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa12/1994/03/02/
 7       98 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa12/1994/01/01/
 8       38 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa11/1988/12/15/
 9       29 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa09/1985/04/09/
10       25 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1980/03/03/
11       23 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa07/1983/11/30/
12       22 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa08/1985/10/02/
13       22 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa07/1981/12/31/
14       20 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa08/1983/06/05/
15       20 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1980/05/01/
16       18 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa10/1991/04/17/
17       18 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa08/1984/02/23/
18       18 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1982/01/28/
19       18 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1980/10/18/
20       17 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa09/1986/06/01/
21       17 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa07/1982/07/09/
22       17 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1982/08/15/
23       17 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1980/03/18/
24       17 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1979/07/07/
25       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/tirosn/1979/10/16/
26       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa10/1989/07/05/
27       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa08/1984/02/20/
28       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa08/1984/02/19/
29       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa08/1983/11/20/
30       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa07/1984/08/04/
31       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa07/1983/04/27/
32       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa07/1982/06/23/
33       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1982/07/01/
34       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1979/08/04/
35       16 /group_workspaces/cems2/fiduceo/Data/FCDR/HIRS/v0.8pre/easy/noaa06/1979/08/01/
gerritholl commented 6 years ago

This compares with 135159 satellite-days in total, so it affects just 0.026% of days (35 out of 135159).
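
As a quick cross-check of these figures, the file counts from the listing above can be tallied directly. The counts are copied from that listing and the total of 135159 satellite-days is taken as given; the variable names are only for illustration.

```python
# File counts per day, copied from the 35-entry listing above.
counts = [
    1755, 694, 386, 331, 164, 114, 98, 38, 29, 25, 23, 22, 22, 20, 20,
    18, 18, 18, 18,
    17, 17, 17, 17, 17,
    16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
]
total_satellite_days = 135159  # taken as given from the comment above

over_15 = sum(c > 15 for c in counts)
at_least_20 = sum(c >= 20 for c in counts)
over_20 = sum(c > 20 for c in counts)
over_100 = sum(c > 100 for c in counts)
share_percent = 100 * over_15 / total_satellite_days

print(over_15, at_least_20, over_20, over_100)
print(f"{share_percent:.3f}%")
```

This gives 35 days above 15 files, 15 days with 20 or more (13 strictly above 20), 6 days above 100, and a share of about 0.026%.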

gerritholl commented 6 years ago

Conclusion: for now, this needs to be taken care of as part of #274.