glamod / glamod-ingest

Database preparation and ingestion for GLAMOD
BSD 2-Clause "Simplified" License

Update "constraints" package so that it sets constraints for day of month based on the year and month. #17

Closed agstephens closed 3 years ago

agstephens commented 4 years ago

First approach:

  1. Take the counts file
  2. Expand/update the counts by adding a line per day (for each year and month)
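A minimal sketch of the expansion step, assuming a hypothetical counts layout of (year, month, count) rows; the real counts file format is not shown in this thread. Carrying the monthly count onto each day unchanged is only a placeholder.

```python
import calendar


def expand_counts_by_day(rows):
    """Yield one (year, month, day, count) tuple per calendar day.

    `rows` is an iterable of (year, month, count) tuples. This layout is
    an assumption made for illustration, not the actual counts format.
    """
    for year, month, count in rows:
        # monthrange returns (weekday of the 1st, number of days in month)
        n_days = calendar.monthrange(year, month)[1]
        for day in range(1, n_days + 1):
            # Placeholder: the monthly count is copied to every day.
            yield (year, month, day, count)


rows = [(1900, 2, 10), (2000, 2, 5)]
expanded = list(expand_counts_by_day(rows))
```

Each monthly row becomes roughly 30 rows, so the input to the minimiser grows by the same factor.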

agstephens commented 4 years ago

I have implemented a script:

https://github.com/glamod/glamod-cds-forms/tree/master/constraints/add-days-to-counts.py

This writes counts files that include all the days for each month.

Next, need to re-run the minimiser script.

agstephens commented 4 years ago

That approach failed: the Python process ran for days trying to minimise the constraints and used more than 4 GB of memory.

agstephens commented 4 years ago

Second approach:

  1. Parse the final constraints file.
  2. For each dictionary in the constraints:
    • generate a list of all year/month combinations
    • group them by the number of days in the month
  3. For each group member:
    • create a new entry that also includes:
      • day: [01...]
      • hour: [01...23]
  4. Write them all to a new JSON file.
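The steps above could be sketched roughly as follows. This is an illustration, not the prototype script: the function name `expand_constraint`, the key names ("year", "month", "day", "hour") and the string-list values are assumptions about the constraints format. The hour list here runs from 00 to 23, which is also an assumption, since the outline above writes [01...23].

```python
import calendar
import json
from collections import defaultdict


def expand_constraint(constraint):
    """Split one constraint dict into per-month entries with day and hour lists."""
    # Group every year/month combination by the number of days in that month.
    groups = defaultdict(list)
    for year in constraint["year"]:
        for month in constraint["month"]:
            n_days = calendar.monthrange(int(year), int(month))[1]
            groups[n_days].append((year, month))

    entries = []
    for n_days, members in sorted(groups.items()):
        days = ["%02d" % d for d in range(1, n_days + 1)]
        hours = ["%02d" % h for h in range(24)]  # assumed range 00..23
        for year, month in members:
            entry = dict(constraint)  # keep all other keys unchanged
            entry.update(year=[year], month=[month], day=days, hour=hours)
            entries.append(entry)
    return entries


# Hypothetical constraint, for illustration only.
constraint = {
    "variable": ["air_temperature"],
    "year": ["1900", "2000"],
    "month": ["02", "04"],
}
entries = expand_constraint(constraint)
as_json = json.dumps(entries, indent=2)  # would be written to the new JSON file
```

Because this works on the already minimised constraints, it avoids re-running the minimiser on a much larger day-level input.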

agstephens commented 4 years ago

Prototyping in:

/usr/local/glamod-cds-forms/constraints/expand-constraints-by-day-hour.py

agstephens commented 3 years ago

Latest actions, all combined here for now:

>>> y = 1700
>>> datetime.datetime.strptime('{}-{}-{} {}'.format(y, 1, 1, 0), '%Y-%m-%d %H').astimezone(UTC)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: offset must be a timedelta representing a whole number of minutes, not datetime.timedelta(-1, 86325).
>>> y=1850
>>> datetime.datetime.strptime('{}-{}-{} {}'.format(y, 1, 1, 0), '%Y-%m-%d %H').astimezone(UTC)
datetime.datetime(1850, 1, 1, 0, 0, tzinfo=datetime.timezone.utc)
>>> datetime.datetime.strptime('{}-{}-{} {}'.format(y, 1, 1, 0), '%Y-%m-%d %H')
datetime.datetime(1850, 1, 1, 0, 0)
>>> datetime.datetime.strptime('{}-{}-{} {}'.format(1800, 1, 1, 0), '%Y-%m-%d %H')
datetime.datetime(1800, 1, 1, 0, 0)
>>> datetime.datetime.strptime('{}-{}-{} {}'.format(1700, 1, 1, 0), '%Y-%m-%d %H')
datetime.datetime(1700, 1, 1, 0, 0)
>>> datetime.datetime.strptime('{}-{}-{} {}'.format(y, 1, 1, 0), '%Y-%m-%d %H').astimezone(UTC)
datetime.datetime(1850, 1, 1, 0, 0, tzinfo=datetime.timezone.utc)
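The failure for 1700 appears to come from `astimezone` treating the naive datetime as local time: the offset in the error, `timedelta(-1, 86325)`, is -75 seconds, which is not a whole number of minutes. One possible workaround, sketched below, is to attach UTC directly with `replace(tzinfo=...)` so that no local-time conversion happens. This assumes the parsed values are already in UTC, and it is not necessarily what the later commit does; `parse_utc` is a hypothetical helper name.

```python
import datetime


def parse_utc(year, month, day, hour):
    """Parse the components and label the result as UTC without converting."""
    naive = datetime.datetime.strptime(
        "{}-{}-{} {}".format(year, month, day, hour), "%Y-%m-%d %H"
    )
    # replace() only attaches the tzinfo; unlike astimezone(), it never
    # consults the local time zone, so historical offsets are not involved.
    return naive.replace(tzinfo=datetime.timezone.utc)


early = parse_utc(1700, 1, 1, 0)
later = parse_utc(1850, 1, 1, 0)
```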

agstephens commented 3 years ago

This has been improved/fixed in commit: https://github.com/glamod/glamod-cds-forms/commit/dd849253826e7bd55dd9edf938b8f5166175f0b4