MatthewFlamm / pynws

⛈️ A python library to asynchronously retrieve weather observation from NWS/NOAA
MIT License

Retrieve any forecast layer for arbitrary datetimes #66

Closed lymanepp closed 2 years ago

lymanepp commented 2 years ago

Here's my initial prototype of supporting retrieval of any forecast layer for arbitrary datetimes. The get_forecast_all() method should be changed to return a Forecast instance.

import aiohttp
import asyncio
import pynws
from datetime import datetime
from scipy.interpolate import interp1d

COORD = (30.022979, -84.982518)
USERID = "president@whitehouse.gov"

class Forecast:
    def __init__(self, data, interpolation="linear"):
        if not isinstance(data, dict):
            raise ValueError("'data' must be a dictionary")

        self._data = data
        self._interpolation = interpolation
        self._functions = {}

    @staticmethod
    def _parse_iso_8601_time(time_string):
        return datetime.fromisoformat(time_string.split("/")[0])

    def _make_function(self, layer):
        values = self._data[layer]["values"]

        if not isinstance(values, list):
            raise ValueError(f"'{layer}' is not a valid layer")

        x = [self._parse_iso_8601_time(value["validTime"]).timestamp() for value in values]
        y = [float(value["value"]) for value in values]

        return interp1d(x, y, kind=self._interpolation, copy=False, assume_sorted=True)

    def _get_function(self, layer):
        if layer in self._functions:
            return self._functions[layer]

        func = self._functions[layer] = self._make_function(layer)
        return func

    def _get_value(self, layer, when):
        if not isinstance(when, datetime):
            raise ValueError("'when' must be a datetime")

        value = self._get_function(layer)(when.timestamp())
        return round(float(value), 2)

    def get_temperature(self, when):
        return self._get_value("temperature", when)

    def get_dewpoint(self, when):
        return self._get_value("dewpoint", when)

    def get_humidity(self, when):
        return self._get_value("relativeHumidity", when)

async def example():
    async with aiohttp.ClientSession() as session:
        nws = pynws.Nws(session, USERID, latlon=COORD)
        forecast_data = await nws.get_forecast_all()

    now = datetime.now()
    for kind in ("linear", "zero", "slinear", "quadratic", "cubic"):
        forecast = Forecast(forecast_data, interpolation=kind)
        temperature = forecast.get_temperature(now)
        dewpoint = forecast.get_dewpoint(now)
        humidity = forecast.get_humidity(now)

        print(now, kind, temperature, dewpoint, humidity)

# asyncio.run() creates and closes the event loop itself (Python 3.7+).
asyncio.run(example())

lymanepp commented 2 years ago

Numpy could be used as a fallback (linear interpolation only) when scipy isn't installed.
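
A minimal sketch of that fallback, assuming a hypothetical helper named `interpolate` (it is not part of pynws): scipy is used when it can be imported, and numpy's linear-only `interp` is used otherwise.

```python
try:
    # Preferred backend: scipy supports several interpolation kinds.
    from scipy.interpolate import interp1d
except ImportError:
    interp1d = None  # scipy is optional in this sketch


def interpolate(x, xp, fp, kind="linear"):
    """Interpolate the series (xp, fp) at x. Hypothetical helper, not pynws API."""
    if interp1d is not None:
        return float(interp1d(xp, fp, kind=kind, assume_sorted=True)(x))

    if kind != "linear":
        raise ImportError(f"interpolation kind '{kind}' requires scipy")

    # Fallback: numpy.interp only performs linear interpolation.
    from numpy import interp

    return float(interp(x, xp, fp))
```

One difference to keep in mind: with default settings, `interp1d` raises `ValueError` for points outside the data range, while `numpy.interp` clamps to the first or last value.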

lymanepp commented 2 years ago

Here's a version that uses numpy instead of scipy. That would be better for use in Home Assistant as numpy is already used by HA core.

import aiohttp
import asyncio
import pynws
from datetime import datetime
from numpy import interp

COORD = (30.022979, -84.982518)
USERID = "president@whitehouse.gov"

class Forecast:
    def __init__(self, data):
        if not isinstance(data, dict):
            raise ValueError("'data' must be a dictionary")

        self._data = data
        self._layer_values = {}

    @staticmethod
    def _parse_iso_8601_time(time_string):
        return datetime.fromisoformat(time_string.split("/")[0])

    def _format_layer_values(self, layer):
        values = self._data[layer]["values"]

        if not isinstance(values, list):
            raise ValueError(f"'{layer}' is not a valid layer")

        xp = [self._parse_iso_8601_time(value["validTime"]).timestamp() for value in values]
        fp = [float(value["value"]) for value in values]

        return (xp, fp)

    def _get_layer_values(self, layer):
        if layer not in self._layer_values:
            self._layer_values[layer] = self._format_layer_values(layer)

        return self._layer_values[layer]

    def _get_value(self, layer, when):
        if not isinstance(when, datetime):
            raise ValueError("'when' must be a datetime")

        xp, fp = self._get_layer_values(layer)
        # Pass a scalar so interp() returns a scalar rather than a 1-element array.
        value = interp(when.timestamp(), xp, fp)
        return round(float(value), 2)

    def get_temperature(self, when):
        return self._get_value("temperature", when)

    def get_dewpoint(self, when):
        return self._get_value("dewpoint", when)

    def get_humidity(self, when):
        return self._get_value("relativeHumidity", when)

async def example():
    async with aiohttp.ClientSession() as session:
        nws = pynws.Nws(session, USERID, latlon=COORD)
        forecast_data = await nws.get_forecast_all()

    now = datetime.now()
    forecast = Forecast(forecast_data)
    temperature = forecast.get_temperature(now)
    dewpoint = forecast.get_dewpoint(now)
    humidity = forecast.get_humidity(now)

    print(now, temperature, dewpoint, humidity)

# asyncio.run() creates and closes the event loop itself (Python 3.7+).
asyncio.run(example())

MatthewFlamm commented 2 years ago

I'm not sure that we need to interpolate, or at least I don't think we should require these libraries as dependencies. This page https://weather-gov.github.io/api/gridpoints says that each timestamp includes both the start time and how long the value is valid for. We could just return the value if the requested time falls within the interval.

lymanepp commented 2 years ago

Hmm, I've always used interpolation but maybe I'm doing it wrong. It's simpler if interpolation is removed.

MatthewFlamm commented 2 years ago

However, I'm not saying it isn't useful, just that I'm not sure it makes sense to make it a hard dependency for this library.

MatthewFlamm commented 2 years ago

There is also the possibility of allowing it as an optional dependency, which might make sense.

Edit: It may be as simple as making interpolation=None the default and just returning the value for the enclosing interval, then raising ImportError if the user requests interpolation but does not have those libraries installed.
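
A rough sketch of that idea, under a few assumptions: `value_at` is a hypothetical function name, `intervals` is a hypothetical list of `(start, end, value)` tuples sorted by start time, and only `"linear"` interpolation (via numpy) is offered.

```python
from datetime import datetime, timedelta, timezone


def value_at(intervals, when, interpolation=None):
    """Return the forecast value at `when`. Hypothetical helper, not pynws API."""
    if interpolation is None:
        # Default path: no third-party imports, just find the enclosing interval.
        for start, end, value in intervals:
            if start <= when < end:
                return value
        raise IndexError(f"{when} is not available in this forecast")

    if interpolation != "linear":
        raise ValueError(f"unsupported interpolation: {interpolation!r}")

    try:
        # Imported lazily so numpy stays an optional dependency.
        from numpy import interp
    except ImportError as err:
        raise ImportError("interpolation requires numpy to be installed") from err

    xp = [start.timestamp() for start, _end, _value in intervals]
    fp = [value for _start, _end, value in intervals]
    return float(interp(when.timestamp(), xp, fp))


# Illustrative data only: two one-hour intervals.
noon = datetime(2021, 3, 1, 12, tzinfo=timezone.utc)
example_intervals = [
    (noon, noon + timedelta(hours=1), 10.0),
    (noon + timedelta(hours=1), noon + timedelta(hours=2), 20.0),
]
print(value_at(example_intervals, noon + timedelta(minutes=30)))
```

With this layout the library itself has no hard dependency on numpy or scipy; users only need them if they opt in to interpolation.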

lymanepp commented 2 years ago

Here's another prototype that just uses the ranges.

import aiohttp
import asyncio
import pynws
import re
from datetime import datetime, timedelta, timezone

COORD = (30.022979, -84.982518)
USERID = "president@whitehouse.gov"

ISO8601_PERIOD_REGEX = re.compile(
    r"^P"
    r"((?P<weeks>\d+)W)?"
    r"((?P<days>\d+)D)?"
    r"((?:T)"
        r"((?P<hours>\d+)H)?"
        r"((?P<minutes>\d+)M)?"
        r"((?P<seconds>\d+)S)?"
    r")?$"
)

class Forecast:
    def __init__(self, data):
        if not isinstance(data, dict):
            raise ValueError("'data' must be a dictionary")

        self._raw_data = data
        self._layers = {}

    @staticmethod
    def _parse_duration(duration_str):
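        match = ISO8601_PERIOD_REGEX.match(duration_str)
        if match is None:
            raise ValueError(f"'{duration_str}' is not a valid ISO 8601 duration")

        # Components missing from the duration string default to zero.
        groups = {key: int(val or "0") for key, val in match.groupdict().items()}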

        return timedelta(
            weeks=groups["weeks"],
            days=groups["days"],
            hours=groups["hours"],
            minutes=groups["minutes"],
            seconds=groups["seconds"],
        )

    def _get_layer_values(self, layer):
        if layer in self._layers:
            return self._layers[layer]

        raw_layer = self._raw_data[layer]
        layer_values = []

        for value in raw_layer["values"]:
            isodatetime, duration_str = value["validTime"].split("/")
            start_time = datetime.fromisoformat(isodatetime)
            end_time = start_time + self._parse_duration(duration_str)
            layer_values.append((start_time, end_time, float(value["value"])))

        retval = self._layers[layer] = (layer_values, raw_layer["uom"])
        return retval

    def _get_value(self, layer, when):
        if not isinstance(when, datetime):
            raise ValueError("'when' must be a datetime")

        when = when.astimezone(timezone.utc)
        layer_values, units = self._get_layer_values(layer)

        for start_time, end_time, value in layer_values:
            if start_time <= when < end_time:
                # TODO: convert value to metric/imperial units (configurable) instead of exposing
                return (value, units)

        raise IndexError(f"{when} is not available in this forecast")

    def get_temperature(self, when):
        return self._get_value("temperature", when)

    def get_dewpoint(self, when):
        return self._get_value("dewpoint", when)

    def get_humidity(self, when):
        return self._get_value("relativeHumidity", when)

async def example():
    async with aiohttp.ClientSession() as session:
        nws = pynws.Nws(session, USERID, latlon=COORD)
        forecast_data = await nws.get_forecast_all()

    now = datetime.now()
    forecast = Forecast(forecast_data)
    temperature = forecast.get_temperature(now)
    dewpoint = forecast.get_dewpoint(now)
    humidity = forecast.get_humidity(now)

    print(now, temperature, dewpoint, humidity)

# asyncio.run() creates and closes the event loop itself (Python 3.7+).
asyncio.run(example())

lymanepp commented 2 years ago

Note that I used datetime.fromisoformat(), which is only available in Python 3.7+. However, all earlier versions of Python are no longer supported: https://en.wikipedia.org/wiki/History_of_Python#Table_of_versions

MatthewFlamm commented 2 years ago

> Note that I used datetime.fromisoformat() which is only available in Python 3.7+.

I've started #67 to formally require >=3.7

lymanepp commented 2 years ago

Thanks Matthew!

lymanepp commented 2 years ago

I went ahead and submitted a PR to get things started

lymanepp commented 2 years ago

Hi @MatthewFlamm

Before you release these changes... I'd like to consider the naming of the new things in this PR.

Perhaps this could be better. Maybe like this:

Future API changes could have the following:

Also, could review the names on the Forecast class:

And finally, consider a better name than "layer" to describe forecast elements. That name is closely coupled to the NWS API. And some of the "layers" could be applicable to "SimpleForecast" if/when that is implemented.

Thoughts?

MatthewFlamm commented 2 years ago

I like detailed_forecast and DetailedForecast naming. This should be done prior to the next release.

I like the idea of moving towards daily_forecast and hourly_forecast to differentiate them better. I'm not sure, but it may be best to implement this when introducing a SimpleForecast class, so that major changes to the existing API stay together. Alternatively, the existing method names could be deprecated first.

I don't mind the current method names, but if you have suggestions feel free.

For layers, I'm thinking we may want to have a structure like:

CommonForecastLayer
SimpleForecastLayer(CommonForecastLayer)
DetailedForecastLayer(CommonForecastLayer)

Because forecasts and observations are structured differently, I think we need completely separate Enums for each.
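
One caveat if these become Enums: Python does not allow subclassing an Enum that already defines members, so a shared base can only carry behaviour, not shared layer names. A minimal sketch, assuming the hypothetical names `ForecastLayerBase` and `extending_members_fails`; the member values are the layer keys used in the prototypes above, and `windSpeed` appears only for illustration:

```python
from enum import Enum


class ForecastLayerBase(str, Enum):
    """Member-free base holding shared behaviour only (hypothetical name)."""

    def __str__(self):
        return self.value


class DetailedForecastLayer(ForecastLayerBase):
    """Layer names taken from the prototypes in this thread."""

    TEMPERATURE = "temperature"
    DEWPOINT = "dewpoint"
    RELATIVE_HUMIDITY = "relativeHumidity"


def extending_members_fails():
    """Show that an Enum which already has members cannot be subclassed."""
    try:
        class Extended(DetailedForecastLayer):  # illustrative only
            WIND_SPEED = "windSpeed"
    except TypeError:
        return True
    return False
```

So any layers common to several forecast types would have to be repeated in each Enum, or the common part kept as a member-free base like the one above.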