Unidata / MetPy

MetPy is a collection of tools in Python for reading, visualizing and performing calculations with weather data.
https://unidata.github.io/MetPy/
BSD 3-Clause "New" or "Revised" License
1.23k stars 413 forks source link

Upper Air Parsing #71

Open dopplershift opened 9 years ago

dopplershift commented 9 years ago

Add parsing of TTAA/TTBB messages. This will facilitate replacing the netcdf-perl decoders TDS uses for the (unhosted) upper air netcdf files.

akrherz commented 9 years ago

For what it is worth, I am not aware of a current python alternative.

dopplershift commented 8 years ago

Some useful links:

luciodaou commented 1 year ago

Would really appreciate this. Still no other parser?

dopplershift commented 1 year ago

Nothing's been added to MetPy, but we're always happy to have community contributions.

luciodaou commented 1 year ago

Nothing's been added to MetPy, but we're always happy to have community contributions.

I've been searching for this for the last 2 weeks. I couldn't get nothing to work.

My specific case is that I have acess to a JSON containing TTAA, TTBB, TTCC and TTDD messages of upper air soundings in in plain text, in distinct fields. So it's not the real BUFR file generated on site, it has some sort of processing to be distributed as JSON.

Anyway, I'll keep looking and report back if I find something useful.

I found this link regarding the parsing realized by GEMPAK: https://www.unidata.ucar.edu/support/help/MailArchives/gempak/msg03627.html

luciodaou commented 11 months ago

Hey, giving an update here.

I've been looking into this and still haven't found a way. Even tried using ChatGPT and Google's Bard to develop any kind of code, but without success. A friend of mine that works as a professional programmer managed to code something, but the possibility of invalid data and the many specific rules makes this very difficult.

My understanding is that this 5 character format is useful for Fortran programming, so I'm hoping there's some kind of "official code" somewhere. This week I'm researching GrADS source code, but I'm not very hopeful.

I'm also making contact with some people responsible for a few known meteorology websites providing upper air soundings.

DanielAdriaansen commented 8 months ago

Hey, giving an update here.

I've been looking into this and still haven't found a way. Even tried using ChatGPT and Google's Bard to develop any kind of code, but without success. A friend of mine that works as a professional programmer managed to code something, but the possibility of invalid data and the many specific rules makes this very difficult.

My understanding is that this 5 character format is useful for Fortran programming, so I'm hoping there's some kind of "official code" somewhere. This week I'm researching GrADS source code, but I'm not very hopeful.

I'm also making contact with some people responsible for a few known meteorology websites providing upper air soundings.

You may find this useful: https://github.com/Unidata/gempak/blob/main/gempak/source/bridge/ru/rudecd.f

This looks like the FORTRAN routine that is called by GEMPAK to decode TTAA/TTBB formatted upper-air data.

Here's some GEMPAK documentation: https://unidata.github.io/gempak/man/prog/dcuair.html

And here's the GEMPAK file where "RU_DECD" is called: https://github.com/Unidata/gempak/blob/main/gempak/source/programs/dc/dcuair/dcudcd.f

kgoebber commented 1 month ago

Hi All,

I had some working code that I had worked on a while ago (read 4 years ago), but picked it up again today after a question came to me through a different channel. Below is the code I have to read "real time" TTAA and TTBB data from the NWS to make a plot.

from datetime import datetime, timedelta
from io import StringIO
from bs4 import BeautifulSoup
import re
import urllib.request

from metpy.io import add_station_lat_lon
import numpy as np
import pandas as pd

now = datetime.utcnow()
year = now.year
month = now.month
day = now.day

# Set location and hour (hour only for display purposes)
location = 'SGF'

raw_man_data = f'https://forecast.weather.gov/product.php?site=NWS&product=MAN&issuedby={location}'
raw_sgl_data = f'https://forecast.weather.gov/product.php?site=NWS&product=SGL&issuedby={location}'

man_data = urllib.request.urlopen(raw_man_data).read().decode("utf8")
sgl_data = urllib.request.urlopen(raw_sgl_data).read().decode("utf8")
#fp.close()

#print(mystr)
man_soup = BeautifulSoup(man_data, 'html.parser')
sgl_soup = BeautifulSoup(sgl_data, 'html.parser')

full_data = (man_soup.find('pre').contents[0].replace('\n', ' ').replace('NNNN', '').strip()
             + sgl_soup.find('pre').contents[0].replace('\n', ' ').replace('NNNN', '').strip()).split('=')
#full_data

# Start parsing data
pressure = []
height = []
temperature = []
dewpoint = []
wdir = []
speed = []
date_time = []
stnum = []
stid = []

df2 = pd.DataFrame(columns = ['pressure', 'temperature', 'dewpoint', 'height',
                              'wind_speed', 'wind_direction', 'station', 'station_number',
                              'date_time'])

hour = int(full_data[0].split()[3][2:4])
date = datetime(now.year, now.month, now.day, hour)
print(f'Processing {date}')

for obs in full_data:
    data = obs.split()

    if len(data) > 1:
        # Parse TTAA
        if (data[6] == 'TTAA'):
            idx = data.index('TTAA')
            station_name = data[2]
            station_number = data[idx-1]
            vtime = datetime(year, month, int(data[idx+1][:2]) - 50, int(data[idx+1][2:4]))

            for part in range(idx+3, len(data), 3):
                plev = int(data[part][:2])

                if plev == 31 or plev == 51 or plev == 88:
                    break

                if (plev != 88) and (plev != 99) and (plev != 31) and (plev != 51):
                    # Add station id, num, and date_time for each level
                    stid.append(station_name)
                    stnum.append(station_number)
                    date_time.append(vtime)

                    # Process Pressure
                    p = int(data[part][:2])
                    if p == 0:
                        p = 1000
                    elif p == 92:
                        p = 925
                    else:
                        p *= 10

                    pressure.append(p)

                    # Process Height
                    z = data[part][2:]
                    if z != '///':
                        z = int(z)
                    else:
                        z = np.nan
                    if p == 850:
                        z += 1000
                    elif p == 700:
                        if z > 500:
                            z += 2000
                        else:
                            z += 3000
                    elif p in [500, 400, 300]:
                        z *= 10
                    elif p < 300:
                        z *= 10
                        z += 10000

                    height.append(z)

                    # Process Temperature and Dewpoint
                    T = data[part+1][:3]
                    if T != '///':
                        T = int(T)
                        if (T % 2) != 0:
                            minus_sign = -1
                        else:
                            minus_sign = 1
                        T = minus_sign * T / 10
                    else:
                        T = np.nan

                    temperature.append(T)

                    DD = data[part+1][3:]
                    if DD != '//':
                        DD = int(DD)
                        if DD < 50:
                            DD = DD/10
                        else:
                            DD -= 50
                    else:
                        DD = np.nan
                    Td = T - DD

                    dewpoint.append(Td)

                    # Process Wind Direction and Speed
                    WD = data[part+2][:3]

                    if WD != '///':
                        last_digit = int(WD[-1])
                        if last_digit in [1, 2, 3, 4]:
                            WD = WD[:2]+'0'
                            WS_offset = last_digit
                        elif last_digit in [6, 7, 8, 9]:
                            WD = WD[:2]+'5'
                            WS_offset = last_digit - 5
                        else:
                            WS_offset = 0
                        WD = int(WD)
                    else:
                        WD = np.nan
                    WS = data[part+2][3:]
                    if WS != '//':
                        WS = int(WS) + WS_offset * 100
                    else:
                        WS = np.nan

                    wdir.append(WD)
                    speed.append(WS)

                elif plev == 99:
                    p = data[part][2:]
                    if p != '///':
                        p = int(p)
                    else:
                        p = np.nan

        # Parse TTBB Sig Temp/DD
        if (data[6] == 'TTBB'):

            idx = data.index('TTBB')
            for part in range(idx+3, len(data), 2):
                #print(sig)
                sig = data[part]
                if (sig == '21212') or (sig == '31313') or (sig == '51515'):
                    break
                stid.append(station_name)
                stnum.append(station_number)
                date_time.append(vtime)

                # No height data
                height.append(np.nan)

                # Process Pressure
                pp = data[part][-3:]
                if sig[:2] == '00' and pp[0] == '0':
                    p = 1000 + int(pp)
                else:
                    p = int(pp)
                pressure.append(p)

                # Process Temperature and Dewpoint
                T = data[part+1][:3]
                if T != '///':
                    T = int(T)
                    if (T % 2) != 0:
                        minus_sign = -1
                    else:
                        minus_sign = 1
                    T = minus_sign * T / 10
                else:
                    T = np.nan

                temperature.append(T)

                # Modified on 7/24/24 to correctly account for DD
                DD = data[part+1][3:]
                DD = int(DD)
                if DD != '//':
                    if DD < 50:
                        DD = DD/10
                    else:
                        DD -= 50
                else:
                    DD = np.nan
                Td = T - DD

                dewpoint.append(Td)

                wdir.append(np.nan)
                speed.append(np.nan)

data_dict = {'pressure': pressure, 'temperature': temperature, 'dewpoint': dewpoint, 'height': height,
             'wind_speed': speed, 'wind_direction': wdir, 'station': stid, 'station_number': stnum,
             'date_time': date_time}

df = pd.DataFrame(data_dict).drop_duplicates(subset=['pressure', 'temperature', 'dewpoint'])

# Added on 7/24/24 to remove pressure levels with missing temp or dew point data
df = df.dropna(subset=['pressure', 'temperature', 'dewpoint'])

# Parse Significant winds reported in TTBB
pressure = []
wdir = []
speed = []
date_time = []
stnum = []
stid = []

for obs in full_data:
        data = obs.split()
        if len(data) > 1:
            if (data[6] == 'TTBB'):

                idx = data.index('21212')
                for part in range(idx+1, len(data), 2):
                    sig = data[part]

                    if (sig == '31313'):
                        break
                    stid.append(station_name)
                    stnum.append(station_number)
                    date_time.append(vtime)

                    # Process Pressure
                    pp = data[part][-3:]
                    if sig[:2] == '00' and pp[0] == '0':
                        p = 1000 + int(pp)
                    else:
                        p = int(pp)
                    pressure.append(p)

                    # Process Wind Direction and Speed
                    WD = data[part+1][:3]

                    if WD != '///':
                        last_digit = int(WD[-1])
                        if last_digit in [1, 2, 3, 4]:
                            WD = WD[:2]+'0'
                            WS_offset = last_digit
                        elif last_digit in [6, 7, 8, 9]:
                            WD = WD[:2]+'5'
                            WS_offset = last_digit - 5
                        else:
                            WS_offset = 0
                        WD = int(WD)
                    else:
                        WD = np.nan
                    WS = data[part+1][3:]
                    if WS != '//':
                        WS = int(WS) + WS_offset * 100
                    else:
                        WS = np.nan

                    speed.append(WS)
                    wdir.append(WD)

                    df.loc[df.pressure == p, 'wind_speed'] = WS
                    df.loc[df.pressure == p, 'wind_direction'] = WD

sig_winds_data_dict = {'pressure': pressure, 'wind_speed': speed, 'wind_direction': wdir,
                       'station': stid, 'station_number': stnum, 'date_time': date_time}
df_sig_winds = pd.DataFrame(sig_winds_data_dict)

df.sort_values(['station_number', 'pressure'], ascending=[True, False], inplace=True,
               ignore_index=True)

# Save data files if desired
#print(f'Writing file for {date:%Y-%m-%d}')
#df.to_csv(f'{date:%Y%m%d%H}_{location}_upa_temp_data.csv', index=False)
#df_sig_winds.to_csv(f'{date:%Y%m%d%H}_{location}_upa_sig_wind_data.csv', index=False)

# Make a simple plot
import matplotlib.pyplot as plt
from matplotlib.ticker import (MultipleLocator, NullFormatter,
                               ScalarFormatter)
from metpy.plots import SkewT
from metpy.units import units
from metpy.calc import wind_components
import numpy as np

p = df.pressure.values * units.hPa
T = df.temperature.values * units.degC
Td = df.dewpoint.values * units.degC
uwnd, vwnd = wind_components(df.wind_speed.values * units.knots,
                             df.wind_direction.values * units.degree)

sig_uwnd, sig_vwnd = wind_components(df_sig_winds.wind_speed.values * units.knots,
                                    df_sig_winds.wind_direction.values * units.degree)

fig = plt.figure(figsize=(11, 8))
skew = SkewT(fig, rotation=45)

skew.plot_dry_adiabats(t0=np.arange(-30, 300, 10)*units.degC, colors='tab:red', alpha=0.5)
skew.plot_moist_adiabats(t0=np.arange(-30, 100, 5)*units.degC, colors='tab:green', alpha=0.5)
skew.plot_mixing_lines(mixing_ratio=np.arange(1, 60, 2)*units('g/kg'),
                       pressure=np.arange(1000, 99, -1)*units.hPa,
                       linestyle='dotted', colors='tab:blue', alpha=0.5)

skew.ax.set_ylim(1000, 100)
skew.ax.set_xlim(-30, 40)

skew.plot(p, T, color='red')
skew.plot(p, Td, color='green')

skew.plot_barbs(df_sig_winds.pressure[::2], sig_uwnd[::2], sig_vwnd[::2])

plt.title(f'{station_name}', loc='left')
plt.title(f'{date:%Y-%m-%d %H} UTC', loc='right')

plt.show()

Run on 4 July 2024 with 12 UTC data available for station SGF yields: download

luciodaou commented 1 month ago

Hi All,

I had some working code that I had worked on a while ago (read 4 years ago), but picked it up again today after a question came to me through a different channel. Below is the code I have to read "real time" TTAA and TTBB data from the NWS to make a plot.

Thank you so much. I have no words to describe my joy of seeing it work!

Made it work with separate TTAA and TTBB messages from Brazilian aeronautical weather services on a quick test.

kgoebber commented 4 weeks ago

@luciodaou I made a few minor changes to the provided code above. There was a slight parsing issue with dew point depression for TTBB code (wasn't taking into account the greater than 50 values appropriately. Additionally, I added a line to drop NA values from the pressure/temperature/dewpoint data (which would most likely take out values that are below ground and not reported!).

luciodaou commented 4 weeks ago

@luciodaou I made a few minor changes to the provided code above. There was a slight parsing issue with dew point depression for TTBB code (wasn't taking into account the greater than 50 values appropriately. Additionally, I added a line to drop NA values from the pressure/temperature/dewpoint data (which would most likely take out values that are below ground and not reported!).

Thanks!

VS Code is showing me indentation errors: image

Previously: image

Managed to eliminate errors with the same logic:

                # Modified on 7/24/24 to correctly account for DD
                DD = data[part+1][3:]
                if DD != '//':
                    DD = int(DD)
                    if DD < 50:
                        DD = DD/10
                    else:
                        DD -= 50
                else:
                    DD = np.nan
                Td = T - DD
kgoebber commented 4 weeks ago

@luciodaou Thanks! That is what I get for editing the example code directly in the comment and not doing a fully copy and paste from the working code in a notebook! I've updated the above code with the (hopefully) working if-statement!