read the subsets and the name of fields after reading FlatJsonRenderer().render(bufr_message)

denissanga commented 3 years ago

Hi all, I'm trying to extract some data from a BUFR file, but I don't understand how to read the field names and the different subsets in order to extract data such as U-COMPONENT, V-COMPONENT, PRESSURE, etc. after reading any bufr_message. I am attaching an example file. Thank you very much in advance. L-000-MSG4-MPEF__-AMV__-000001_-202106171330-.zip

ywangd commented 3 years ago

Have you tried the query sub-command? You can read the docs here. Let me know if this solves your problem.

denissanga commented 3 years ago

Thank you very much.

So far I have been using the following code to start reading the BUFR file:

from pybufrkit.decoder import Decoder
from pybufrkit.renderer import FlatJsonRenderer
from pybufrkit.mdquery import MetadataExprParser, MetadataQuerent
from pybufrkit.dataquery import NodePathParser, DataQuerent
from pybufrkit.decoder import generate_bufr_message

SOME_BUFR_FILE = "L-000-MSG4__-MPEF________-AMV______-000001___-202106171330-__"
decoder = Decoder()
dataLength = []

with open(SOME_BUFR_FILE, 'rb') as ins:
    for bufr_message in generate_bufr_message(decoder, ins.read()):
        n_subsets = MetadataQuerent(MetadataExprParser()).query(bufr_message, '%n_subsets')
        query_result = DataQuerent(NodePathParser()).query(bufr_message, '001002')
        json_data = FlatJsonRenderer().render(bufr_message)
        dataLength.append([len(json_data), len(json_data[3][2]), [len(i) for i in json_data]])
        pass  # do something with the decoded message object

In json_data I obtain a list of all the values for each message, but I would like to have a list of the names of each variable and to extract data from the subsets. I looked at the link, but I don't understand how to use it correctly.

Thank you again. Best regards

hautecoeur commented 3 years ago

This is a piece of code to explain how you can read and decode the main data. Olivier

import pandas as pd
from pybufrkit.decoder import Decoder
from pybufrkit.decoder import generate_bufr_message
from pybufrkit.renderer import FlatJsonRenderer

FILENAME = "L-000-MSG4__-MPEF________-AMV______-000001___-202106171330-__"
decoder = Decoder()

df = pd.DataFrame()

# this file is a multiple-message BUFR file
with open(FILENAME, "rb") as ins:
    for bufr_message in generate_bufr_message(decoder, ins.read()):
        json_data = FlatJsonRenderer().render(bufr_message)
        df = pd.concat([df, pd.DataFrame(json_data[3][2])])
        # df contains all the wind records as a matrix

# extracting (some of) the most important fields
amv = pd.DataFrame({'latitude':df[17], 'longitude':df[18], 'pressure':df[27], 'direction':df[28], 'speed':df[29], 'u':df[30], 'v':df[31], 'channel':df[54], 'qix':df[170]})
print(amv) # there are 50346 wind records

# extracting the records for the SEVIRI channel 9 (IR 10.8)
seviri9 = amv.loc[(amv['channel']==9)]
print(seviri9) # only 11067 extracted from thermal infrared data

# filtering the 'good' winds
goodwinds = seviri9.loc[(seviri9['speed']>2.5) & (seviri9['qix']>=80)]
print(goodwinds) # 6302 passed the selection criteria
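
The same selection can also be sketched without pandas, which keeps the column positions in one place. This is only a minimal sketch: the positions are the ones used in the snippet above and are assumed to apply to this AMV product only, and the helper names (`COLUMNS`, `to_record`, `good_winds`, `make_row`) and the sample rows are made up for illustration. In practice the rows would be the per-subset lists found in `json_data[3][2]`.

```python
# Column positions copied from the snippet above (assumed valid for this AMV product only).
COLUMNS = {
    "latitude": 17, "longitude": 18, "pressure": 27, "direction": 28,
    "speed": 29, "u": 30, "v": 31, "channel": 54, "qix": 170,
}


def to_record(subset_values, columns=COLUMNS):
    """Turn one subset (a flat list of values) into a dict of named fields."""
    return {name: subset_values[index] for name, index in columns.items()}


def good_winds(subsets, channel=9, min_speed=2.5, min_qix=80):
    """Keep records of one channel that pass the speed and quality thresholds."""
    selected = []
    for values in subsets:
        record = to_record(values)
        # Missing values are treated as not passing the filter.
        if record["speed"] is None or record["qix"] is None:
            continue
        if (record["channel"] == channel
                and record["speed"] > min_speed
                and record["qix"] >= min_qix):
            selected.append(record)
    return selected


def make_row(**fields):
    """Build a hypothetical subset row with only the given fields filled in."""
    row = [None] * 171
    for name, value in fields.items():
        row[COLUMNS[name]] = value
    return row


# Synthetic rows for illustration, not values from the real file.
rows = [
    make_row(channel=9, speed=12.0, qix=90),    # passes
    make_row(channel=9, speed=1.0, qix=95),     # too slow
    make_row(channel=6, speed=20.0, qix=99),    # other channel
    make_row(channel=9, speed=None, qix=85),    # missing speed
]
selected = good_winds(rows)
print(len(selected))
```

Missing values are skipped explicitly here because comparing `None` with a number raises a `TypeError` in plain Python.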

denissanga commented 3 years ago

Thank you very much for your help. It works.

steph-ben commented 2 years ago

Looks really good and helpful! However, how did you get the column association, e.g. latitude = df[17]?
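
One possible way to work out such an association, as a sketch only: given the ordered descriptor names for one subset, map each name to the positions where it occurs. The `names` list below is a hypothetical, shortened stand-in and is not taken from the real file. pybufrkit appears to keep the decoded descriptors alongside the decoded values on the message's template data, but that is an assumption to check against the installed version. Names can repeat within a template, so each name maps to a list of positions rather than a single one.

```python
from collections import defaultdict


def index_by_name(descriptor_names):
    """Map each descriptor name to every position at which it occurs."""
    positions = defaultdict(list)
    for index, name in enumerate(descriptor_names):
        positions[name].append(index)
    return dict(positions)


# Hypothetical, shortened descriptor list for one subset (not from the real file).
names = [
    "LATITUDE (HIGH ACCURACY)",
    "LONGITUDE (HIGH ACCURACY)",
    "PRESSURE",
    "WIND DIRECTION",
    "WIND SPEED",
    "PRESSURE",
]
columns = index_by_name(names)
print(columns["PRESSURE"])
```

When a name occurs more than once, as `PRESSURE` does in this made-up list, the surrounding descriptors have to be inspected to decide which occurrence is the one wanted.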

ywangd / pybufrkit

read the subsets and the name of fields after reading FlatJsonRenderer().render(bufr_message) #18