ecmwf / polytope

A library for extracting polytope "features" from datacubes
https://www.ecmwf.int
Apache License 2.0

No module named 'pygribjump' #133

Open jmb64 opened 3 months ago

jmb64 commented 3 months ago

What happened?

Hi,

  This issue is more of a question than a bug report. 

A couple of months ago someone you probably know (Dr. Quintino) gave a talk by video conference where I learned about "polytope", earthkit, etc. A couple of slides showed performance results (timings of feature extraction using polytope and GribJump). It prompted me to install the package with conda and play with a few examples. One of them was 3D_shipping_route_wave_model.py, which makes use of the FDB backend:

python 3D_shipping_route_wave_model.py

Traceback (most recent call last):
  File ".../3D_shipping_route_wave_model.py", line 8, in <module>
    from polytope.datacube.backends.fdb import FDBDatacube
  File ".../python3.10/site-packages/polytope/datacube/backends/fdb.py", line 4, in <module>
    import pygribjump as pygj
ModuleNotFoundError: No module named 'pygribjump'

I searched for pygribjump ... it was nowhere to be found.

I realize that the software is a work in progress. I just want to check with you that the module (GribJump, pygribjump) is indeed not publicly available, and that this outcome is expected?

I look forward to seeing more performance results with earthkit/polytope. Your work is interesting and appears to be really tightly integrated with the software infrastructure at ECMWF. This is new in comparison to what is done in the context of PANGEO, for example.

Anyway, thanks for your time,

Oh, and I did succeed in running some other tests though ... :-)

What are the steps to reproduce the bug?

none

Version

polytope 1.03

Platform (OS and architecture)

cat /etc/os-release --> Ubuntu 22.04.4 LTS
uname -a --> Linux 4.18.0-240.el8.x86_64

Relevant log output

none

Accompanying data

none

Organisation

Environment and Climate Change Canada

jameshawkes commented 3 months ago

Hi @jmb64,

You are quite right, we had not made gribjump public yet, but now we have :) https://github.com/ecmwf/gribjump

It is very experimental and we are actively developing this whole stack. To use gribjump you'll need to build it, along with FDB (https://github.com/ecmwf/fdb) and its dependencies. pygribjump needs to be pip installed directly from the git repository too, as it's not on PyPI yet.

To be honest, the polytope-gribjump-fdb stack is not really in a state where we'd expect people to be able to deploy it successfully, but you're welcome to try!

jmb64 commented 3 months ago

Hi James,

Thanks for your reply. I won't lose any sleep over it, but I will probably try ... :-)

It seems that access to the Python API pygribjump is restricted though. Here's what I get:


python3 -m pip install --upgrade git+ssh://git@github.com/ecmwf/gribjump.git@master

Collecting git+ssh://@github.com/ecmwf/gribjump.git@master
  Cloning ssh://@github.com/ecmwf/gribjump.git (to revision master) to /tmp/jmb001/203/pip-req-build-csew7wed
  Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/ecmwf/gribjump.git' /tmp/jmb001/203/pip-req-build-csew7wed
  Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
  git@github.com: Permission denied (publickey).
  fatal: Could not read from remote repository.

Please make sure you have the correct access rights and the repository exists.


jameshawkes commented 3 months ago

Hi @jmb64, try pip installing it using https instead of ssh:

pip install --upgrade git+https://github.com/ecmwf/gribjump

jmb64 commented 3 months ago

Right on. Thanks!

jmb64 commented 3 months ago

Would you happen to have an example showing the content of this file?

GRIBJUMP_CONFIG_FILE = "/path/to/gribjump/config.yaml"

ChrisspyB commented 2 months ago

Hi @jmb64,

The config for GribJump is very minimal (for now), and in principle it should be optional. For a local extraction on your machine, the following config.yaml should do the trick:

---
type: local
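
If it helps, one way to point the Python side at that file is through the environment, before pygribjump is imported. This is only a sketch: I am assuming gribjump picks the path up from a GRIBJUMP_CONFIG_FILE environment variable, which is the variable name that appears in the example scripts quoted above.

    import os

    # Assumption: gribjump reads the config path from this environment variable.
    os.environ["GRIBJUMP_CONFIG_FILE"] = "/path/to/gribjump/config.yaml"

    # Import after the variable is set, since polytope's FDB backend does
    # "import pygribjump as pygj" at import time.
    import pygribjump as pygj
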
jmb64 commented 2 months ago

Hi,

OK, I probably got as far as I could without having an actual instance of FDB installed locally. I thought I could maybe get away with plain GRIB file(s) on GPFS to mimic FDB (with indexes on the side?). By the way, how do you let your Python scripts know the location of your local FDB? (My current guess is sketched below.) As a suggestion, and I don't know how complex this would be, a simple "example FDB" setup would be useful for running tests outside the walls of ECMWF. Apart from that, my software installation appears to be OK. A few scripts are not completing successfully (polytope request definitions), but I suspect that is because they are outdated with respect to the API. That is on me; it is a work in progress, I was warned, and that's fine :-)
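
In case it is useful, here is the kind of setup I am guessing at (a sketch only; I am assuming fdb5 locates its configuration through an FDB5_CONFIG_FILE environment variable, and the YAML layout in the comment is my reading of the docs, neither of which I have verified against this stack):

    import os

    # Hypothetical local FDB config.yaml (my reading, unverified):
    #   type: local
    #   engine: toc
    #   schema: /path/to/fdb/schema
    #   spaces:
    #     - roots:
    #         - path: /path/to/fdb_root
    os.environ["FDB5_CONFIG_FILE"] = "/path/to/fdb/config.yaml"

    # Import the backend only after the environment is set up.
    from polytope.datacube.backends.fdb import FDBDatacube
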

I will follow your development and docs.

Thanks.
Jean-Marc

jmb64 commented 2 months ago

Hi Chris,

Just to let you know that I managed to get things a little bit more "concrete" for myself with this FDB, using readthedocs.io (fdb and earthkit-data). To answer my own question about an FDB example for testing: I was able to create the POSIX filesystem flavour of FDB from GRIB through earthkit. I also noticed the grib2fdb CLI tool in my build of the whole shebang. The hard part is not knowing much about the ECMWF culture (the MARS language ...).

One last thing: my fdb build was incomplete (CLI tools and tests/api ...) and I solved it by adding libcurl (-lcurl) to a bunch of link.txt files. I don't know if it was something I missed in the configuration (ecbuild) or if it should be corrected in the fdb repository on GitHub.

jmb64 commented 2 months ago

Maybe one last question ...

Running this test https://github.com/ecmwf/polytope/blob/develop/tests/test_fdb_datacube.py

it seems I cannot get "step" to be registered as an FDB datacube axis.

This is the bit of code which I believe should create the datacube instance:


self.options = {
    "config": [
        {"axis_name": "step", "transformations": [{"name": "type_change", "type": "int"}]},
        {"axis_name": "number", "transformations": [{"name": "type_change", "type": "int"}]},
        {
            "axis_name": "date",
            "transformations": [{"name": "merge", "other_axis": "time", "linkers": ["T", "00"]}],
        },
        {
            "axis_name": "values",
            "transformations": [
                {"name": "mapper", "type": "octahedral", "resolution": 1280, "axes": ["latitude", "longitude"]}
            ],
        },
        {"axis_name": "latitude", "transformations": [{"name": "reverse", "is_reverse": True}]},
        {"axis_name": "longitude", "transformations": [{"name": "cyclic", "range": [0, 360]}]},
    ]
}
self.config = {"class": "od", "expver": "0001", "levtype": "sfc", "stream": "oper"}
self.fdbdatacube = FDBDatacube(self.config, axis_options=self.options)

Traceback (most recent call last):
  File "...ECMWF/EARTHKIT_POLYTOPE/test_fdb_datacube.py", line 102, in <module>
    a.test_fdb_datacube()
  File "...ECMWF/EARTHKIT_POLYTOPE/test_fdb_datacube.py", line 78, in test_fdb_datacube
    result = self.API.retrieve(request)
  File ".../.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/polytope.py", line 58, in retrieve
    request_tree = self.engine.extract(self.datacube, request.polytopes())
  File ".../.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/engine/hullslicer.py", line 140, in extract
    self._unique_continuous_points(p, datacube)
  File ".../.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/engine/hullslicer.py", line 27, in _unique_continuous_points
    mapper = datacube.get_mapper(ax)
  File ".../.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/datacube/backends/datacube.py", line 128, in get_mapper
    return self._axes[axis]
KeyError: 'step'

I get this KeyError during the extraction, which I figure is because "step" has not been added to the axes?

I added a few extra print statements in the code; here is what they show:

JMB: datacube.py create_axes_config: {'config': [{'axis_name': 'step', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'number', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'date', 'transformations': [{'name': 'merge', 'other_axis': 'time', 'linkers': ['T', '00']}]}, {'axis_name': 'values', 'transformations': [{'name': 'mapper', 'type': 'octahedral', 'resolution': 1280, 'axes': ['latitude', 'longitude']}]}, {'axis_name': 'latitude', 'transformations': [{'name': 'reverse', 'is_reverse': True}]}, {'axis_name': 'longitude', 'transformations': [{'name': 'cyclic', 'range': [0, 360]}]}]} [{'axis_name': 'step', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'number', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'date', 'transformations': [{'name': 'merge', 'other_axis': 'time', 'linkers': ['T', '00']}]}, {'axis_name': 'values', 'transformations': [{'name': 'mapper', 'type': 'octahedral', 'resolution': 1280, 'axes': ['latitude', 'longitude']}]}, {'axis_name': 'latitude', 'transformations': [{'name': 'reverse', 'is_reverse': True}]}, {'axis_name': 'longitude', 'transformations': [{'name': 'cyclic', 'range': [0, 360]}]}]

JMB: fdb.py self.fdb_coordinates.items() dict_items([('values', [])])

JMB: fdb.py name_values values []

JMB: opt axis_name='values' transformations=[MapperConfig(name='mapper', type='octahedral', resolution=1280, axes=['latitude', 'longitude'], local=None)]

JMB: self._axes {'latitude': <polytope.datacube.datacube_axis.FloatDatacubeAxis object at 0x1517b7e39c30>, 'longitude': <polytope.datacube.datacube_axis.FloatDatacubeAxis object at 0x1517b7e39cc0>}

JMB get_mapper self._axes= {'latitude': <polytope.datacube.datacube_axis.FloatDatacubeAxis object at 0x1517b7e39c30>, 'longitude': <polytope.datacube.datacube_axis.FloatDatacubeAxis object at 0x1517b7e39cc0>}

From these prints, I believe latitude and longitude have been added to the datacube axes (but not step).

Thanks.

JM

jmb64 commented 2 months ago

Logging from polytope for python test_fdb_datacube.py:

cat lis_test_fdb_datacube.log

INFO:root:Created an FDB datacube with options: {'config': [{'axis_name': 'step', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'number', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'date', 'transformations': [{'name': 'merge', 'other_axis': 'time', 'linkers': ['T', '00']}]}, {'axis_name': 'values', 'transformations': [{'name': 'mapper', 'type': 'octahedral', 'resolution': 1280, 'axes': ['latitude', 'longitude']}]}, {'axis_name': 'latitude', 'transformations': [{'name': 'reverse', 'is_reverse': True}]}, {'axis_name': 'longitude', 'transformations': [{'name': 'cyclic', 'range': [0, 360]}]}]}

DEBUG:root:Skipping /etc/polytope/config.json, file not found.
DEBUG:root:Skipping /etc/polytope/config.yaml, file not found.
DEBUG:root:Skipping /home/jmb001/.polytope.json, file not found.
DEBUG:root:Skipping /home/jmb001/.polytope.yaml, file not found.
DEBUG:root:Conflated config: {}

INFO:root:Axes returned from GribJump are: {}

INFO:root:Polytope created axes for: dict_keys(['latitude', 'longitude'])

mathleur commented 2 months ago

Hi @jmb64,

It seems like maybe there isn't the right data for this particular example written on the FDB that Polytope is connecting to?

Unfortunately, I don't think the data that goes with these particular tests can be viewed publicly. However, I think it should be possible to adapt the Polytope options and datacube configuration to work with the data written on your local FDB.

The datacube axes created in the Polytope API aren't really specified by the options, but are built by "reading" the axes from the data on the FDB. The options only really help create an easy user interface to make requesting data more intuitive for users, such as being able to request data at longitude=420 (instead of having to request data at longitude=60) for example.

The options here refer to particular "transformations" on the data, such as cyclic axes or special grids, like octahedral grids for example (when the data grib file doesn't store the data along latitude/longitude, but rather along a combined "index" axis). There is also the type change transformation, which we use when the request type (step in integers for example) doesn't match how the data is stored on the FDB (in this case as strings), as well as the reverse transformation, when the data stored in the FDB on an axis is in decreasing order (for example latitude indexes are stored from 90 to -90 instead of -90 to 90). The datacube configuration refers to the first-layer path pointing to the data stored on the local FDB.
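
As a concrete illustration, here is a minimal options block using just two of the transformations described above, in the same format as the test configuration you quoted:

    options = {
        "config": [
            # request "step" as integers even though the FDB stores the values as strings
            {"axis_name": "step", "transformations": [{"name": "type_change", "type": "int"}]},
            # let users request e.g. longitude=420 and have it wrapped back into [0, 360]
            {"axis_name": "longitude", "transformations": [{"name": "cyclic", "range": [0, 360]}]},
        ]
    }
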

I also think that currently, the Polytope version on develop doesn't yet work with the GribJump master branch. Instead, I believe the feature/gribjump_axis_branching branch of Polytope should work with the latest GribJump master branch.

I hope this helps, but I would be happy to answer more questions!

jmb64 commented 2 months ago

Hi Mathilde,

I appreciate you taking the time to offer these explanations. I kind of realized on Friday that the actual data (in the FDB) was playing a major role in the datacube configuration :-) ... just by changing the FDB, I noticed I was getting past the configuration issue (i.e. the step axis). I tried retrieving a GRIB file output from the HRES WAM with the 2D spectrum (2dfd), but it seems it is not available (damn! :-)), as you mentioned. I will see if I can rewrite an example akin to 3D_shipping_route_wave_model, but with available data this time! I will re-read your post carefully; I'm sure it will help, and I might have some questions later ...

Regards,

Jean-Marc

PS: You probably know the answer to this one ... In Tiago's presentation (Bytes from Petabytes, slide 20) there's a graph showing linear scaling of the extraction process for 1000 to 4500 values on Lustre ... we're talking about double-precision values, right? Hence the 36 kB (4500 values) / 730 ms --> 4-6 ms/value ... Are those results from /examples or /performance on GitHub?

jmb64 commented 2 months ago

Hello again,

I decided to adapt the polytope/FDB example "3D_shipping_route_wave_model.py", creating an FDB from ERA5 specific humidity in the marine boundary layer, since I don't have access to the wave model data (let's say one is interested in shipborne lidar data, i.e. HALO/WALES), using this CDS request:

1)

#!/usr/bin/env python

import cdsapi

c = cdsapi.Client()

c.retrieve("reanalysis-era5-complete", { "class": "ea", "date": "2021-07-01/to/2021-07-31", "expver": "1", "levelist": "100/101/102/103/104/105/106/107/108/109/110/111/112/113/114/115/116/117/118/119/120/121/122/123/124/125/126/127/128/129/130/131/132/133/134/135/136/137", "levtype": "ml", "param": "133", "step": "0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18", "stream": "oper", "time": "18:00:00", "type": "fc", "grid": "0.25/0.25" }, "era5_202107_regular_ll_2.grib")

then create the local FDB

2) fdb write era5_202107_regular_ll_2.grib. I keep the same shapefile for shipping route (Shipping-Lanes-v1.shp)

3) Running the modified script for specific humidity, the extraction process encounters a problem in GribJump:

GRIBJUMP_DEBUG Using FDB's (new) axes impl
GRIBJUMP_DEBUG Base request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened keys for request retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1,stream=oper,type=fc: [class=ea,date=20210702,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1,stream=oper,time=1800,type=fc]
GRIBJUMP_DEBUG Base request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=2,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=2,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened keys for request retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=2,stream=oper,type=fc: [class=ea,date=20210702,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=2,stream=oper,time=1800,type=fc]
GRIBJUMP_DEBUG Base request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=3,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=3,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened keys for request retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=3,stream=oper,type=fc: [class=ea,date=20210702,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=3,stream=oper,time=1800,type=fc]
GRIBJUMP_DEBUG Built flat keys
GRIBJUMP_DEBUG Built keyToExtractionItem
GRIBJUMP_DEBUG Union request retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1/2/3,stream=oper,type=fc
GRIBJUMP_DEBUG Found 3 fields in 1 files
GRIBJUMP_DEBUG File map:
GRIBJUMP_DEBUG file=.../ECMWF/data/regular_ll/ea:0001:oper:20210702:1800:g/fc:ml.20240605.121554.hpcr5-vis5.6636793919176709.data, Offsets=[155833500, 234789140, 313744780, ]
GRIBJUMP_DEBUG Thread 22437929043520 starting
GRIBJUMP_DEBUG Thread 22437716297280 starting
GRIBJUMP_DEBUG Queued task 0
GRIBJUMP_DEBUG Waiting for 1 task...
GRIBJUMP_DEBUG Thread 22437716297280 new job
GRIBJUMP_DEBUG Cache directory
GRIBJUMP_DEBUG Warning, cache persistence is disabled
GRIBJUMP_DEBUG Cache file /fc:ml.20240605.121554.hpcr5-vis5.6636793919176709.data.gribjump does not exist
GRIBJUMP_DEBUG Thread 22437716297280 exception: Read access outside data section: 2984368 + 32 > 2076480
GRIBJUMP_DEBUG All tasks complete
GRIBJUMP_DEBUG Thread 22437716297280 stopping (queue closed)
GRIBJUMP_DEBUG Thread 22437929043520 stopping (queue closed)

The offsets for step 1,2,3 seem right judging by fdb dump-index on the FDB.

4) I'm using polytope (develop branch) and GribJump (develop branch; either the May 17 commit or the latest June 5 one gives the same result).

Debugging points to GribJumpDataAccessor.h (line 35):

throw std::runtime_error("Read access outside data section: " + std::to_string(offset) + " + " + std::to_string(size) + " > " + std::to_string(data_section_size))

Does this look familiar to you? Again, I realize this is work in progress, and I'm not implying this is a bug, since I barely know what I'm doing when creating the polytope request! :-)

5) The full Python script adaptation, in case you would ever consider running a similar case:

cat 3D_shipping_route_era5_mabl_specific_humidity_jmb.py

import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
import pytest
from eccodes import codes_grib_find_nearest, codes_grib_new_from_file

from polytope.datacube.backends.fdb import FDBDatacube
from polytope.engine.hullslicer import HullSlicer
from polytope.polytope import Polytope, Request
from polytope.shapes import Ellipsoid, Path, Select
from tests.helper_functions import download_test_data
import logging

class TestReducedLatLonGrid:
    def setup_method(self, method):

    #JMB

    logger = logging.getLogger(__name__)
    logging.basicConfig(filename='lis_3D_shipping_route_era5_MABL_specific_humidity.log', encoding='utf-8', level=logging.DEBUG)

    self.options = {
        "config": [
            {"axis_name": "step", "transformations": [{"name": "type_change", "type": "int"}]},
            {"axis_name": "number", "transformations": [{"name": "type_change", "type": "int"}]},
            {
                "axis_name": "date",
                "transformations": [{"name": "merge", "other_axis": "time", "linkers": ["T", "00"]}],
            },
            {
                "axis_name": "values",
                "transformations": [
                    {"name": "mapper", "type": "reduced_ll", "resolution": 1441, "axes": ["latitude", "longitude"]}
                ],
            },

JMB {"axis_name": "latitude", "transformations": [{"name": "reverse", "is_reverse": True}]},

            {"axis_name": "longitude", "transformations": [{"name": "cyclic", "range": [0, 360]}]},
        ]
    }

    # ea:0001:oper:20210701:1800:g
    # self.config = {"class": "od", "stream": "wave"}

    self.config = {"class": "ea", "stream" : "oper" }

    self.fdbdatacube = FDBDatacube(self.config, axis_options=self.options)
    self.slicer = HullSlicer()
    self.API = Polytope(datacube=self.fdbdatacube, engine=self.slicer, axis_options=self.options)

def find_nearest_latlon(self, grib_file, target_lat, target_lon):
    messages = grib_file

    # Find the nearest grid points
    nearest_points = []
    for message in [messages[0]]:
        nearest_index = codes_grib_find_nearest(message, target_lat, target_lon)
        nearest_points.append(nearest_index)

    return nearest_points

@pytest.mark.internet
@pytest.mark.skip(reason="can't install fdb branch on CI")
def test_reduced_ll_grid(self):
    shapefile = gpd.read_file(".../ECMWF/EARTHKIT_POLYTOPE/examples/data/Shipping-Lanes-v1.shp")
    geometry_multiline = shapefile.iloc[2]
    geometry_object = geometry_multiline["geometry"]

    lines = []
    i = 0

    #JMB

print("geometry_object=",geometry_object)

print("len(geometry_object,geoms)=",len(geometry_object.geoms))

JMB for line in geometry_object[:7]:

    for line in list(geometry_object.geoms)[:7]:
        for point in line.coords:
            point_list = list(point)
            if list(point)[0] < 0:
                point_list[0] = list(point)[0] + 360
            lines.append(point_list)

    # Append for each point a corresponding step

    new_points = []
    for point in lines[:7]:
        new_points.append([point[1], point[0], 1])

    print("new_points=",new_points)        

    # Pad the shipping route with an initial shape

    padded_point_upper = [0.24, 0.24, 1]
    padded_point_lower = [-0.24, -0.24, 1]

    # padded_point_upper = [5.0, 5.0, 1]
    # padded_point_lower = [-5.0, -5.0, 0]

    initial_shape = Ellipsoid(["latitude", "longitude", "step"], padded_point_lower, padded_point_upper)

    # Then somehow make this list of points into just a sequence of points

    ship_route_polytope = Path(["latitude", "longitude", "step"], initial_shape, *new_points)
    print("JMB ship_route_polytope=",ship_route_polytope)

    request = Request(
        ship_route_polytope,
        Select("date", [pd.Timestamp("20210702T180000")]),
        Select("domain", ["g"]),
        Select("expver", ["0001"]),
        Select("param", ["133"]),
        Select("levelist", ["137"]),                  
        Select("class", ["ea"]),            
        Select("stream", ["oper"]),
        Select("levtype", ["ml"]),
        Select("type", ["fc"]),
    )
    print("JMB request=",request)

    result = self.API.retrieve(request)
    result.pprint()

    print("JMB: after request result=",result) 
    lats = []
    lons = []
    eccodes_lats = []
    eccodes_lons = []
    tol = 1e-8

    #JMB f = open("./tests/data/wave.grib", "rb")

    f = open(".../ECMWF/data/regular_ll/202107/era5_202107_regular_ll.grib", "rb")
    messages = []
    print("JMB before codes_grib_new_from_file ",flush=True)
    message = codes_grib_new_from_file(f)
    print("JMB after codes_grib_new_from_file message=",message,flush=True)        
    messages.append(message)

    leaves = result.leaves
    for i in range(len(leaves)):
        #JMB
        print("JMB i, leaves=",i,leaves)
        cubepath = leaves[i].flatten()
        lat = cubepath["latitude"]
        lon = cubepath["longitude"]
        del cubepath
        lats.append(lat)
        lons.append(lon)
        nearest_points = codes_grib_find_nearest(message, lat, lon)[0]
        eccodes_lat = nearest_points.lat
        eccodes_lon = nearest_points.lon
        eccodes_lats.append(eccodes_lat)
        eccodes_lons.append(eccodes_lon)
        assert eccodes_lat - tol <= lat
        assert lat <= eccodes_lat + tol
        assert eccodes_lon - tol <= lon
        assert lon <= eccodes_lon + tol
        print(i)
    f.close()

    worldmap = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
    fig, ax = plt.subplots(figsize=(12, 6))
    worldmap.plot(color="darkgrey", ax=ax)

    plt.scatter(lons, lats, s=18, c="red", cmap="YlOrRd")
    plt.scatter(eccodes_lons, eccodes_lats, s=6, c="green")
    plt.colorbar(label="Temperature")
    plt.show()

a = TestReducedLatLonGrid()
a.setup_method(None)
a.test_reduced_ll_grid()


Cheers,

Jean-Marc

mathleur commented 2 months ago

Hello @jmb64,

I think maybe this fails because of the grid option given to Polytope for the data. I think the grid keyword in the MARS request transforms the data automatically to a regular lat/lon grid, so the regular grid "transformation" would be needed in this case. I believe the options here should thus be:

    self.options = {
        "config": [
            {"axis_name": "step", "transformations": [{"name": "type_change", "type": "int"}]},
            {"axis_name": "number", "transformations": [{"name": "type_change", "type": "int"}]},
            {
                "axis_name": "date",
                "transformations": [{"name": "merge", "other_axis": "time", "linkers": ["T", "00"]}],
            },
            {
                "axis_name": "values",
                "transformations": [
                    {"name": "mapper", "type": "regular", "resolution": 360, "axes": ["latitude", "longitude"]}
                ],
            },
            {"axis_name": "latitude", "transformations": [{"name": "reverse", "is_reverse": True}]},
            {"axis_name": "longitude", "transformations": [{"name": "cyclic", "range": [0, 360]}]},
        ]
    }

I hope this helps!

jmb64 commented 2 months ago

Hi Mathilde,

Thanks for the help. I did mention I was not saying there was a bug! :-)

End of listing with your suggestion:


JMB fdb.py assign_fdb_output_to_nodes: sorted_range_lengths= [2, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 3] JMB fdb.py assign_fdb_output_to_nodes: len(sorted_fdb_range_nodes) 49 JMB fdb.py assign_fdb_output_to_nodes: len(sorted_range_lengths)= 49 JMB fdb.py assign_fdb_output_to_nodes: request_output_values= [[(array([0.01897763, 0.01898812]), array([3], dtype=uint64)), (array([0.0189309 , 0.01894377]), array([3], dtype=uint64)), (array([0.01861666, 0.01857756]), array([3], dtype=uint64)), (array([0.01925228, 0.01907585]), array([3], dtype=uint64)), (array([0.01757191]), array([1], dtype=uint64)), (array([0.01921032, 0.01924036]), array([3], dtype=uint64)), (array([0.01765965]), array([1], dtype=uint64)), (array([0.01865195, 0.01875971]), array([3], dtype=uint64)), (array([0.01804016]), array([1], dtype=uint64)), (array([0.01867436, 0.01855133]), array([3], dtype=uint64)), (array([0.01802109]), array([1], dtype=uint64)), (array([0.01856135, 0.01851414]), array([3], dtype=uint64)), (array([0.0183811]), array([1], dtype=uint64)), (array([0.01856325, 0.0186014 ]), array([3], dtype=uint64)), (array([0.01868962, 0.01883791]), array([3], dtype=uint64)), (array([0.019114 , 0.01894663]), array([3], dtype=uint64)), (array([0.01896332, 0.01910256]), array([3], dtype=uint64)), (array([0.01925371, 0.01914023]), array([3], dtype=uint64)), (array([0.01899813]), array([1], dtype=uint64)), (array([0.01924895]), array([1], dtype=uint64)), (array([0.0188484]), array([1], dtype=uint64)), (array([0.01918934, 0.01911066]), array([3], dtype=uint64)), (array([0.01867293, 0.01856039]), array([3], dtype=uint64)), (array([0.01952313, 0.01928614]), array([3], dtype=uint64)), (array([0.01834391, 0.01835249]), array([3], dtype=uint64)), (array([0.01955031, 0.01959131]), array([3], dtype=uint64)), (array([0.01830862]), array([1], dtype=uint64)), (array([0.01927422, 0.01930474]), array([3], dtype=uint64)), (array([0.01842688]), array([1], dtype=uint64)), (array([0.01934336, 0.01917456]), array([3], dtype=uint64)), (array([0.018414 , 0.01840876]), array([3], dtype=uint64)), (array([0.01935576, 0.01924084]), array([3], dtype=uint64)), (array([0.01839398, 0.01840304]), array([3], dtype=uint64)), (array([0.01930474, 0.01928519]), array([3], dtype=uint64)), (array([0.0183482]), array([1], dtype=uint64)), (array([0.01919602, 0.01917313]), array([3], dtype=uint64)), (array([0.01838063]), array([1], dtype=uint64)), (array([0.01911495, 0.01908301]), array([3], dtype=uint64)), (array([0.01833342, 0.01833294]), array([3], dtype=uint64)), (array([0.01902483, 0.01903055]), array([3], dtype=uint64)), (array([0.01832054, 0.01832245]), array([3], dtype=uint64)), (array([0.01890133, 0.01894282]), array([3], dtype=uint64)), (array([0.01829718]), array([1], dtype=uint64)), (array([0.01876972]), array([1], dtype=uint64)), (array([0.01830147]), array([1], dtype=uint64)), (array([0.018692]), array([1], dtype=uint64)), (array([0.01822947, 0.0182433 ]), array([3], dtype=uint64)), (array([0.01868342, 0.01871155]), array([3], dtype=uint64)), (array([0.0183544 , 0.01858614, 0.01869438]), array([7], dtype=uint64))]] Traceback (most recent call last): File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 2022, in shape result = a.shape AttributeError: 'list' object has no attribute 'shape'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jmb001/jmscript/PY/ECMWF/EARTHKIT_POLYTOPE/3D_shipping_route_era5_mabl_specific_humidity_mathilde.py", line 169, in <module>
    a.test_reduced_ll_grid()
  File "/home/jmb001/jmscript/PY/ECMWF/EARTHKIT_POLYTOPE/3D_shipping_route_era5_mabl_specific_humidity_mathilde.py", line 117, in test_reduced_ll_grid
    result = self.API.retrieve(request)
  File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/polytope.py", line 59, in retrieve
    self.datacube.get(request_tree)
  File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/datacube/backends/fdb.py", line 75, in get
    self.assign_fdb_output_to_nodes(output_values, fdb_requests_decoding_info)
  File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/datacube/backends/fdb.py", line 257, in assign_fdb_output_to_nodes
    print("JMB fdb.py assign_fdb_output_to_nodes: np.shape(request_output_values)=",np.shape(request_output_values))
  File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 2024, in shape
    result = asarray(a).shape
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 3 dimensions. The detected shape was (1, 49, 2) + inhomogeneous part.
GRIBJUMP_DEBUG Thread 22435342517824 stopping (queue closed)
GRIBJUMP_DEBUG Thread 22435554211392 stopping (queue closed)

It seems we are getting output values from GribJump now, and they sure look like surface specific humidities around 18-19 g/kg ... which is nice. There's a glitch remaining ... which appears minor?

What is the meaning of "resolution": 360 then? I thought we had to specify the data resolution (0.25 deg in this case). I guess this information is retrieved from the FDB (GRIB) then.

I was also wondering about requests for NEMO output. The software is not able to deal with such a grid (ORCA025) yet, am I right?

Anyway. Your help is much appreciated!

Jean-Marc
:-)

mathleur commented 2 months ago

Hi @jmb64 ,

This is great! It looks like it's almost working :)

I believe maybe this print line in the code might be causing issues:

print("JMB fdb.py assign_fdb_output_to_nodes: np.shape(request_output_values)=",np.shape(request_output_values))

One of the quirks of Polytope, which comes into play here, is that its output isn't necessarily "box"-shaped, so most numpy functions like shape() do not work on Polytope's output!
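
If you do want to inspect the decoded values, something along these lines should be safer than np.shape() (a sketch only, based on the nested list of (values, counts) tuples visible in your printout; request_output_values is the variable from your added print):

    # The decoded output is a list (one entry per request) of lists of
    # (values, counts) tuples with ragged lengths, so np.shape() cannot apply.
    for req_idx, ranges in enumerate(request_output_values):
        lengths = [len(values) for values, _counts in ranges]
        print(f"request {req_idx}: {len(ranges)} ranges, value counts {lengths}")
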

We've also just updated the develop branch of Polytope yesterday which might fix some of the last bugs!

For the resolution, it's currently maybe a bit counterintuitive. It was initially built around the octahedral grid, where the resolution was the number x in Ox, so somewhat related to the number of data points on a latitude line. We've tried to keep it somewhat "uniform" for the different grids and so the resolution for the regular lat/lon grid is the number of points on a latitude line (in this case it's 90/0.25=360). I hope this explains a bit more!
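
As a quick sanity check of that arithmetic (my shorthand, not an official formula):

    # Octahedral grid O<x>: resolution = x (e.g. O1280 -> 1280).
    # Regular lat/lon grid with spacing d degrees: resolution = 90 / d.
    grid_spacing_deg = 0.25
    resolution = int(90 / grid_spacing_deg)
    assert resolution == 360  # matches the "regular" mapper config suggested above
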

Unfortunately, I do not believe we support ORCA grids yet.

I hope this helps!

jmb64 commented 2 months ago

Mathilde,

I assure you ... this helps ! :-)

Screenshot 2024-06-06 142706

How can I reference the actual values (the specific humidities along the trajectory) in the script (duhhh)? I wrote them out from within fdb.py, but ...

JM

jmb64 commented 2 months ago

Mathilde,

Finally,

Using leaves[i].result (?) ...
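
For anyone landing here later, the loop I ended up with looks roughly like this (a sketch; leaf.result is simply what worked for me here, I have not checked it against any official documentation):

    lats, lons, values = [], [], []
    for leaf in result.leaves:
        cubepath = leaf.flatten()
        lats.append(cubepath["latitude"])
        lons.append(cubepath["longitude"])
        values.append(leaf.result)  # specific humidity along the ship route

    print(list(zip(lats, lons, values))[:5])
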

Screenshot 2024-06-06 151209

JM