umr-lops / cdsodatacli

odata client for Copernicus Data Space catalog
https://cerweb.ifremer.fr/datarmor/doc_sphinx/cdsodatacli/index.html
MIT License

Using mode="multi" triggers KeyError #64

Open Skealz opened 9 months ago

Skealz commented 9 months ago

🐛 Bug Report

A KeyError is raised when querying with a gdf and mode="multi". It is systematic: it happens on every fetch_data call. The same queries with mode="seq" do not trigger it.

Traceback (most recent call last):
  File "/home1/datahome/oarcher/storm_watch/./bt2sar_new.py", line 216, in <module>
    safes = cdsodatacli.query.fetch_data(gdf, timedelta_slice=datetime.timedelta(weeks=1), min_sea_percent=15, mode="multi")
  File "/home1/datahome/oarcher/storm_watch/conda_bt2sar_new/lib/python3.10/site-packages/cdsodatacli/query.py", line 169, in fetch_data
    data_dedup = remove_duplicates(safes_ori=collected_data)
  File "/home1/datahome/oarcher/storm_watch/conda_bt2sar_new/lib/python3.10/site-packages/cdsodatacli/query.py", line 683, in remove_duplicates
    safes_sort = safes_ori.sort_values("ModificationDate", ascending=False)
  File "/home1/datahome/oarcher/storm_watch/conda_bt2sar_new/lib/python3.10/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/home1/datahome/oarcher/storm_watch/conda_bt2sar_new/lib/python3.10/site-packages/pandas/core/frame.py", line 6912, in sort_values
    k = self._get_label_or_level_values(by, axis=axis)
  File "/home1/datahome/oarcher/storm_watch/conda_bt2sar_new/lib/python3.10/site-packages/pandas/core/generic.py", line 1850, in _get_label_or_level_values
    raise KeyError(key)
KeyError: 'ModificationDate'
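
The last frames of the traceback show pandas raising the KeyError because the frame passed to sort_values has no "ModificationDate" column. The sketch below reproduces that behaviour with plain pandas and shows one possible guard. It rests on an assumption: that collected_data reaches remove_duplicates empty or without that column in "multi" mode, which this thread does not confirm. The helper sort_by_modification_date is hypothetical and not part of cdsodatacli.

```python
import pandas as pd


def sort_by_modification_date(safes_ori: pd.DataFrame) -> pd.DataFrame:
    """Sort newest first, tolerating frames that lack the column.

    Hypothetical helper for illustration, not the cdsodatacli code.
    """
    if "ModificationDate" not in safes_ori.columns:
        # Nothing to sort on (assumption: no product was collected).
        return safes_ori
    return safes_ori.sort_values("ModificationDate", ascending=False)


# A frame without the column raises the same KeyError as in the traceback.
empty = pd.DataFrame()
try:
    empty.sort_values("ModificationDate", ascending=False)
    raised = False
except KeyError:
    raised = True

# The guarded version returns the frame unchanged instead of raising.
guarded = sort_by_modification_date(empty)

# With the column present, rows are sorted newest first (placeholder data).
products = pd.DataFrame(
    {"Id": ["a", "b"], "ModificationDate": ["2024-03-10", "2024-03-11"]}
)
newest_first = list(sort_by_modification_date(products)["Id"])
```

A guard like this only hides the symptom. If the assumption holds, the more useful question is why "multi" mode collects nothing for a query that "seq" mode answers.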

agrouaze commented 9 months ago

Thanks for the report, I will investigate it.

agrouaze commented 8 months ago

I tried to reproduce the error with the latest version of cdsodatacli with this snippet:

import argparse
import datetime
import logging

import cdsodatacli
import geopandas as gpd
import shapely.wkt  # import the submodule explicitly so shapely.wkt.loads is available

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="highleveltest-debug64")
    parser.add_argument("--verbose", action="store_true", default=False)
    args = parser.parse_args()

    # DEBUG logging with --verbose, INFO otherwise
    fmt = "%(asctime)s %(levelname)s %(filename)s(%(lineno)d) %(message)s"
    logging.basicConfig(
        level=logging.DEBUG if args.verbose else logging.INFO,
        format=fmt,
        datefmt="%d/%m/%Y %H:%M:%S",
        force=True,
    )

    # Single query: whole globe, Sentinel-1 IW SLC 1SDV, five-hour window
    startdate = datetime.datetime(2024, 3, 11, 10)
    stopdate = datetime.datetime(2024, 3, 11, 15)
    pola = "1SDV"
    mode = "IW"
    product = "SLC"
    gdf = gpd.GeoDataFrame(
        {
            "start_datetime": [startdate],
            "end_datetime": [stopdate],
            "geometry": [
                shapely.wkt.loads(
                    "POLYGON ((-180 90, 180 90, 180 -90, -180 -90, -180 90))"
                )
            ],
            "collection": ["SENTINEL-1"],
            "name": [pola],
            "sensormode": [mode],
            "producttype": [product],
            "Attributes": [None],
        }
    )

    # Same fetch_data call as in the report, with mode="multi"
    safes = cdsodatacli.query.fetch_data(
        gdf,
        timedelta_slice=datetime.timedelta(weeks=1),
        min_sea_percent=15,
        mode="multi",
    )
    logging.info("safes: %s", safes)

However, `mode="multi"` and `mode="seq"` give the same result, and no KeyError is raised:
11/03/2024 16:51:07 INFO query.py(377) normalize_gdf processing time:0.005191326141357422s
11/03/2024 16:51:07 INFO query.py(152) Length of input after slicing in time:1
11/03/2024 16:51:07 INFO query.py(458) create_urls() processing time:0.0s
11/03/2024 16:51:07 INFO query.py(160) maximum // queries : 10
100%|██████████| 1/1 [00:01<00:00,  1.09s/it]
11/03/2024 16:51:08 INFO query.py(612) counter: defaultdict(<class 'int'>, {'urls_tested': 1, 'urls_OK': 1, 'product_proposed_by_CDS': 62, 'answer_append': 1})
11/03/2024 16:51:08 INFO query.py(655) nb duplicate removed: 0
11/03/2024 16:51:08 INFO query.py(656) remove_duplicates processing time:0.0 sec
11/03/2024 16:51:08 INFO query.py(170) number of product after removing duplicates: 62
/opt/conda-envs/dev/lib/python3.10/site-packages/geopandas/geoseries.py:645: FutureWarning: the convert_dtype parameter is deprecated and will be removed in a future version.  Do ``ser.astype(object).apply()`` instead if you want ``convert_dtype=False``.
  result = super().apply(func, convert_dtype=convert_dtype, args=args, **kwargs)
11/03/2024 16:51:08 INFO query.py(678) multi_to_poly processing time:0.005716085433959961s
11/03/2024 16:51:08 INFO query.py(174) number of product after removing multipolygon: 62
/home1/datahome/agrouaze/sources/git/cdsodatacli/cdsodatacli/query.py:702: UserWarning: Geometry is in a geographic CRS. Results from 'area' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

  collected_data.geometry.area
/home1/datahome/agrouaze/sources/git/cdsodatacli/cdsodatacli/query.py:703: UserWarning: Geometry is in a geographic CRS. Results from 'area' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

  - collected_data.geometry.intersection(earth).area
/home1/datahome/agrouaze/sources/git/cdsodatacli/cdsodatacli/query.py:705: UserWarning: Geometry is in a geographic CRS. Results from 'area' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

  / collected_data.geometry.area
11/03/2024 16:51:09 INFO query.py(712) sea_percent processing time:0.15232086181640625s
11/03/2024 16:51:09 INFO query.py(183) number of product after adding sea percent: 15
11/03/2024 16:51:09 INFO debug_issue64.py(42) safes:      @odata.mediaContentType                                    Id                                               Name  ... id_original_query                                           geometry sea_percent
61  application/octet-stream  d62daea7-40d3-40df-aa33-670b736a4399  S1A_IW_SLC__1SDV_20240311T125115_20240311T1251...  ...                 0  POLYGON ((-103.46991 17.58982, -103.17579 19.0...   81.295818
20  application/octet-stream  3960adbe-9ade-4d18-b38c-6f4045363073  S1A_IW_SLC__1SDV_20240311T121223_20240311T1212...  ...                 0  POLYGON ((87.79518 20.23797, 90.20159 20.66703...   79.640037
19  application/octet-stream  4b99d6fa-e17a-479a-89e7-49941aa043a2  S1A_IW_SLC__1SDV_20240311T121156_20240311T1212...  ...                 0  POLYGON ((88.14330 18.57141, 90.52374 19.00413...  100.000000
6   application/octet-stream  55d44ee1-ec0f-44b4-a312-210343c46392  S1A_IW_SLC__1SDV_20240311T103344_20240311T1034...  ...                 0  POLYGON ((112.41589 20.56098, 114.80056 20.984...   74.684156
13  application/octet-stream  bd1c867a-7341-462b-a618-43d1f90b0603  S1A_IW_SLC__1SDV_20240311T111017_20240311T1110...  ...                 0  POLYGON ((-77.16768 25.26724, -76.77778 27.066...   85.305510
15  application/octet-stream  9650763b-bdb7-4aa6-8759-a03148e6f853  S1A_IW_SLC__1SDV_20240311T111110_20240311T1111...  ...                 0  POLYGON ((-77.81801 22.21885, -77.47214 23.846...   92.006985
14  application/octet-stream  d7a6cbdb-1217-4747-acbb-014f5c277f3e  S1A_IW_SLC__1SDV_20240311T111044_20240311T1111...  ...                 0  POLYGON ((-77.49950 23.71879, -77.13865 25.401...   83.201545
17  application/octet-stream  82402966-000d-46c0-b034-8c9d369c7024  S1A_IW_SLC__1SDV_20240311T111504_20240311T1115...  ...                 0  POLYGON ((-80.78135 7.92986, -80.41099 9.73319...   48.869423
0   application/octet-stream  c3a9fbf8-0248-44e4-b2c8-1274278a5162  S1A_IW_SLC__1SDV_20240311T102500_20240311T1025...  ...                 0  POLYGON ((119.10242 -11.12103, 121.35142 -10.6...   80.508051
1   application/octet-stream  0449d576-d9ca-4be8-9e9b-5095dbcca0a9  S1A_IW_SLC__1SDV_20240311T102528_20240311T1025...  ...                 0  POLYGON ((118.73262 -9.45508, 120.96767 -8.949...   82.695412
3   application/octet-stream  3fd93bd1-717a-4ba8-a1f0-52591f6c451d  S1A_IW_SLC__1SDV_20240311T102637_20240311T1027...  ...                 0  POLYGON ((117.80157 -5.29244, 120.02602 -4.799...   84.018513
2   application/octet-stream  c5c69103-edbd-4124-be0d-9048c116d0a2  S1A_IW_SLC__1SDV_20240311T102609_20240311T1026...  ...                 0  POLYGON ((118.16367 -6.95977, 120.39546 -6.460...   86.531591
4   application/octet-stream  fed918d3-5262-4090-801b-4dbf6fe5bbd1  S1A_IW_SLC__1SDV_20240311T102702_20240311T1027...  ...                 0  POLYGON ((117.47078 -3.73411, 119.68848 -3.246...   77.722250
18  application/octet-stream  10728162-d7e8-4580-a030-7cae31de8ea3  S1A_IW_SLC__1SDV_20240311T111531_20240311T1115...  ...                 0  POLYGON ((-80.91052 7.26091, -80.74564 8.06288...   64.390392
5   application/octet-stream  aa79151f-5de5-4331-a471-fcc02b5a0f6a  S1A_IW_SLC__1SDV_20240311T102727_20240311T1027...  ...                 0  POLYGON ((117.14751 -2.23347, 119.36343 -1.750...   99.091778

[15 rows x 19 columns]
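
To check that two runs return the same products, and not just the same number of rows, the Id values of both results can be compared. The sketch below is one way to do that. The helper diff_ids and the variable names safes_seq and safes_multi are hypothetical, and the Ids in the example are placeholders, not real catalog entries.

```python
from typing import Dict, Iterable, List


def diff_ids(ids_seq: Iterable[str], ids_multi: Iterable[str]) -> Dict[str, List[str]]:
    """Return the Ids that appear in only one of the two results."""
    seq, multi = set(ids_seq), set(ids_multi)
    return {
        "only_in_seq": sorted(seq - multi),
        "only_in_multi": sorted(multi - seq),
    }


# Placeholder Ids. With real results one would pass the "Id" columns,
# for example diff_ids(safes_seq["Id"], safes_multi["Id"]).
same = diff_ids(["a", "b", "c"], ["c", "a", "b"])
different = diff_ids(["a", "b"], ["b", "d"])
```

Two empty lists mean both modes returned the same set of products, whatever the row order.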