stac-utils / stactools

Command line utility and Python library for STAC
https://stactools.readthedocs.io/

Occasional "hangs" with `FsspecStacIO` #457

Open TomAugspurger opened 1 year ago

TomAugspurger commented 1 year ago

Describe the bug

Dumping this less than complete bug report here. I'll try to fill in more details later.

We're using FsspecStacIO via stactools.sentinel2 and are observing regular "hangs". Here are the logs from the process creating the STAC items for a bunch of scenes:

[INFO] 2023-08-15 21:55:48,346 - (000.06%) [0.78s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/VC/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SVC_20230508T135506.SAFE/manifest.safe (1 of 1684)
[INFO] 2023-08-15 21:55:48,347 - Created item
[INFO] 2023-08-15 21:55:49,103 - (000.12%) [0.69s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/WB/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SWB_20230508T133704.SAFE/manifest.safe (2 of 1684)
[INFO] 2023-08-15 21:55:49,103 - Created item
[INFO] 2023-08-15 22:00:50,805 - (000.18%) [301.67s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/XB/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SXB_20230508T133650.SAFE/manifest.safe (3 of 1684)
[INFO] 2023-08-15 22:00:50,805 - Created item
[INFO] 2023-08-15 22:00:51,453 - (000.24%) [0.62s]  - blob://sentinel2l2a01/sentinel2-l2/01/V/CC/2023/05/07/S2A_MSIL2A_20230507T231551_N0509_R087_T01VCC_20230508T070652.SAFE/manifest.safe (4 of 1684)
[INFO] 2023-08-15 22:00:51,453 - Created item

The ~300s item creation time is suspiciously close to the default 300s timeout of aiohttp (the HTTP library used internally by fsspec) plus the normal item creation time. I haven't been able to debug exactly what the issue is yet.
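To pick the stalled items out of a longer log, the elapsed-time field in square brackets can be parsed and compared against a threshold. This is a minimal sketch that assumes every log line follows the layout shown above; `find_slow_items` and the 60 second threshold are illustrative choices, not part of stactools.

```python
import re

# Matches the "[301.67s]" elapsed-time field and the href that follows it.
# The layout is assumed from the log lines quoted in this report.
LINE_RE = re.compile(r"\[(?P<seconds>\d+(?:\.\d+)?)s\]\s+-\s+(?P<href>\S+)")

# Two lines copied from the logs above: one normal item and one stalled item.
SAMPLE_LOG = [
    "[INFO] 2023-08-15 21:55:49,103 - (000.12%) [0.69s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/WB/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SWB_20230508T133704.SAFE/manifest.safe (2 of 1684)",
    "[INFO] 2023-08-15 22:00:50,805 - (000.18%) [301.67s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/XB/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SXB_20230508T133650.SAFE/manifest.safe (3 of 1684)",
]


def find_slow_items(log_lines, threshold_s=60.0):
    """Return (seconds, href) pairs for items that took at least threshold_s."""
    slow = []
    for line in log_lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # e.g. the "Created item" lines carry no timing field
        seconds = float(match["seconds"])
        if seconds >= threshold_s:
            slow.append((seconds, match["href"]))
    return slow


slow_items = find_slow_items(SAMPLE_LOG)
```

Subtracting the typical item time (under a second here) from each flagged duration shows how close the stall is to 300 seconds.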

To reproduce

Something like the following occasionally reproduces it, but see below for caveats:

```
import pystac
from stactools.sentinel2 import stac
import planetary_computer
from stactools.core.utils.antimeridian import Strategy

lines = [
    '47/D/PG/2023/01/01/S2B_MSIL2A_20230101T023549_N0400_R060_T47DPG_20230101T153257.SAFE',
    '44/T/QK/2023/01/01/S2B_MSIL2A_20230101T052229_N0400_R062_T44TQK_20230101T183526.SAFE',
    '47/P/QQ/2023/01/01/S2B_MSIL2A_20230101T034139_N0400_R061_T47PQQ_20230101T180408.SAFE',
    '49/S/BB/2023/01/01/S2B_MSIL2A_20230101T034139_N0400_R061_T49SBB_20230101T165453.SAFE',
    '28/H/EC/2023/01/01/S2B_MSIL2A_20230101T105029_N0400_R065_T28HEC_20230101T211552.SAFE',
    '28/H/FC/2023/01/01/S2B_MSIL2A_20230101T105029_N0400_R065_T28HFC_20230102T002315.SAFE',
    '28/H/GC/2023/01/01/S2B_MSIL2A_20230101T105029_N0400_R065_T28HGC_20230102T002146.SAFE',
    '31/S/GV/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T31SGV_20230101T224703.SAFE',
    '29/H/KT/2023/01/01/S2B_MSIL2A_20230101T105029_N0400_R065_T29HKT_20230101T220421.SAFE',
    '29/S/QS/2023/01/01/S2A_MSIL2A_20230101T111451_N0400_R137_T29SQS_20230102T001441.SAFE',
    '32/S/KD/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T32SKD_20230101T231854.SAFE',
    '32/S/KE/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T32SKE_20230101T215858.SAFE',
    '32/S/LD/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T32SLD_20230101T214510.SAFE',
    '32/S/LE/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T32SLE_20230101T235039.SAFE',
    '33/P/UR/2023/01/01/S2A_MSIL2A_20230101T093411_N0400_R136_T33PUR_20230102T001951.SAFE',
    '32/P/QB/2023/01/01/S2A_MSIL2A_20230101T093411_N0400_R136_T32PQB_20230102T000130.SAFE',
    '13/T/DG/2023/01/01/S2A_MSIL2A_20230101T175741_N0400_R141_T13TDG_20230102T042809.SAFE',
    '13/T/DH/2023/01/01/S2A_MSIL2A_20230101T175741_N0400_R141_T13TDH_20230102T045803.SAFE',
    '13/T/EG/2023/01/01/S2A_MSIL2A_20230101T175741_N0400_R141_T13TEG_20230102T062330.SAFE',
    '13/T/EH/2023/01/01/S2A_MSIL2A_20230101T175741_N0400_R141_T13TEH_20230102T045007.SAFE'
]

for line in lines:
    print(line)
    granule_href = f"https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/{line}"
    item: pystac.Item = stac.create_item(
        granule_href=granule_href,
        read_href_modifier=planetary_computer.sign,
        antimeridian_strategy=Strategy.NORMALIZE,
        coordinate_precision=7,
    )
```

Unfortunately, I've only been able to reproduce it in an environment where the process running the command is in the same Azure region as the data (West Europe in this case). I haven't been able to reproduce it in an environment where I can actually inspect the process to see what's going on.

When it does hang, here's the traceback:

```pytb
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[8], line 32
     30 print(line)
     31 granule_href = f"https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/{line}"
---> 32 item: pystac.Item = stac.create_item(
     33     granule_href=granule_href,
     34     read_href_modifier=planetary_computer.sign,
     35     antimeridian_strategy=Strategy.NORMALIZE,
     36     coordinate_precision=7,
     37 )

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/sentinel2/stac.py:57, in create_item(granule_href, additional_providers, read_href_modifier, antimeridian_strategy, coordinate_precision)
     53 safe_manifest = SafeManifest(granule_href, read_href_modifier)
     55 product_metadata = ProductMetadata(safe_manifest.product_metadata_href,
     56                                    read_href_modifier)
---> 57 granule_metadata = GranuleMetadata(safe_manifest.granule_metadata_href,
     58                                    read_href_modifier)
     60 item = pystac.Item(
     61     id=product_metadata.scene_id,
     62     geometry=product_metadata.geometry,
   (...)
     65     properties={},
     66 )
     68 # --Common metadata--

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/sentinel2/granule_metadata.py:22, in GranuleMetadata.__init__(self, href, read_href_modifier)
     17 def __init__(self,
     18              href,
     19              read_href_modifier: Optional[ReadHrefModifier] = None):
     20     self.href = href
---> 22     self._root = XmlElement.from_file(href, read_href_modifier)
     24     geocoding_node = self._root.find('n1:Geometric_Info/Tile_Geocoding')
     25     if geocoding_node is None:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/core/io/xml.py:74, in XmlElement.from_file(cls, href, read_href_modifier)
     70 @classmethod
     71 def from_file(
     72     cls, href: str, read_href_modifier: Optional[ReadHrefModifier] = None
     73 ) -> "XmlElement":
---> 74     text = read_text(href, read_href_modifier)
     75     return cls(etree.fromstring(bytes(text, encoding="utf-8")))

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/core/io/__init__.py:20, in read_text(href, read_href_modifier)
     18     return StacIO.default().read_text(href)
     19 else:
---> 20     return StacIO.default().read_text(read_href_modifier(href))

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pystac/stac_io.py:279, in DefaultStacIO.read_text(self, source, *_, **__)
    274 """A concrete implementation of :meth:`StacIO.read_text
    275 <pystac.StacIO.read_text>`. Converts the ``source`` argument to a string (if it
    276 is not already) and delegates to :meth:`DefaultStacIO.read_text_from_href` for
    277 opening and reading the file."""
    278 href = str(os.fspath(source))
--> 279 return self.read_text_from_href(href)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/core/io/__init__.py:25, in FsspecStacIO.read_text_from_href(self, href, *args, **kwargs)
     24 def read_text_from_href(self, href: str, *args: Any, **kwargs: Any) -> str:
---> 25     with fsspec.open(href, "r") as f:
     26         s = f.read()
     27         if isinstance(s, str):

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/core.py:102, in OpenFile.__enter__(self)
     99 def __enter__(self):
    100     mode = self.mode.replace("t", "").replace("b", "") + "b"
--> 102     f = self.fs.open(self.path, mode=mode)
    104     self.fobjects = [f]
    106     if self.compression is not None:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/spec.py:1241, in AbstractFileSystem.open(self, path, mode, block_size, cache_options, compression, **kwargs)
   1239 else:
   1240     ac = kwargs.pop("autocommit", not self._intrans)
-> 1241     f = self._open(
   1242         path,
   1243         mode=mode,
   1244         block_size=block_size,
   1245         autocommit=ac,
   1246         cache_options=cache_options,
   1247         **kwargs,
   1248     )
   1249     if compression is not None:
   1250         from fsspec.compression import compr

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/implementations/http.py:356, in HTTPFileSystem._open(self, path, mode, block_size, autocommit, cache_type, cache_options, size, **kwargs)
    354 kw["asynchronous"] = self.asynchronous
    355 kw.update(kwargs)
--> 356 size = size or self.info(path, **kwargs)["size"]
    357 session = sync(self.loop, self.set_session)
    358 if block_size and size:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:121, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    118 @functools.wraps(func)
    119 def wrapper(*args, **kwargs):
    120     self = obj or args[0]
--> 121     return sync(self.loop, func, *args, **kwargs)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:94, in sync(loop, func, timeout, *args, **kwargs)
     91 asyncio.run_coroutine_threadsafe(_runner(event, coro, result, timeout), loop)
     92 while True:
     93     # this loops allows thread to get interrupted
---> 94     if event.wait(1):
     95         break
     96     if timeout is not None:

File /srv/conda/envs/notebook/lib/python3.11/threading.py:622, in Event.wait(self, timeout)
    620 signaled = self._flag
    621 if not signaled:
--> 622     signaled = self._cond.wait(timeout)
    623 return signaled

File /srv/conda/envs/notebook/lib/python3.11/threading.py:324, in Condition.wait(self, timeout)
    322 else:
    323     if timeout > 0:
--> 324         gotit = waiter.acquire(True, timeout)
    325     else:
    326         gotit = waiter.acquire(False)

KeyboardInterrupt:
```
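The last frames of the traceback show the calling thread polling a `threading.Event` in one-second slices while the request itself runs on a separate event loop. The sketch below is a simplified, standard-library stand-in for that pattern, not fsspec's actual code; `start_background_loop` and `run_sync` are hypothetical names. It only illustrates why an interrupt lands in `Event.wait`, and how the caller blocks until the coroutine finishes or a timeout fires.

```python
import asyncio
import threading


def start_background_loop():
    """Start an asyncio event loop on a daemon thread and return the loop."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


def run_sync(loop, coro, timeout=None):
    """Block the calling thread until coro completes on the background loop."""
    event = threading.Event()
    result = {}

    async def runner():
        try:
            # With timeout=None this waits for as long as the coroutine takes.
            result["value"] = await asyncio.wait_for(coro, timeout)
        except BaseException as exc:
            result["error"] = exc
        finally:
            event.set()

    asyncio.run_coroutine_threadsafe(runner(), loop)
    # Poll in short slices so Ctrl-C can interrupt the wait; a stalled
    # coroutine keeps the caller in this loop.
    while not event.wait(1):
        pass
    if "error" in result:
        raise result["error"]
    return result["value"]
```

In this simplified model, a coroutine that never completes and has no timeout keeps the caller waiting until something on the event-loop side gives up, which would be consistent with the ~300s stalls if the HTTP client's own timeout is what ends the wait.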

Here are the versions of the relevant packages

aiohttp                       3.8.4
fsspec                        2023.6.0
stactools                     0.3.1
stactools-sentinel2           0.2.1

Workaround

At least for our use case, we can disable the use of fsspec by running

pystac.StacIO.set_default(pystac.stac_io.DefaultStacIO)

after stactools.sentinel2 is imported. That workaround might help out others.
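Where fsspec can't be disabled, one possible mitigation (not something tested in this report) is to bound each call with a deadline and retry it. The helper below is a hypothetical, standard-library sketch; `call_with_deadline` is not part of pystac, stactools, or fsspec. Note that Python can't kill the worker thread, so a stalled call keeps running in the background until it returns on its own.

```python
import concurrent.futures
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_deadline(func: Callable[[], T], deadline_s: float, retries: int = 2) -> T:
    """Run func() in a worker thread, retrying if it exceeds deadline_s seconds."""
    for _ in range(retries + 1):
        # A fresh executor per attempt, so a stalled worker does not block the retry.
        executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = executor.submit(func)
        try:
            return future.result(timeout=deadline_s)
        except concurrent.futures.TimeoutError:
            pass  # the attempt is abandoned; its thread finishes on its own later
        finally:
            # Do not wait for a stalled worker before moving on.
            executor.shutdown(wait=False)
    raise TimeoutError(f"no result within {deadline_s}s after {retries + 1} attempts")
```

In the reproduction loop above, this could wrap the item creation as `call_with_deadline(lambda: stac.create_item(...), deadline_s=60)`, where the 60 second deadline is an arbitrary choice well above the normal sub-second item time.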