NeurodataWithoutBorders / nwb_benchmarks

Benchmarking for NWB-related operations.
https://nwb-benchmarks.readthedocs.io/en/latest/

Add test configurations for Zarr tests #52

Open oruebel opened 6 months ago

oruebel commented 6 months ago

Add test configurations for:

CodyCBakerPhD commented 6 months ago

@oruebel I see

             Traceback (most recent call last):
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\s3fs\core.py", line 113, in _error_wrapper
                 return await func(*args, **kwargs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\aiobotocore\client.py", line 408, in _make_api_call
                 raise error_class(parsed_response, operation_name)
             botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.

             The above exception was the direct cause of the following exception:

             Traceback (most recent call last):
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\fsspec\mapping.py", line 155, in __getitem__
                 result = self.fs.cat(k)
                          ^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\fsspec\asyn.py", line 118, in wrapper
                 return sync(self.loop, func, *args, **kwargs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\fsspec\asyn.py", line 103, in sync
                 raise return_result
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\fsspec\asyn.py", line 56, in _runner
                 result[0] = await coro
                             ^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\fsspec\asyn.py", line 461, in _cat
                 raise ex
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\fsspec\asyn.py", line 245, in _run_coro
                 return await asyncio.wait_for(coro, timeout=timeout), i
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\asyncio\tasks.py", line 452, in wait_for
                 return await fut
                        ^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\s3fs\core.py", line 1125, in _cat_file
                 return await _error_wrapper(_call_and_read, retries=self.retries)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\s3fs\core.py", line 142, in _error_wrapper
                 raise err
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\s3fs\core.py", line 113, in _error_wrapper
                 return await func(*args, **kwargs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\s3fs\core.py", line 1112, in _call_and_read
                 resp = await self._call_s3(
                        ^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\s3fs\core.py", line 362, in _call_s3
                 return await _error_wrapper(
                        ^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\s3fs\core.py", line 142, in _error_wrapper
                 raise err
             FileNotFoundError: The specified key does not exist.

             During handling of the above exception, another exception occurred:

             Traceback (most recent call last):
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\zarr\storage.py", line 1441, in __getitem__
                 return self.map[key]
                        ~~~~~~~~^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\fsspec\mapping.py", line 159, in __getitem__
                 raise KeyError(key)
             KeyError: '.zmetadata'

             The above exception was the direct cause of the following exception:

             Traceback (most recent call last):
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\asv\benchmark.py", line 68, in <module>
                 main()
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\asv\benchmark.py", line 60, in main
                 commands[mode](args)
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\asv_runner\run.py", line 72, in _run
                 result = benchmark.do_run()
                          ^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\asv_runner\benchmarks\_base.py", line 661, in do_run
                 return self.run(*self._current_params)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\asv_runner\benchmarks\time.py", line 165, in run
                 samples, number = self.benchmark_timing(
                                   ^^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\asv_runner\benchmarks\time.py", line 258, in benchmark_timing
                 timing = timer.timeit(number)
                          ^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\timeit.py", line 180, in timeit
                 timing = self.inner(it, self.timer)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "<timeit-src>", line 6, in inner
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\asv_runner\benchmarks\time.py", line 90, in func
                 self.func(*param)
               File "D:\GitHub\nwb_benchmarks\src\nwb_benchmarks\benchmarks\time_remote_file_reading.py", line 234, in time_read_zarr
                 self.zarr_file = read_zarr(s3_url=s3_url, open_without_consolidated_metadata=False)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "D:\GitHub\nwb_benchmarks\src\nwb_benchmarks\core\_streaming.py", line 240, in read_zarr
                 zarrfile = zarr.open_consolidated(store=s3_url, mode="r", storage_options=dict(anon=True))
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\zarr\convenience.py", line 1335, in open_consolidated
                 meta_store = ConsolidatedStoreClass(store, metadata_key=metadata_key)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\zarr\storage.py", line 2974, in __init__
                 meta = json_loads(self.store[metadata_key])
                                   ~~~~~~~~~~^^^^^^^^^^^^^^
               File "C:\Users\theac\anaconda3\envs\nwb_benchmarks\Lib\site-packages\zarr\storage.py", line 1443, in __getitem__
                 raise KeyError(key) from e
             KeyError: '.zmetadata'

when attempting to run the current benchmarks in a fresh environment - perhaps a certain version of hdmf-zarr is needed? Or some other dependency needs to be more precisely pinned in the environment file?

oruebel commented 6 months ago

> when attempting to run the current benchmarks in a fresh environment - perhaps a certain version of hdmf-zarr is needed? Or some other dependency needs to be more precisely pinned in the environment file?

The .zmetadata error is expected. It is due to a change we made to the read_zarr method so that it fails if we attempt to read with consolidated metadata when none is present. We should only use Zarr files that have consolidated metadata for the test suite, since we can explicitly ignore it if we don't want to use it. However, I didn't know of a Zarr file with consolidated metadata on DANDI when I set up the tests. Once we have updated the parametrization of the tests to use the new Zarr files, this error should go away.
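
To illustrate the behavior described above, here is a minimal sketch of the check, not the actual read_zarr implementation. The helper name `check_consolidated` and the dict-based stores are hypothetical stand-ins. The sketch relies only on two things: Zarr v2 stores behave like mappings, and consolidated metadata lives under the `.zmetadata` key by default.

```python
from collections.abc import Mapping

# Default key under which Zarr v2 writes consolidated metadata.
CONSOLIDATED_KEY = ".zmetadata"


def check_consolidated(store: Mapping, open_without_consolidated_metadata: bool = False) -> bool:
    """Hypothetical helper: decide whether to open a store via consolidated metadata.

    Returns False when the caller explicitly opts out of consolidated metadata.
    Raises KeyError('.zmetadata') when consolidated metadata is requested but the
    store has none, mirroring the error in the traceback above.
    """
    if open_without_consolidated_metadata:
        return False
    if CONSOLIDATED_KEY not in store:
        raise KeyError(CONSOLIDATED_KEY)
    return True


# Plain dicts stand in for a remote store here (assumption for illustration only).
plain_store = {".zgroup": b'{"zarr_format": 2}'}
consolidated_store = {**plain_store, CONSOLIDATED_KEY: b"{}"}

print(check_consolidated(consolidated_store))  # consolidated metadata present
print(check_consolidated(plain_store, open_without_consolidated_metadata=True))  # explicitly ignored
try:
    check_consolidated(plain_store)  # requested but missing
except KeyError as err:
    print("missing key:", err)
```

With files that always carry consolidated metadata, both benchmark variants (with and without it) can run against the same file, which is why the test suite should be parametrized only with such files.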