Open peterdudfield opened 1 week ago
I'll take this up as well, I think it's correlated to https://github.com/openclimatefix/ocf_datapipes/issues/337
Apologies if this sounds trivial, but about the implementation, are we planning to log it or raise a ValueError?
Raise an error please
I was trying to trigger positive and negative test cases, by cloning and replacing nwp data. but there seems to be something at large with the data that i'm missing to figure out.
Reference:
for i in store:
zarr.copy(store[i], store_clone, name=i)
store_clone['UKV'][0, 0, 0]=0
KeyError: ".zmetadata\nThis exception is thrown by __iter__ of OpenNWPIterDataPipe(check_for_zeros=False, check_physical_limits=False, zarr_path='tests/data/nwp_data/test_with_zeros.zarr')"
can you help me point to the right direction, please?
That error can sometimes happen if the file you load isnt a zarr. Can you check it manually that is had the .zmetadata in tests/data/nwp_data/test_with_zeros.zarr
Yep, that worked :D
Apparently the way I was coloning the zarr, didnt carry over the metadata at .zmetadata
Also, in reference to #337, can you help me understand the physical limits for nwp.variable
UKV
?
to solve the above, I've manually copied .zmetadata
and .zarray
which works as i'm manually testing the test functions at test\load\nwp\test_load_nwp.py
:
def test_check_for_zeros():
#positive test case
nwp_datapipe = OpenNWP(
zarr_path="tests/data/nwp_data/test_with_zeros_n_limits.zarrr",
check_for_zeros=True)
with pytest.raises(ValueError): # checks for Error raised if NWP DataArray contains zeros
metadata = next(iter(nwp_datapipe))
#negative test case
nwp_datapipe = OpenNWP(
zarr_path="tests/data/nwp_data/test.zarr",
check_for_zeros=True)
metadata = next(iter(nwp_datapipe))
assert metadata is not None
there is something odd happening as I'm trying to run pytest
on ocf_datapipeline
. the files get automatically deleted, which ends up throwing the same error :/
KeyError: ".zmetadata\nThis exception is thrown by __iter__ of OpenNWPIterDataPipe(check_for_zeros=False, check_physical_limits=False, zarr_path='tests/data/nwp_data/test_with_zeros.zarr')"
Which files get delete?
Also do you mean zarr_path="tests/data/nwp_data/test_with_zeros_n_limits.zarr"
instead of zarr_path="tests/data/nwp_data/test_with_zeros_n_limits.zarrr"
?
tests/data/nwp_data/test_with_zeros_n_limits.zarr/.zmetadata
gets deleted as I'm running pytest
oh yes, my bad, it was a silly typo
did the typo fix it? Or is it still getting deleted?
did the typo fix it? Or is it still getting deleted?
Can you work out where the file gets deleted?
The typo dosent exist in the real code.
I'm trying to figure the same. There seeem to be some correlation with pytest
How odd, can you push code to a draft PR and I might be able to have a look at it?
sure thing I'll do that
If it helps, I've used this code to produce test_with_zeros_n_limits.zarr
:
import zarr
import shutil
import numpy as np
original_store_path = 'tests/data/nwp_data/test.zarr'
original_store = zarr.open(original_store_path, mode='r')
new_store_path = 'tests/data/nwp_data/test_with_zeros_n_limits.zarr'
shutil.rmtree(new_store_path, ignore_errors=True)
with zarr.open(new_store_path, mode='w') as new_store:
for item in original_store:
zarr.copy(original_store[item], new_store, name=item)
new_store['UKV'][0, 0, 0,0]=0
new_store['UKV'][0, 0, 0,1]=np.random.uniform(240, 350, size=(548,))
This script doesn't get run with pytest does it?
It should just be run once yea?
The script is supposed to run once, to generate the data. Do you think we'd be better off if it's run as part of test_load_nwp.py?
Perhaps its worth a try. And that would mean less test data is committed to the repo
alrighty, I'll make the necessary changes to the draft PR
Detailed Description
Check for zeros when loading the data. This should only be used when running inference
Context
Possible Implementation