Closed ahmadtourei closed 1 year ago
Darn, can you share one of those files with me so I can see what's going on? Or is it the same as the last one you sent over?
Just sent. Thanks!
Index file is created. However, I got a "CoordDataError" on getting a patch out of the spool:
---------------------------------------------------------------------------
CoordDataError Traceback (most recent call last)
Cell In[7], line 2
1 # get sampling rate, channel spacing, and gauge length from the first patch
----> 2 patch_0 = sp[0]
3 gauge_length = patch_0.attrs['gauge_length']
4 print("Gauge length = ", gauge_length)
File [~/coding/dascore/dascore/core/spool.py:176](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/~/coding/dascore/dascore/core/spool.py:176), in DataFrameSpool.__getitem__(self, item)
175 def __getitem__(self, item):
--> 176 out = self._get_patches_from_index(item)
177 # a single index was used, should return a single patch
178 if not isinstance(item, slice):
File [~/coding/dascore/dascore/core/spool.py:214](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/~/coding/dascore/dascore/core/spool.py:214), in DataFrameSpool._get_patches_from_index(self, df_ind)
212 assert not df1.empty
213 joined = df1.join(source.drop(columns=df1.columns, errors="ignore"))
--> 214 return self._patch_from_instruction_df(joined)
File [~/coding/dascore/dascore/core/spool.py:224](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/~/coding/dascore/dascore/core/spool.py:224), in DataFrameSpool._patch_from_instruction_df(self, joined)
221 for patch_kwargs in df_dict_list:
222 # convert kwargs to format understood by parser/patch.select
223 kwargs = _convert_min_max_in_kwargs(patch_kwargs, joined)
--> 224 patch = self._load_patch(kwargs)
225 # apply any trimming needed on patch
226 select_kwargs = {
227 i: v
228 for i, v in kwargs.items()
229 if i in patch.dims or i in patch.coords.coord_map
230 }
File [~/coding/dascore/dascore/clients/dirspool.py:134](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/~/coding/dascore/dascore/clients/dirspool.py:134), in DirectorySpool._load_patch(self, kwargs)
132 final_kwargs = dict(kwargs)
133 final_kwargs.update(self._select_kwargs)
--> 134 patch = dc.read(**final_kwargs)[0]
135 return patch
File [~/coding/dascore/dascore/io/core.py:507](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/~/coding/dascore/dascore/io/core.py:507), in read(path, file_format, file_version, time, distance, **kwargs)
505 required_type = formatter.read._required_type
506 path = man.get_resource(required_type)
--> 507 out = formatter.read(
508 path,
509 file_version=file_version,
510 time=time,
511 distance=distance,
...
636 )
--> 637 raise CoordDataError(msg)
638 return data
CoordDataError: Data array has a shape of (5099, 16384) which doesnt match the coordinate manager shape of (16384, 5099).
Thanks for finding this! It turns out some ProdML files store their data with time/distance dimension ordering and others with distance/time. We had assumed the ordering would always be the same.
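For readers following along, here is a minimal sketch of the kind of check that handles both orderings. It is not the actual DASCore fix: `shape_of` and `align_to_dims` are hypothetical helpers written for illustration, and plain nested lists stand in for the real data arrays so the example needs no third-party packages.

```python
def shape_of(rows):
    """Return (n_rows, n_cols) for a rectangular list of lists."""
    return (len(rows), len(rows[0]) if rows else 0)


def align_to_dims(rows, expected_shape):
    """Return data oriented to match expected_shape, transposing if needed.

    Hypothetical helper for illustration only; not part of DASCore.
    """
    actual = shape_of(rows)
    if actual == expected_shape:
        # Already in the ordering the coordinates expect.
        return rows
    if actual == expected_shape[::-1]:
        # Axes are swapped (e.g. distance/time instead of time/distance).
        return [list(col) for col in zip(*rows)]
    raise ValueError(
        f"Data shape {actual} is incompatible with expected shape {expected_shape}"
    )


# A tiny stand-in for the (5099, 16384) vs (16384, 5099) case in the traceback.
data = [[1, 2, 3], [4, 5, 6]]          # shape (2, 3)
aligned = align_to_dims(data, (3, 2))  # coordinates expect (3, 2)
```

Note that shape alone cannot tell the two orderings apart when both dimensions have the same length, so a real reader would presumably also need to consult the dimension metadata stored in the file.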
Hey @ahmadtourei,
Should be fixed now. Please take it for a spin and let me know if not.
The index file is not created for these PRODML v2.0 files, and the error below is raised:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[2], line 9
6 # get the spool of data form the defined data path (will index patches for the first time)
7 sp = dc.spool(data_path)
----> 9 print(sp)
11 # print the contents of first 5 patches
12 # content_df = sp.get_contents()
13 # content_df.head()
File [~/coding/dascore/dascore/core/spool.py:65](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/2_test_latest_master_branch/~/coding/dascore/dascore/core/spool.py:65), in BaseSpool.__str__(self)
64 def __str__(self):
---> 65 return str(self.__rich__())
File [~/coding/dascore/dascore/clients/dirspool.py:67](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/2_test_latest_master_branch/~/coding/dascore/dascore/clients/dirspool.py:67), in DirectorySpool.__rich__(self)
65 def __rich__(self):
66 """Augment rich string directory spool stuff."""
---> 67 base = super().__rich__()
68 path = self.indexer.path
69 kwargs = self._select_kwargs
File [~/coding/dascore/dascore/core/spool.py:59](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/2_test_latest_master_branch/~/coding/dascore/dascore/core/spool.py:59), in BaseSpool.__rich__(self)
57 text += Text(self.__class__.__name__, style=self._rich_style)
58 text += Text(" 🧵 ")
---> 59 patch_len = len(self)
60 text += Text(f"({patch_len:d}")
61 text += Text(" Patches)") if patch_len != 1 else Text(" Patch)")
File [~/coding/dascore/dascore/core/spool.py:326](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/2_test_latest_master_branch/~/coding/dascore/dascore/core/spool.py:326), in DataFrameSpool.__len__(self)
325 def __len__(self):
--> 326 return len(self._df)
File [~/coding/dascore/dascore/utils/misc.py:263](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/2_test_latest_master_branch/~/coding/dascore/dascore/utils/misc.py:263), in CacheDescriptor.__get__(self, instance, owner)
261 if self._name not in cache:
262 func = getattr(instance, self._func_name)
--> 263 out = func(*self._args, **self._kwargs)
264 cache[self._name] = out
265 return cache[self._name]
File [~/coding/dascore/dascore/clients/dirspool.py:77](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/ahmadtourei/coding/final_codes_to_upload/2_test_latest_master_branch/~/coding/dascore/dascore/clients/dirspool.py:77), in DirectorySpool._get_df(self)
74 def _get_df(self):
75 """Get the dataframe of current contents."""
76 out = adjust_segments(
---> 77 self._source_df, ignore_bad_kwargs=True, **self._select_kwargs
78 )
...
288 # takes care of other types as well as for example NROWS for
289 # Tables and EXTDIM for EArrays
290 format_version = self._v__format_version
AttributeError: Attribute 'RawDescription' does not exist in node: '/Acquisition/Raw[0]'
So is this regarding the test file named "DOSS_20220723T111500_430400Z.hdf5"? When I run this code in the same directory as that file:
import dascore as dc
spool = dc.spool(".").update()
patch = spool[0]
print(patch)
it works fine. Are you also on the current master branch? Perhaps there are other files you are using?
No, this is regarding the file "BM73-22_500Hz_UTC_20230718_170000.h5". I'm on the master branch.
Description
The index file is not created in the data directory, so I could not get the patches out of the spool. Data format: PRODML v2.0.
Please note that no error occurred after getting the spool, and indexing reached 100%.
Example
Expected behavior
Versions