yt-project / yt

Main yt repository
http://yt-project.org

[BUG] Better logic for detection of particle handle for checkpoint files #5019

Closed jzuhone closed 1 month ago

jzuhone commented 1 month ago

PR Summary

yt does not support particle data from FLASH versions before 3, but it does support reading FLASH 2.x datasets.

The current logic for finding the FLASH particle file that corresponds to a FLASH plotfile checks the filename for the string "hdf5_plt" and replaces it with "hdf5_part". It then creates a file handler for the particle file:

https://github.com/yt-project/yt/blob/7ebe857f9c38215976ca058142a78a107d39e1e8/yt/frontends/flash/data_structures.py#L192-L206

This works for plot and particle files that match, but it misses checkpoint files, whose names contain the string "hdf5_chk". The miss is silent: the replacement in line 196 above leaves the filename unchanged, so a second, separate file handler is opened for the same input file.
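The silent fall-through can be seen with plain string replacement. This is a minimal sketch of the substitution as described above, not the yt source; the function name `derive_particle_filename` and the plotfile name are hypothetical.

```python
def derive_particle_filename(filename: str) -> str:
    """Hypothetical helper: apply the substitution described above."""
    # str.replace returns the input unchanged when the substring is absent,
    # so a checkpoint filename passes through without any error or warning.
    return filename.replace("hdf5_plt", "hdf5_part")


plot_name = "run_hdf5_plt_cnt_0001"  # made-up plotfile name
chk_name = "co2djj_hdf5_chk_2396"    # checkpoint name from the report

print(derive_particle_filename(plot_name))  # run_hdf5_part_cnt_0001
print(derive_particle_filename(chk_name))   # co2djj_hdf5_chk_2396
```

Because the second call returns its input, any code that opens the "particle file" afterwards simply reopens the checkpoint file.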

Shortly afterwards, the two file handles are compared. They compare as unequal even though they refer to the same file, because that file has been opened twice, so the time check runs:

https://github.com/yt-project/yt/blob/7ebe857f9c38215976ca058142a78a107d39e1e8/yt/frontends/flash/data_structures.py#L208-L219
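Why two handles to one file compare as unequal can be illustrated with a stand-in class. `FileHandle` below is hypothetical and is not yt's file handler; the sketch assumes only that the handler does not define its own equality, in which case Python compares objects by identity.

```python
import os
import tempfile


class FileHandle:
    """Hypothetical stand-in for a file-handler wrapper (not yt's class)."""

    def __init__(self, filename):
        self.filename = filename
        self.handle = open(filename, "rb")

    def close(self):
        self.handle.close()


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "co2djj_hdf5_chk_2396")
    with open(path, "wb") as f:
        f.write(b"placeholder")

    first = FileHandle(path)   # plays the role of the main handle
    second = FileHandle(path)  # plays the role of the "particle" handle

    same_path = first.filename == second.filename
    # No __eq__ is defined, so != falls back to object identity.
    handles_differ = first != second

    first.close()
    second.close()

print(same_path, handles_differ)  # True True
```

Under that assumption, an inequality test on the handles cannot tell "a different particle file" apart from "the same file opened again".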

For FLASH 3 files this causes no problem beyond the extra, unnecessary file handle, but for FLASH 2.5 files loading fails here because they do not have the "real scalars" HDF5 dataset:

import yt
ds = yt.load("co2djj_hdf5_chk_2396")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], line 1
----> 1 ds = yt.load("co2djj_hdf5_chk_2396")

File ~/Source/yt/yt/_maintenance/deprecation.py:69, in future_positional_only.<locals>.outer.<locals>.inner(*args, **kwargs)
     60     value = kwargs[name]
     61     issue_deprecation_warning(
     62         f"Using the {name!r} argument as keyword (on position {no}) "
     63         "is deprecated. "
   (...)
     67         **depr_kwargs,
     68     )
---> 69 return func(*args, **kwargs)

File ~/Source/yt/yt/loaders.py:149, in load(fn, hint, *args, **kwargs)
    141     if missing := cls._missing_load_requirements():
    142         warnings.warn(
    143             f"This dataset appears to be of type {cls.__name__}, "
    144             "but the following requirements are currently missing: "
   (...)
    147             stacklevel=3,
    148         )
--> 149     return cls(fn, *args, **kwargs)
    151 if len(candidates) > 1:
    152     raise YTAmbiguousDataType(_input_fn, candidates)

File ~/Source/yt/yt/frontends/flash/data_structures.py:210, in FLASHDataset.__init__(self, filename, dataset_type, storage_filename, particle_filename, units_override, unit_system, default_species_fields)
    208 # Check if the particle file has the same time
    209 if self._particle_handle != self._handle:
--> 210     part_time = self._particle_handle.handle.get("real scalars")[0][1]
    211     plot_time = self._handle.handle.get("real scalars")[0][1]
    212     if not np.isclose(part_time, plot_time):

TypeError: 'NoneType' object is not subscriptable

This PR addresses the problem by 1) making sure that files with "hdf5_chk" in the filename are properly handled and 2) checking explicitly for files without "real scalars".
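The two changes could look roughly like the sketch below. It is not the PR's diff: `guess_particle_filename` and `times_match` are hypothetical names, plain dicts stand in for the HDF5 handles (their `.get` returns `None` for a missing key, which is what the traceback above shows), and reusing the main file for checkpoints is an assumption about what "properly handled" means.

```python
import math


def guess_particle_filename(filename):
    """Return a candidate particle filename, or None if no separate file applies."""
    if "hdf5_plt" in filename:
        return filename.replace("hdf5_plt", "hdf5_part")
    # Assumption: for checkpoint files ("hdf5_chk") and anything unrecognized,
    # the caller reuses the already-open main handle instead of reopening it.
    return None


def times_match(part_store, plot_store):
    """Compare simulation times; return None when they cannot be compared."""
    part_scalars = part_store.get("real scalars")
    plot_scalars = plot_store.get("real scalars")
    # Files without "real scalars" (FLASH 2.x here) skip the comparison
    # instead of subscripting None.
    if part_scalars is None or plot_scalars is None:
        return None
    # math.isclose stands in for the np.isclose call in the traceback;
    # the default tolerances of the two functions differ.
    return math.isclose(part_scalars[0][1], plot_scalars[0][1])


flash3_like = {"real scalars": [(b"time", 1.5)]}  # made-up record layout
flash2_like = {}                                  # no "real scalars" entry

print(guess_particle_filename("co2djj_hdf5_chk_2396"))  # None
print(times_match(flash3_like, flash3_like))            # True
print(times_match(flash2_like, flash2_like))            # None
```

Returning `None` rather than `False` keeps "times differ" separate from "times unavailable", so a caller can warn only in the first case.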

PR Checklist