Closed kyleaoman closed 1 month ago
In what way does it barf? I think that int in this context means just an integer as it's a type hint right? I suppose we could use typing.Integer
Ah, now not sure where I got the np.int64 from, but the following fails because it doesn't like a unyt_quantity:
import numpy as np
from os import path
from velociraptor import load as load_catalogue
from velociraptor.particles import load_groups
import unyt as u
snapnum = 23
base_dir = '/cosma/home/durham/dc-oman1/ColibreTestData/'\
'106e3_104b2_norm_0p3_new_cooling_L006N188/'
velociraptor_filebase = path.join(base_dir, 'halo_{:04d}'.format(snapnum))
snapshot_filename = path.join(base_dir, 'colibre_{:04d}.hdf5'.format(snapnum))
catalogue = load_catalogue(
path.join(base_dir, f'{velociraptor_filebase}.properties')
)
target_mask = np.logical_and.reduce((
catalogue.ids.hosthaloid == -1,
catalogue.velocities.vmax > 50 * u.km / u.s,
catalogue.velocities.vmax < 100 * u.km / u.s
))
target_halo_ids = catalogue.ids.id[target_mask]
target_halo_id = target_halo_ids[0]
groups = load_groups(
path.join(base_dir, f'{velociraptor_filebase}.catalog_groups'),
catalogue=catalogue
)
print(type(target_halo_id))
particles, unbound_particles = groups.extract_halo(halo_id=target_halo_id)
with:
<class 'unyt.array.unyt_quantity'>
Traceback (most recent call last):
File "swiftgalaxy.py", line 251, in <module>
particles, unbound_particles = groups.extract_halo(halo_id=target_halo_id)
File "/cosma7/ICC-data/dc-oman1/code/velociraptor-python/velociraptor/particles/particles.py", line 135, in extract_halo
number_of_particles = self.offset[halo_id + 1] - self.offset[halo_id]
IndexError: arrays used as indices must be of integer (or boolean) type
Getting the id programmatically from the group catalogue seems like a reasonable use-case ;)
Ah - sorry - this is very misleading because the halo_id here is actually just the position of the halo in the catalogue file not its unique ID.
It probably should be the unique id! In the examples I've seen the two are in principle interchangeable, but I guess there's no guarantee that this will hold, right?
Indeed, if you use a VR catalogue that was generated with the MPI version of the code this may not hold at all.
It probably should be the unique ID but that then requires storing a hashtable somewhere or doing an nlogn search every time you want to load a halo. So if you want to do something like loop through all haloes in the file then this may be wasted effort...
Should it maybe accept either an index or an id then? Alternately, I think as a minimal change the argument could just be renamed for clarity.
Happy for it to be renamed. Might also be nice to have a function or method somewhere that get_halo_index_from_id
or something (it could even cache the required hashtable).
Changed the name of the argument, think I covered all the locations where it's relevant, and updated the docs:
Would still be nice to have a get_halo_index_from_id
and/or accept a halo_id as an alternative argument, but haven't implemented that yet.
https://github.com/SWIFTSIM/velociraptor-python/blob/b168a6b497758ab6ede47be8aafd192e8b95ef68/velociraptor/particles/particles.py#L100
The requirement to provide an int for the halo_id here seems too strict. For example, it barfs on a
np.int64
.