Closed esheldon closed 2 months ago
I think the intended use of instance catalogs was to have only the objects that will appear on the image (or not too far off of the image) in each file, so reading that twice isn't much overhead. I don't remember offhand the design constraints that led us to have it read twice, but it was probably just that, the way we have been using the instance catalogs in the past, it wasn't much of a tall pole. I'm sure it's possible to have it read only once, but I don't think it's a trivial change.
I'm using the instcats that, I believe, were used in real production. These cover the full focal plane.
Could we put an @lru_cache on the function?
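For reference, a minimal sketch of what that could look like. This is not imSim's actual reader: `read_instcat` and the toy file contents are hypothetical, and the only real API used is `functools.lru_cache` from the standard library.

```python
import os
import tempfile
from functools import lru_cache

read_calls = 0  # how many times the file was actually parsed


@lru_cache(maxsize=None)
def read_instcat(path):
    """Hypothetical reader: return the 'object' lines of an instance catalog."""
    global read_calls
    read_calls += 1
    with open(path) as f:
        # Return nested tuples so callers cannot mutate the cached value.
        return tuple(tuple(line.split()) for line in f if line.startswith("object"))


# Toy catalog (made-up contents, for illustration only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("rightascension 10.0\n")
    tmp.write("object 1 10.0 -30.0 22.5\n")
    tmp.write("object 2 10.1 -30.1 23.0\n")

first = read_instcat(tmp.name)   # parses the file
second = read_instcat(tmp.name)  # served from the cache, no second parse
os.remove(tmp.name)
```

Caveats: the cache is keyed on the path only, so it will not notice if the file changes on disk; it keeps the whole parsed catalog in memory; and it is per-process, so it would not help if the two reads happen in different worker processes.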
We don't use instcats anymore in production. We use SkyCatalogs.
Yes, I believe these were used in an older run.
I had to use instcats for this because what I was trying to do was not possible with sky catalogs (according to @jchiang87).
I think the initial read is simply to get the number of objects for galsim to set seeds etc. for multiprocessing. We got around this problem with skyCatalogs by adding an approx_nobjects option to the input.sky_catalog yaml section: https://github.com/LSSTDESC/imSim/blob/main/imsim/skycat.py#L67
We could add the same option for the instance catalog code to avoid reading in the entire file.
Ah, that makes sense. Yeah, we could add that for instcat too. That would be pretty straightforward.
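To make the idea concrete, here is a rough sketch of the pattern, not the imSim implementation: `get_nobjects` is a hypothetical helper, and the assumption is that an instcat counterpart to approx_nobjects would simply let the caller skip the counting pass.

```python
import os
import tempfile


def get_nobjects(path, approx_nobjects=None):
    """Hypothetical helper: number of objects to report before the full read."""
    if approx_nobjects is not None:
        # Trust the user-supplied estimate and skip scanning the file.
        return approx_nobjects
    # Fallback: one full pass over the file just to count 'object' lines.
    with open(path) as f:
        return sum(1 for line in f if line.startswith("object"))


# Toy catalog (made-up contents, for illustration only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("object 1 10.0 -30.0 22.5\n")
    tmp.write("object 2 10.1 -30.1 23.0\n")

exact = get_nobjects(tmp.name)                            # scans the file
estimated = get_nobjects(tmp.name, approx_nobjects=1000)  # no file access
os.remove(tmp.name)
```

With the estimate supplied, the expensive read would happen only once, when the objects are actually needed.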
This was fixed with #465 and #466.
I'm seeing that instance catalogs are read twice.
When galaxies are present, it's taking about 20 minutes to read the catalog, so doubling this is significant.
Below I added a timer to show the read time. You can see that it happens twice.