Open NuggetOfficial opened 11 months ago
Hey!
Yes, these are part of a few things that are not intuitive with the current implementation. When first implemented, the way Numpy array subclassing worked, we couldn't implement this type of dunder method at all. That changed a few Numpy releases later, at which point I did not have time to convert.
Now that non-subclassing numpy arrays works with recent Numpy releases (old enough that everyone should have switched over already), I wanted to reimplement Fields as a standalone class, which has many other advantages, including being able to implement more specialized Grid conversions as you are doing here. That implementation is PR #187 which I would love feedback / testing support. Since you are clearly interested in doing some low-level HCIPy work, could we schedule a call to talk through things? I'll send you an email.
As an aside, the method that you're looking for is Grid.subset()
. Also, I'm interested in why you need to batch Fields in the first place. Not saying it's a bad thing, just that most of the time, the built-in methods are parallelized enough that batching only gains you a factor of 2-3x. Plus, multiprocessing is usually done best at the outer loop, which is often an iteration over wavelengths.
Ill also write a response here for consistency's sake. Would love to work on it! As for why im batching, im running a monochromatic pixel by pixel optimisation algorithm for a VVC. Retrieving the bulk model is already very fast, order of seconds, but pixel by pixel fit is currently quite slow, order of 100 pixels/s (I have a sneaking suspicion that pythons forloop overhead is one of the main culprits). This makes for an average computation time of 45 min per image or so, dividing that time by 3 would be fantastic.
Hey! I am running some optimisation code and I am currently considering batching a field and sending it to different cores. I am running into the issue that an indexed/sliced field does not return the grid corresponding to the index/slice. Consider the following:
The output I expected intuitively is a new
Field
object with a.size
of 200 and with a.grid.size
200; whenever we pick a number from a field the associated coordinate is returned along with it.However the current output is a
Field
object with a.size
of 200 and with a.grid.size
262144. In other words, the grid itself is not indexed but left alone. This is an afterproduct of creation definition of aField
:which causes any indexing of a field to only consider the input array.
I propose extending the
__getitem__
dunder of theField
class so that it calls the__getitem__
of both the contained array, in this caseself
, and the associated grid,self.grid
.I started working on the alteration but the thing I am struggeling with is how to turn the indexed grid back into the proper
<any>Grid
object that contains the correct coordinates. Currenlty calling the<any>Grid.__getitem__
method returns anndarray
with the corresponding coordinates. Any ideas on how to tackle this?This would make slicing a field in 4 so I can send the different slices to different cores a lot easier.