cloudsci / cloudmetrics

Toolkit for computing 15+ metrics characterising 2D cloud patterns
16 stars 8 forks source link

refactor iorg for object-labels #54

Closed leifdenby closed 2 years ago

leifdenby commented 2 years ago

@martinjanssens I've started this work and as so often is the case it is more complicated than I thought :) The challenge here is around what to do when we're using data with periodic boundary conditions. As you originally wrote the code the utils.periodic function created a (nx*2, ny*2) copy of the original cloud-mask, identified objects in this mask, then computed the centroids of these objects and moved the centroids for the objects that were "outside" the original mask. This means that refactoring this code isn't as simple as just passing in the labelled-objects 2D array, since this the centroid calculation for example won't work for objects that straddle the boundary.

So this needs a bit more thought...

Completing https://github.com/cloudsci/cloudmetrics/issues/52

martinjanssens commented 2 years ago

I'm probably the slow one here: What's wrong with passing the object labelled array that you make after applying the make_periodic_domain function? That would be what we want to happen for all object-based metrics that have periodic BCs, right (and what is currently happening here? Could we do that for iorg as well? I think it worked before we started refactoring, as long as I moved the objects whose centroid was outside the original domain after applyingutils.periodic back into the original domain, cut off the extra space and applied the pairwise distance calculation in a periodic box. I'm probably overlooking something, though!

leifdenby commented 2 years ago

Sorry that wasn't clear above. The issue is in assuming all metric functions which operate on labelled objects don't need to know whether the labelled objects are on a periodic domain. If we start off with the following object mask in a periodic domain:

[      ]
........
........
M......M
M.....MM
........
..MM....
........
........

we'd end up with the following object labels (with object A straddling the boundary):

[      ]
................
................
.......AA.......
......AAA.......
................
..BB............
................
................
................
................
................
................
................
................
................
................

If we then call the iorg computation on these labelled objects, without also passing through that these labelled objects are defined with a 2D array which has been double in size in x- and y-direction, then iorg calculation will be wrong. You added functionality inside the iorg function to unwrap the object positions before, which is great, but to use it I have to pass an argument in to indicate that the labelled objects array represents objects in a periodic domain. I'm nearly there though I think :)

leifdenby commented 2 years ago

AH! I think this is finally ready for review @martinjanssens :partying_face: Hope you're happy with what I've done!

leifdenby commented 2 years ago

So for my understanding, just adding an additional check whether each individual metric function operating on object_labels takes periodic_domain as an input is enough? That's clever :)

Yes :) I finally realised that I could do just that. Eventually I would like to make something better using the inspect module to construct mask-based metric functions from the object-based metric functions, instead of the hacky eval-hack I use now, but this works for now :)

I think this is ready to be merged in, there's just a few documentation notes for the (internal) function _compute_inhibition_nearest_neighbour_distribution that I think are outdated with your new implementation (e.g. we had a num_placements argument that allowed you to repeat the placement routine a few times to not make the iorg you compute too sensitive to a single random placement procedure, whose statistics might not be entirely converged; it is still being explained in the docstring). Should we just clean that up now, so we don't forget about it later?

Ah, I missed this. I'll fix the docstring and merge :) thanks!