Result of fill method depends on extent of raw data

This is a follow-up to issue #150. Since the boundary has an imprint on the filled land values, the result of the fill method depends on the extent of the raw data.

Here is an example. Imagine we have our raw CESM data available for the North Atlantic (left image), and fill alkalinity in over land (right image):

[Screenshot 2024-09-26 at 3 46 14 PM]

If we instead had only the data available that is marked by the black box in the previous figure, our fill would look like this:

[Screenshot 2024-09-26 at 3 48 24 PM]

The difference of the two right panels (restricted to the smaller domain in the second figure) is shown here:

[Screenshot 2024-09-26 at 3 49 06 PM]
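The same effect can be reproduced in a toy setting. The sketch below is not the ROMS-Tools fill method; it swaps in a simple neighbour-averaging (Jacobi) fill with a zero-gradient condition at the edge of the data, and uses a synthetic field rather than CESM output. The function name `fill_land` and all grid sizes are assumptions made for illustration.

```python
import numpy as np


def fill_land(field, ocean_mask, n_iter=5000):
    """Toy lateral fill: repeatedly replace each land cell by the mean of its
    four neighbours (Jacobi iteration), keeping ocean cells fixed."""
    # Start land cells from the mean ocean value (NaNs on land are discarded).
    filled = np.where(ocean_mask, field, field[ocean_mask].mean())
    for _ in range(n_iter):
        # Edge padding imposes a zero-gradient condition at the data boundary.
        p = np.pad(filled, 1, mode="edge")
        avg = 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])
        filled = np.where(ocean_mask, field, avg)
    return filled


# Synthetic tracer that increases eastward, with a land strip in columns 10..29.
ny, nx = 20, 40
field = np.tile(np.arange(nx, dtype=float), (ny, 1))
ocean = np.ones((ny, nx), dtype=bool)
ocean[:, 10:30] = False
field[~ocean] = np.nan

# Fill once with the full raw data, once with raw data cut off at column 20.
full = fill_land(field, ocean)
sub = fill_land(field[:, :20], ocean[:, :20])

# Compare the two fills on the smaller domain.
max_diff = np.abs(full[:, :20] - sub).max()
print(f"max difference over the common domain: {max_diff:.3f}")
```

In the full domain the land strip is bounded by ocean on both sides, so the fill interpolates between them. In the cropped domain the land reaches the edge of the data, the eastern ocean values are no longer visible to the fill, and the land values differ even though the ocean values are identical.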

The fact that the extent of the raw data has an effect on the ROMS input files is relevant in two contexts:

1. Currently, ROMS-Tools allows the user to provide only regional raw input data, as long as it covers the ROMS domain. If two users have raw data with different extents (both covering the ROMS target domain), their results may not coincide.
2. Before the filling step, ROMS-Tools selects a subdomain of the raw input data that covers the ROMS domain plus a small margin. This improves efficiency because filling and regridding smaller domains is faster. See https://github.com/CWorthy-ocean/roms-tools/blob/0c88ff4fe526abd461ebd7800a2a392bddbb1634/roms_tools/setup/datasets.py#L579-L607

We have to think carefully about how small or large we want this margin to be, because

- a smaller margin leads to more efficient computations
- a larger margin provides more accurate filling results because we stay away from the boundaries

Currently, the margin is set in degrees lat/lon (and I think we use 2 degrees), which is suboptimal. We probably want to define it in terms of number of grid points, because that is what the fill method cares about.
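As a rough sketch of what a margin in grid points could look like, the helper below picks an index range along one ascending coordinate that encloses the target domain and then extends it by a fixed number of points on each side. The name `subdomain_slice`, the margin of 5 points, and the example coordinates are hypothetical and are not taken from the current `datasets.py` code.

```python
import numpy as np


def subdomain_slice(coord, lo, hi, margin_points):
    """Return a slice of an ascending 1D coordinate that encloses [lo, hi]
    and adds `margin_points` extra grid points on each side (clipped to the
    available data)."""
    coord = np.asarray(coord)
    # Last index with coord <= lo, and first index with coord >= hi.
    i_lo = max(np.searchsorted(coord, lo, side="right") - 1, 0)
    i_hi = min(np.searchsorted(coord, hi, side="left"), len(coord) - 1)
    start = max(i_lo - margin_points, 0)
    stop = min(i_hi + margin_points, len(coord) - 1) + 1
    return slice(int(start), int(stop))


# Same target domain and same margin in grid points on two raw resolutions.
coarse = np.arange(-80.0, 1.0, 1.0)     # 1 degree spacing
fine = np.arange(-80.0, 0.25, 0.25)     # 0.25 degree spacing
lo, hi = -60.5, -20.5

sl_coarse = subdomain_slice(coarse, lo, hi, margin_points=5)
sl_fine = subdomain_slice(fine, lo, hi, margin_points=5)

# The margin is the same in grid points but differs when measured in degrees.
print("coarse western margin (deg):", lo - coarse[sl_coarse][0])
print("fine western margin (deg):  ", lo - fine[sl_fine][0])
```

With a margin defined this way, the fill sees the same number of points between the domain edge and the data boundary regardless of the resolution of the raw data, whereas a fixed margin in degrees gives few points on coarse data and many on fine data.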

Also note that once we have settled on a suitable margin, we could require users to provide raw data that includes the target ROMS domain plus that margin, and raise an error otherwise. This would at least guarantee reproducibility among users, and it would eliminate problem 1.
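A check of this kind could look roughly as follows. This is only a sketch under the assumption that the margin is counted in grid points along an ascending 1D coordinate; `check_raw_extent` and the required margin of 10 points are hypothetical and do not exist in ROMS-Tools.

```python
import numpy as np


def check_raw_extent(coord, lo, hi, required_margin, name="coordinate"):
    """Raise a ValueError unless the raw coordinate has at least
    `required_margin` grid points on each side of the target range [lo, hi]."""
    coord = np.asarray(coord)
    n_below = int(np.sum(coord < lo))
    n_above = int(np.sum(coord > hi))
    if n_below < required_margin or n_above < required_margin:
        raise ValueError(
            f"Raw data must extend at least {required_margin} grid points "
            f"beyond the target domain along {name}; found {n_below} below "
            f"and {n_above} above."
        )
    return n_below, n_above


# Raw data with plenty of room around a target domain of [-60, -20].
wide = np.arange(-80.0, 1.0, 1.0)
margins = check_raw_extent(wide, -60.0, -20.0, required_margin=10, name="lon")

# Raw data that covers the domain but with too little margin.
narrow = np.arange(-62.0, -17.0, 1.0)
try:
    check_raw_extent(narrow, -60.0, -20.0, required_margin=10, name="lon")
    narrow_ok = True
except ValueError as err:
    narrow_ok = False
    print(err)
```

If every user's raw data passes the same check and is then cropped to exactly the target domain plus the required margin, the fill always operates on the same extent.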


Result of fill method depends on extent of raw data #153