Check columns on loading DataFrames

When loading DataFrames from a saved zarr file, we need to check that the columns in the file match the columns we expect.

There are two cases to handle when comparing what we load against what we expect:

1) Column is in loaded but not in expected. We should ignore the column. This happens when something was saved but is no longer used.
2) Column is in expected but not in loaded. We need to create the column, mark each row as needing an update, and then compute all row values on load.
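The two cases can be sketched with plain pandas. This is only an illustration, not the MapManagerCore implementation: `reconcile_columns` and the `needs_update` mask are hypothetical names, and the sketch assumes the loaded data is already an ordinary `pandas.DataFrame`.

```python
from __future__ import annotations

import pandas as pd


def reconcile_columns(
    loaded: pd.DataFrame, expected: list[str]
) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Hypothetical helper: align a loaded frame with the expected columns.

    Returns the aligned frame and a boolean frame of the same shape in
    which True marks a cell that still has to be computed.
    """
    # Case 1: loaded but not expected -> dropped, because only the expected
    # columns are selected.
    # Case 2: expected but not loaded -> reindex creates the column, filled
    # with NaN.
    aligned = loaded.reindex(columns=expected)

    # Mark every row of each newly created column as needing an update.
    missing = set(expected) - set(loaded.columns)
    needs_update = pd.DataFrame(
        {col: [col in missing] * len(aligned) for col in expected},
        index=aligned.index,
    )
    return aligned, needs_update


# Example data (made up): the file has "angle", the code expects "spineAngle".
loaded = pd.DataFrame({"point": [1.0, 2.0], "angle": [10.0, 20.0]})
aligned, needs_update = reconcile_columns(loaded, ["point", "spineAngle"])
```

Keeping the mask separate from the data means a recompute step can find the rows to fill without treating NaN as a sentinel, which matters if NaN is also a legitimate computed value.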

A good example is in the spine class, where I renamed angle to spineAngle like this:

    @calculated(title="Spine Angle", dependencies=["point"])
    def spineAngle(frame: LazyGeoFrame):
        ...  # body omitted

On the next load:

1) column angle is in loaded but not in expected
2) column spineAngle is in expected but not in loaded
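For the angle to spineAngle rename, both sets fall out of a simple set difference on the column names. The column sets below are assumptions for illustration (only `point`, `angle`, and `spineAngle` appear in this issue); a real file would have more columns.

```python
# Assumed column names, for illustration only.
loaded_columns = {"point", "angle"}         # what the saved file contains
expected_columns = {"point", "spineAngle"}  # what the current code declares

stale = loaded_columns - expected_columns    # ignore these on load
missing = expected_columns - loaded_columns  # create these and recompute

print(sorted(stale))    # ['angle']
print(sorted(missing))  # ['spineAngle']
```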

My previous issue 11 was about adding a file version. That is still useful for larger changes to our file structure.

I would vote that we not increment the file version when we add, remove, or rename columns. What do you think?

mapmanager / MapManagerCore

Check columns on loading DataFrames #23
