ratt-ru / QuartiCal

CubiCal, but with greater power.
MIT License

Backup and restore apps may fail when moving between formats #287

Open JSKenyon opened 1 year ago

JSKenyon commented 1 year ago

Describe the bug: See title. Flags backed up from a zarr ms cannot be restored to a conventional ms - they end up as appended rows.

To Reproduce: Back up flags from a zarr ms and attempt to restore them to a conventional ms.

Expected behavior: The above should work as expected (provided ordering and selection are consistent).

Version: main

landmanbester commented 1 year ago

I was just bitten by this again. It seems that it's not just the restored column that is affected; all the columns seem to have additional rows in the final scan. I guess ROWID gets appended to. Any ideas on how to recover from this? cc @sjperkins

landmanbester commented 1 year ago

Trying to access data in the affected scan with

In [99]: xds[-1].DATA.values.shape

results in a long error culminating in

RuntimeError: IPosition::getFirst(n); n is too high

sjperkins commented 1 year ago

Without ROWID present, dask-ms will append rows: https://dask-ms.readthedocs.io/en/latest/tutorial/writes.html#updating-appending-rows.

It's not immediately obvious to me what the correct thing to do here is. ROWID is needed to map the zarr format back to the original ms, but it is discarded during conversion to zarr. Possibly we could retain the ROWID column when writing to zarr.
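
To make the update-versus-append distinction concrete, here is a toy model in plain Python. It is only a sketch of the behaviour described above, not dask-ms code: `write_column` is a hypothetical helper, and lists stand in for an MS column.

```python
# Toy model only: plain lists stand in for one MS column. `write_column` is a
# hypothetical helper written for illustration; it is not part of dask-ms.

def write_column(table, values, rowid=None):
    """Return a new column after writing `values` into `table`.

    With `rowid`, each value overwrites the row it names (an update).
    Without it, nothing maps values to existing rows, so they are appended.
    """
    if rowid is None:
        return list(table) + list(values)
    if len(rowid) != len(values):
        raise ValueError("rowid and values must have the same length")
    updated = list(table)
    for row, value in zip(rowid, values):
        updated[row] = value
    return updated


ms_flags = [False, False, False, False]   # column in the conventional ms
backup = [True, False, True, False]       # flags backed up from the zarr ms

# ROWID retained: the backup lands on the original rows.
restored = write_column(ms_flags, backup, rowid=[0, 1, 2, 3])

# ROWID discarded: the backup becomes four extra rows.
appended = write_column(ms_flags, backup)
```

In this model `restored` keeps four rows, while `appended` has eight, which mirrors the extra rows reported in this issue.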

landmanbester commented 1 year ago

Yeah, just keeping ROWID makes the most sense to me. I have nuked the old MS and am splitting again. We should probably prioritize this, as it can corrupt data.

JSKenyon commented 1 year ago

Ah, yeah. I had forgotten about this one. This may be another point to add to the desired features list on dask-ms. If we retained ROWID on the zarr datasets, we would have substantially more power in xarray-land too, as it would be possible to resolve selections, i.e. read from zarr, manipulate with xarray and write back to ms. It would be expensive in the worst-case scenario, but probably worth it overall.
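
A rough sketch of what resolving a selection could look like if ROWID were retained. Everything here is hypothetical: dictionaries of lists stand in for a zarr-backed dataset, and `select_rows` and `write_back` are illustrative helpers, not dask-ms or xarray functions.

```python
# Hypothetical sketch: a dict of lists stands in for a dataset that kept ROWID.

def select_rows(dataset, column, wanted):
    """Keep only the rows where `column` equals `wanted`, ROWID included."""
    keep = [i for i, value in enumerate(dataset[column]) if value == wanted]
    return {name: [col[i] for i in keep] for name, col in dataset.items()}


def write_back(ms_column, subset, column):
    """Scatter the subset's values onto the rows named by its ROWID."""
    updated = list(ms_column)
    for row, value in zip(subset["ROWID"], subset[column]):
        updated[row] = value
    return updated


zarr_ds = {
    "ROWID": [0, 1, 2, 3, 4, 5],
    "SCAN_NUMBER": [1, 1, 2, 2, 3, 3],
    "FLAG": [False] * 6,
}

# Select one scan, modify it, and write only those rows back.
scan2 = select_rows(zarr_ds, "SCAN_NUMBER", 2)
scan2["FLAG"] = [True] * len(scan2["ROWID"])
ms_flag = write_back([False] * 6, scan2, "FLAG")
```

Because the selection carries its ROWID along, only rows 2 and 3 change in `ms_flag`, and the row count stays the same.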

JSKenyon commented 1 year ago

Any thoughts on how difficult this would be to change, @sjperkins?