Closed by yarikoptic 1 year ago
@yarikoptic At the end of backing up a Zarr, data for the Zarr asset is requested again in order to check whether the asset's modified
timestamp changed during the backup. That's the request that failed here.
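For illustration, the end-of-backup check described above could look roughly like the following. This is a minimal sketch, not the actual backup code: `changed_during_backup` and the `fetch_modified` callable are hypothetical names, and the real request is whatever the tool uses to look up the asset.

```python
from datetime import datetime, timezone
from typing import Callable


def changed_during_backup(
    asset_id: str,
    modified_at_start: datetime,
    fetch_modified: Callable[[str], datetime],
) -> bool:
    """Re-request the asset's modified timestamp and compare it with the
    value recorded when the backup started.

    `fetch_modified` is a hypothetical callable standing in for the API
    request; it is this second request that can fail with a 404.
    """
    return fetch_modified(asset_id) != modified_at_start
```

If the second lookup raises instead of returning a timestamp, the check cannot tell "unchanged" from "replaced", which is the failure reported here.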
I see -- thanks for the explanation! So the second aspect (getting metadata right away) is not pertinent. As for the first one, ensuring that this particular asset didn't change: I think we should just use the /assets/ endpoint that is not bound to a dandiset, since we are aiming to reach the state of the dandiset as it was with those assets (even if they have changed since then). Agree?
Current rerun while dealing with #293 on 000108 errored out with
which is a "legit error" in that the asset has likely been replaced with another one (see the first PUT below):
So indeed, if the dandiset's assets change between the moment the list was initially obtained and the moment more information about an asset is queried, that asset might no longer be associated with the dandiset, hence the 404. BUT information about the asset (now possibly "loose" and subject to eventual GC) would still be available from the generic endpoint, so I think we should use that one here instead.
I could be wrong though (I didn't check whether both endpoints return identical records, but I assume so).
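A rough sketch of the idea, under stated assumptions: the path shapes below, the `NotFound` exception and the `get_json` callable are all hypothetical stand-ins for whatever the client actually uses, and it assumes (unchecked, as said above) that both endpoints return equivalent records. It is written as a fallback; querying the generic endpoint directly would be the simpler variant.

```python
from typing import Any, Callable, Dict


class NotFound(Exception):
    """Hypothetical error raised by `get_json` on an HTTP 404."""


def asset_info(
    get_json: Callable[[str], Dict[str, Any]],
    dandiset_id: str,
    version: str,
    asset_id: str,
) -> Dict[str, Any]:
    """Look up an asset, tolerating it having been detached from the dandiset."""
    # Assumed path shapes; adjust to the real API routes.
    bound = f"/dandisets/{dandiset_id}/versions/{version}/assets/{asset_id}/"
    generic = f"/assets/{asset_id}/"
    try:
        return get_json(bound)
    except NotFound:
        # The asset was replaced in the dandiset after we listed it, but the
        # record itself may still exist until it is garbage-collected.
        return get_json(generic)
```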
A related question, though: why do we have this delayed, dedicated per-asset query at all? If it is there to get metadata for an asset whenever the listing was done without metadata: the dandisets_version_asset_list endpoint now has a metadata parameter, so we could get all the desired metadata (if that is what this call is for) while listing the assets of the dandiset, and thus have a better chance of getting a consistent listing of assets with their metadata. So the solution might have two tiers: use the endpoint not bound to a dandiset to guarantee robustness in possibly other code paths, and then switch to getting metadata while listing all assets, thus avoiding this per-asset querying.
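To make the second tier concrete, here is a minimal sketch of listing assets together with their metadata in one pass. Assumptions, none of them verified here: the path and the `metadata=true` spelling of the parameter, a paginated response with `results` and `next` keys, and the hypothetical `get_json` callable.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_assets_with_metadata(
    get_json: Callable[[str], Dict[str, Any]],
    dandiset_id: str,
    version: str,
) -> Iterator[Dict[str, Any]]:
    """Yield every asset record of a dandiset version, metadata included,
    so that no follow-up per-asset request is needed."""
    # Assumed path and query parameter; adjust to the real list endpoint.
    url: Optional[str] = (
        f"/dandisets/{dandiset_id}/versions/{version}/assets/?metadata=true"
    )
    while url is not None:
        page = get_json(url)
        yield from page["results"]
        # Assumed pagination: `next` holds the next page URL, or None at the end.
        url = page.get("next")
```

Each yielded record would then be used as-is, which avoids the window in which an asset can disappear between listing and the later per-asset query.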