Closed yarikoptic closed 1 year ago
Sentry issue: DANDI-API-C7
This seems to be a bug in dandischema, where the code assumes there is always an encodingFormat
key. Can we move this issue to that repo and fix it at the source?
I think so, transferred...
we can do a validity check in dandischema itself, but i think the original general idea was that the input to that function should be valid assets and valid assets should always have an encodingFormat
. and since validity is already included in the database, the aggregate function should run over valid assets (with invalid assets reported in the validation output).
on a fix side, every asset should have an encodingFormat
, we won't drop that. however, we can add a validate
option to the function.
we can do a validity check in dandischema itself, but i think the original general idea was that the input to that function should be valid assets and valid assets should always have an
encodingFormat
. and since validity is already included in the database, the aggregate function should run over valid assets (with invalid assets reported in the validation output).
In this case, I think this is a bug in https://github.com/dandi/dandi-archive. We're currently aggregating assets regardless of whether or not they're valid. I'd propose we close #184 since it's revealing a true bug in our usage, fix the aggregation on the dandi-archive side, and then recompute summaries for all draft versions.
@danlamanna - that sounds good to me. mostly because the metadata is not valid till the archive completes checksum checks and also does not enforce valid schema on POST/PUT
.
There was https://github.com/dandi/dandi-archive/pull/1490 which made such issues to bubble up, this is a list of sentry reports which seems keep mentioning this issue
which IMHO should be given priority to be addressed even if just not to pollute our mailboxes with alerts, thus possibly hiding other, may be more important issues.