dandi / dandi-archive

DANDI API server and Web app
https://dandiarchive.org

Draft version timestamps not updated when embargoed assets were moved #2002

Closed jwodder closed 4 weeks ago

jwodder commented 2 months ago

It appears that, on or around April 29, embargoed asset blobs were moved from the dandiarchive-embargo bucket to the dandiarchive bucket. This was accompanied by updates to the assets' contentUrl metadata, and the assets' modified properties were updated at that time, but the modified properties of the respective Dandisets' draft versions were not updated, and backups2datalad does not like this.

For a specific example, the embargoed Dandiset 000770 contains an asset sub-Elgar/sub-Elgar_ses-2022-06-04_ecephys.nwb with a modified property of 2024-04-29T19:35:05.558024Z, yet the modified property for 000770/draft is still 2024-04-09T19:15:02.126137Z.
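The mismatch described above can be expressed as a simple check. This is only a minimal sketch: the helper names `parse_ts` and `draft_is_stale` are hypothetical, and it assumes the `modified` values have already been fetched as ISO 8601 strings like the ones quoted above.

```python
from __future__ import annotations

from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-04-29T19:35:05.558024Z."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def draft_is_stale(version_modified: str, asset_modified: list[str]) -> bool:
    """Return True if any asset was modified after its draft version."""
    if not asset_modified:
        return False
    newest_asset = max(parse_ts(t) for t in asset_modified)
    return newest_asset > parse_ts(version_modified)


# The two timestamps reported for Dandiset 000770 in this issue.
print(draft_is_stale(
    "2024-04-09T19:15:02.126137Z",
    ["2024-04-29T19:35:05.558024Z"],
))
```

With the 000770 values, the asset timestamp is later than the draft version's, so the check flags the draft as stale.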

jjnesbitt commented 2 months ago

So it seems this is mainly a retroactive issue? That is, we should update these respective versions to reflect the last modified time of those assets, but it's not a systematic issue going forward? Seems like it's just a side effect of the embargo re-design data migration.

jwodder commented 2 months ago

@jjnesbitt Yes, that sounds correct.

jjnesbitt commented 1 month ago

@jwodder I believe I've addressed the issue. Can you verify this?

jjnesbitt commented 1 month ago

@yarikoptic ping

yarikoptic commented 1 month ago

Thank you @jjnesbitt! FTR: I have triggered a rerun of flock -E 0 -e -n /home/dandi/.run/backup2datalad-cron-nonzarr.lock bash -c '/mnt/backup/dandi/dandisets/tools/backups2datalad-update-cron --mode verify' so we should have an answer (hopefully) when it completes.

yarikoptic commented 1 month ago

@jjnesbitt do you recall any details on what you have done, since no details were provided? I posted a summary of the current situation to https://github.com/dandi/backups2datalad/issues/49#issuecomment-2402911554 and it seems that 000874 also has an odd modified date, one that is earlier than when its updated assets were uploaded. I am not sure whether it is exactly this issue or something else; please have a look in case you only fixed the single initially reported dandiset.

jjnesbitt commented 4 weeks ago

> @jjnesbitt do you recall any details on what you have done since no details were provided?

I did what I said I would in my earlier comment: I updated all of the relevant draft versions to reflect the last modified time of their modified assets. However, I only did so for assets with a last modified time of April 29, as that was the day of the data migration that John filed this issue about, and I don't want to accidentally modify data incorrectly.
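For illustration, the rule described above might look roughly like the following. This is a sketch under stated assumptions, not the script that was actually run: `DraftVersion` and `backfill_modified` are hypothetical names, the data is held in memory rather than in the archive's database, and the migration day is assumed to be April 29, 2024 in UTC.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, datetime, timezone

# Assumption: the migration day is taken as a UTC calendar date.
MIGRATION_DAY = date(2024, 4, 29)


@dataclass
class DraftVersion:
    """Hypothetical in-memory stand-in for a Dandiset draft version."""
    dandiset: str
    modified: datetime
    asset_modified: list[datetime]  # timezone-aware asset timestamps


def backfill_modified(version: DraftVersion, day: date = MIGRATION_DAY) -> bool:
    """Bump version.modified to the newest asset timestamp from `day`.

    Assets modified on any other day are ignored, so unrelated changes
    are never touched. Returns True if the version was updated.
    """
    migrated = [
        t for t in version.asset_modified
        if t.astimezone(timezone.utc).date() == day
    ]
    if not migrated:
        return False
    newest = max(migrated)
    if newest <= version.modified:
        return False  # the draft version is already up to date
    version.modified = newest
    return True
```

Restricting the update to a single day keeps the change narrow: a draft whose assets were modified at some other time is left alone and can be investigated separately.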

Since this issue is specifically about assets related to the embargo data migration, I'd like to close this issue as resolved. If there are more instances of this occurring, we can investigate them separately.

> I posted the summary of the current situation to https://github.com/dandi/backups2datalad/issues/49#issuecomment-2402911554 and it seems that 000874 also has an odd modified date which is before when updated assets were uploaded

I'm not sure of the extent of the affected dandisets, as it's fairly hard to make sense of that comment with all of the data. Is it just that dandiset? Do you know how many assets are affected?

yarikoptic commented 4 weeks ago

Quick one: that type of error was indeed only in one dandiset, and it affected under a dozen assets. I will return later to get a full list if that would help.

jjnesbitt commented 4 weeks ago

In that case I think we can close this issue. If/when you have that data for the other dandiset, it could be filed as a separate issue.