Closed aclayton555 closed 2 years ago
Is there progress on finding a route/person to work on this one, @milen-sage @aclayton555? It sounds like the problem has been identified and hopefully no data files were lost, but even an outward appearance of missing data can (justifiably) be concerning to people.
I'm back from leave and will move the Vanderbilt files that I can identify back as a first fix. We can then more rigorously identify any other files that were archived and make sure this does not happen in subsequent moves of projects' clinical and biospecimen records to tables.
I have a notebook set up to search manifests for out-of-place files and move them back. Unfortunately, it looks like the entity archive project was set up with only @mialy-defelice as admin, with @milen-sage and me having only download and not move privileges.
Unfortunately the file reinstatement will have to wait until @mialy-defelice is back from leave on Monday 29th
Thanks @adamjtaylor !
Reaching out to the Synapse team to check how we can add me and @adamjtaylor as admins on the HTAN Entity Archive project.
@mialy-defelice changed access permissions on the project - we are good to go. Thanks @mialy-defelice!!
Thanks @mialy-defelice! I will run the scripts to check for and move missing files later this morning. @vthorsson I will let you know when complete.
Thanks @mialy-defelice and @milen-sage. I am still not seeing the files though.
Ah thanks @adamjtaylor for doing the additional step needed
Confirmed that I can move files back into place. The functions used are archived in this Gist.
I believe it is only Vanderbilt that is affected by this issue.
I will update this comment as Vanderbilt entities are moved back into place [Now complete]
Thanks @adamjtaylor !
Thanks @vthorsson. This is now complete for all of Vanderbilt's manifests that relate to files rather than records.
I'll close this issue - @milen-sage if you could add a comment as to where the fix in the transfer script is being tracked that would be great.
Fix tracked here - we are working on it this sprint.
Describe the bug
Filing this ticket amid multiple outages (Adam, Mialy, Milen). This stems from a discussion on the dcc_operations HTAN Slack channel, starting from this message: https://htanworkspace.slack.com/archives/C03E345AMLK/p1661186303347379 In summary, it looks like quite a few of the Vanderbilt center data files are missing (see the example of this empty folder); however, it is not clear whether this is the full extent of the "missing" files. @clarisse-lau spot-checked a few single-cell files and they appear to still be in the bucket, e.g. s3://htan-dcc-vanderbilt/3398375/c207257b-c84a-49f8-ac2b-2c08e7acb2ee/3247-AS-3-ACAGTG_S3_R1_001.fastq.gz. Of note, the error message says 'no access' rather than something like 'doesn't exist'. Looking into the issue further, @adamjtaylor determined that the "missing" files appear to have been swept up into the HTAN Entity Archive project, which was meant to be the destination for the empty folder entities used to store clinical and biospecimen information in our transfer to tables.
Action Required
The entity IDs and pointers to the S3 bucket have not changed, so the entities listed in the manifest just need to be moved back into the right Synapse folder. Can FAIR please help with reinstating these? It looks like @mialy-defelice performed the most recent modifications to the manifest, but please triage accordingly given Mialy's outage.
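Because the entity IDs are unchanged, the reinstatement amounts to comparing each entity's expected parent folder (from the manifest) with its current parent and re-parenting the ones that differ. The following is only a rough sketch of that idea, not the functions actually used for the fix: the names `find_misplaced` and `move_back`, the dictionary inputs, and the injected `move` callable are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def find_misplaced(
    expected_parent: Dict[str, str],
    current_parent: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Return (entity_id, expected_parent_id) pairs for entities whose
    current parent differs from the parent recorded in the manifest.

    Both inputs are hypothetical: mappings of entity ID -> parent folder ID,
    one built from the manifest and one from the entities' current locations.
    """
    return [
        (entity_id, parent_id)
        for entity_id, parent_id in expected_parent.items()
        if current_parent.get(entity_id) != parent_id
    ]


def move_back(
    misplaced: List[Tuple[str, str]],
    move: Callable[[str, str], None],
) -> int:
    """Call move(entity_id, parent_id) for each misplaced entity and
    return the number of entities moved."""
    for entity_id, parent_id in misplaced:
        move(entity_id, parent_id)
    return len(misplaced)
```

With the Synapse Python client, `move` could plausibly wrap `syn.move(entity_id, parent_id)` on a logged-in `synapseclient.Synapse` instance; that wiring is an assumption here, and it would need move permissions on both the archive project and the destination project.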
Priority (select one)