uga-libraries / hub-monitoring

Scripts for summarizing and validating content on the Digital Production Hub, the UGA Libraries' centralized storage for digital objects that are not suitable for our digital preservation system.
Creative Commons Attribution Share Alike 4.0 International

Size if accession is not a bag #14

Closed: amhanson9 closed this issue 5 months ago

amhanson9 commented 7 months ago

Some accessions are not bagged immediately because file path length issues must be resolved first. In that case, get_file_count() and get_size() should calculate the file count and size from the accession folder rather than from the bag's data folder.

To exclude the preservation metadata, use the path to the folder within the accession folder whose name does not end with FITS.
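A minimal sketch of that selection logic, not the repository's actual code: the helper names find_content_folder() and count_and_size() are hypothetical, and the "_bag" suffix and "data" subfolder for bagged accessions are assumptions about the naming convention.

```python
import os


def find_content_folder(accession_path):
    """Return the folder to measure for one accession (hypothetical helper)."""
    folders = sorted(
        name for name in os.listdir(accession_path)
        if os.path.isdir(os.path.join(accession_path, name))
    )
    # Assumption: a bagged accession has a folder ending in "_bag" with a "data" subfolder.
    for name in folders:
        if name.endswith("_bag"):
            return os.path.join(accession_path, name, "data")
    # Not bagged yet: use the folder that does not end with FITS,
    # which skips the preservation metadata.
    for name in folders:
        if not name.endswith("FITS"):
            return os.path.join(accession_path, name)
    return None


def count_and_size(folder):
    """Return (file count, total size in bytes) for everything under folder."""
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(root, name))
    return file_count, total_bytes
```

With this shape, the same counting code runs whether or not the accession has been bagged; only the starting folder changes.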

amhanson9 commented 7 months ago

@emkaser This is working for the directory structure collection/accession/folder_with_content, where the only folders in accession are folder_with_content and, optionally, a folder ending in "_FITS". Is that correct?

Is folder_with_content always named with the accession number? If so, there is an easier way to formulate the path.

Or might there be times when the content is directly in collection/accession? If so, I can fall back to collection/accession when accession contains neither a bag nor a non-FITS folder.

emkaser commented 6 months ago

1) Yes, that is correct: the only folders inside the accession folder should be the content folder and the FITS folder.

2) Yes, folder_with_content is always named with the accession number.

3) No, there should not be times when content is directly in collection/accession (that would mean content was mixed in with preservation documentation such as the manifest, the preservation log, etc.).
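Given these answers, the content path for an unbagged accession can be built directly from the accession number instead of searching for a non-FITS folder. A minimal sketch under that assumption; unbagged_content_path() is a hypothetical name, not a function from the repository:

```python
import os


def unbagged_content_path(accession_path):
    """Build collection/accession/accession for an accession that is not bagged."""
    # The content folder is named with the accession number,
    # which is also the name of the accession folder itself.
    accession_number = os.path.basename(os.path.normpath(accession_path))
    content_path = os.path.join(accession_path, accession_number)
    # Content should never sit directly in the accession folder,
    # so a missing content folder is reported rather than worked around.
    if not os.path.isdir(content_path):
        raise FileNotFoundError(f"No content folder found at {content_path}")
    return content_path
```

Raising an error instead of falling back to collection/accession follows answer 3: that layout is not expected, so it is safer to surface it than to silently include preservation documentation in the size.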