usap-dc-dev / usap-dc-website

repository for usap-dc website. Includes javascript client side app and python/flask server side.

Large Dataset Issues (seafloor-ph & prod) #66

Open sicordero opened 1 month ago

sicordero commented 1 month ago

These issues were discovered one after another while trying to archive large datasets by following the curator documentation:

(1) The production server script used to gzip and bag datasets stored on seafloor-ph in /archive/ cannot connect to seafloor-ph. The script is housed in /web/usap-dc/htdocs/bin/ and bags/tars the datasets flagged as ready for archive (or specific datasets):

python3 archiveUSAPDC_batch.py [start ID] [end ID]

This throws an error that it cannot locate the local archive.
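One way to narrow down the "cannot locate the local archive" error might be a quick preflight check run on the production server as the same user that runs the batch script. This is only a sketch: it assumes the script reaches the archive through a filesystem path (for example a mount of the seafloor-ph /archive/ volume), which the report does not confirm, and `check_archive_dir` is a hypothetical helper, not part of the repository.

```python
import os


def check_archive_dir(path):
    """Return a list of human-readable problems with `path` (empty if none found).

    Hypothetical helper: assumes the archive is reached as a local or mounted
    directory, which is an assumption about how the batch script works.
    """
    problems = []
    if not os.path.exists(path):
        problems.append(f"{path} does not exist (is the volume mounted?)")
    elif not os.path.isdir(path):
        problems.append(f"{path} exists but is not a directory")
    else:
        # Listing a directory needs read + execute; writing bags needs write.
        if not os.access(path, os.R_OK | os.X_OK):
            problems.append(f"{path} is not readable by the current user")
        if not os.access(path, os.W_OK):
            problems.append(f"{path} is not writable by the current user")
    return problems
```

Calling `check_archive_dir("/archive")` and printing the returned list would show whether the path is missing, is not a directory, or is blocked by permissions for the current user. An empty list would suggest the problem lies elsewhere, for example in how the script builds the path.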

(2) On seafloor-ph, the script archiveUSAPDC_largeDatasets.py is not working to bag and tar large datasets for archive. Unsure why it has this issue, possibly permissions? Command and output:

nohup python3 scripts/archiveUSAPDC_largeDatasets.py 61780_archived_data true

STARTING PYTHON SCRIPT
GET FILE TYPES FOR ALL FILES IN DATASET
Traceback (most recent call last):
  File "scripts/archiveUSAPDC_largeDatasets.py", line 247, in <module>
    ds_title = res['title'] if res.get('title') else 'Not Available'
AttributeError: 'NoneType' object has no attribute 'get'
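The AttributeError says that `res` is `None` at line 247, so whatever lookup produces `res` returned nothing for this dataset. That points more toward a missing metadata record than toward file permissions, though the traceback alone cannot confirm it. A minimal sketch of a guard is below. `require_record` and `title_or_default` are hypothetical helpers, and how `res` is actually fetched in the script is not shown in this report.

```python
def require_record(res, lookup_key):
    """Fail early with a clear message when the metadata lookup found nothing.

    `lookup_key` is whatever identifier was used for the lookup (hypothetical).
    """
    if res is None:
        raise LookupError(f"no metadata record found for {lookup_key!r}")
    return res


def title_or_default(res, default="Not Available"):
    """Same result as the expression on line 247, but tolerates res being None."""
    return (res or {}).get("title") or default
```

Using `require_record` right after the lookup would turn the AttributeError into a message naming the identifier that failed. `title_or_default` would instead let the script continue with 'Not Available'. Which one fits depends on whether a dataset without a metadata record should be archived at all.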