SwissOpenEM / ScopeMArchiver


Run archival/retrieval stress tests for LTS #83

Closed phwissmann closed 1 month ago

phwissmann commented 3 months ago

Description: Devise and implement (stress) test scenarios to understand the following:

Solution proposals

Repeated Retrieval of single dataset

Procedure:

  1. Create a dataset
    Setup:
    • number of files: 800
    • size per file: 200 MB
    • dataset size: 160 GB
    • target block size: 50 GB

done: pid 11223344

  2. Archive once on LTS test share

  3. Run scheduled retrieval against LTS share

    • Scheduled every 4h
    • Will be served from the landing zone first; expected to be served from tape after 8h+
    • Question: what is the retention time on the landing zone? The schedule may need to be adapted accordingly
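A minimal sketch of how the dataset from step 1 could be generated, and how many blocks the setup numbers imply. The helper names `create_dataset` and `expected_blocks` are hypothetical and not part of ScopeMArchiver. The block estimate assumes decimal units (1 MB = 10^6 bytes) and that whole files are packed into blocks up to the target size, which may differ from how the archiver actually groups files.

```python
import os
from pathlib import Path


def create_dataset(root: Path, num_files: int, file_size: int,
                   chunk_size: int = 1 << 20) -> list[Path]:
    """Write num_files files of file_size bytes of random data under root.

    Random data is used so the files cannot be trivially compressed or
    deduplicated. Hypothetical helper for test setup only.
    """
    root.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(num_files):
        path = root / f"file_{i:04d}.bin"
        remaining = file_size
        with path.open("wb") as fh:
            # Write in chunks so a 200 MB file never sits fully in memory.
            while remaining > 0:
                n = min(chunk_size, remaining)
                fh.write(os.urandom(n))
                remaining -= n
        paths.append(path)
    return paths


def expected_blocks(num_files: int, file_size: int, target_block_size: int) -> int:
    """Estimate the block count, assuming whole files are packed per block."""
    files_per_block = max(1, target_block_size // file_size)
    return -(-num_files // files_per_block)  # ceiling division


MB = 10**6
GB = 10**9

# Setup of the single-dataset scenario: 800 files of 200 MB, 50 GB blocks.
print(expected_blocks(800, 200 * MB, 50 * GB))  # 4
```

Calling `create_dataset(Path("/some/test/dir"), 800, 200 * MB)` would write the full 160 GB, so it is worth trying with small values first.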

Results

Repeated Archival

Procedure:

  1. Create 30 datasets
    Setup:
    • number of files: 100
    • size per file: 100 MB
    • dataset size: 10 GB
    • target block size: 50 GB

  2. Archive datasets concurrently
    • Concurrency limit on workpool level (4)
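The effect of a concurrency limit of 4 on 30 archival jobs can be illustrated with a client-side sketch. This is not how the limit is configured in the issue (there it is set at the workpool level). `archive_dataset` is a hypothetical placeholder that only sleeps, and would have to be replaced by whatever call actually triggers an archival job.

```python
import asyncio
import time


async def archive_dataset(dataset_id: str) -> float:
    """Hypothetical stand-in for one archival job; returns its duration in seconds."""
    start = time.monotonic()
    await asyncio.sleep(0.01)  # placeholder for the real archival call
    return time.monotonic() - start


async def archive_all(dataset_ids, limit: int = 4, archive=archive_dataset):
    """Run all archival jobs with at most `limit` in flight at once.

    Returns the per-dataset durations and the peak number of concurrent jobs,
    so the limit can be checked after the run.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0
    durations = {}

    async def worker(dataset_id):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                durations[dataset_id] = await archive(dataset_id)
            finally:
                in_flight -= 1

    await asyncio.gather(*(worker(d) for d in dataset_ids))
    return durations, peak


if __name__ == "__main__":
    ids = [f"dataset-{i:02d}" for i in range(30)]
    durations, peak = asyncio.run(archive_all(ids, limit=4))
    print(f"archived {len(durations)} datasets, peak concurrency {peak}")
```

Recording the per-dataset durations makes it possible to compare runs with different limits.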

Result

Large dataset

Recommendation by Daniele: test one 1-2 TB dataset to see whether any issues show up

phwissmann commented 3 months ago

Additional info regarding landing zone handling etc:

The files of your test-share are written to tape 24 hours after their last access_time, by a job that runs every day at 04:20. At this point the "T-flag" is not set yet, as the file still has a copy on the landing_zone (on disk). The delete script, which removes the copy from the landing_zone, runs every day at 13:45. After that the "T-flag" should be set. The schedules for the copy and delete scripts are different for each share and can also change, so they should not be hard-coded in your code.

Please note that the "T-flag" will be gone again after copying a file back from tape, as the file then has 2 copies, one on tape and one on disk (landing_zone). The disk copy will be removed by the next delete job (cleanup job), and the "T-flag" will show up again then.
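For planning the retrieval schedule, the timing described above can be turned into a small calculation. This is a sketch under stated assumptions: the copy job picks up a file at its first daily run at least 24 hours after the last access, the delete job removes the landing zone copy at its first daily run after that, and all times are naive local times. The function names are hypothetical, and the job times are passed in as parameters because they differ per share and can change.

```python
from datetime import datetime, time, timedelta


def next_run(after: datetime, at: time) -> datetime:
    """First occurrence of the daily job time `at` that is not before `after`."""
    candidate = datetime.combine(after.date(), at)
    if candidate < after:
        candidate += timedelta(days=1)
    return candidate


def expected_tape_only(last_access: datetime, copy_at: time, delete_at: time,
                       min_age: timedelta) -> tuple[datetime, datetime]:
    """Estimate when a file is copied to tape and when the disk copy is removed.

    The second value is when the "T-flag" would be expected to appear,
    under the assumptions stated above.
    """
    copied = next_run(last_access + min_age, copy_at)
    deleted = next_run(copied, delete_at)
    return copied, deleted


# Job times quoted for the test share; read them from configuration in practice.
copied, deleted = expected_tape_only(
    last_access=datetime(2024, 1, 1, 10, 0),
    copy_at=time(4, 20),
    delete_at=time(13, 45),
    min_age=timedelta(hours=24),
)
print(copied)   # 2024-01-03 04:20:00
print(deleted)  # 2024-01-03 13:45:00
```

In this example a file last accessed on 1 January at 10:00 would only be served from tape after 3 January at 13:45, so a retrieval meant to exercise tape would have to be scheduled after that point.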