IQSS / dataverse

Open source research data repository software
http://dataverse.org

10977 Globus Upload optimization - batch file size lookups #11040

Open landreev opened 1 day ago

landreev commented 1 day ago

What this PR does / why we need it:

See the linked issue.

Which issue(s) this PR closes:

Special notes for your reviewer:

The guide entry explains what was done in the PR. My main concern (aside from a large portion of the code in GlobusServiceBean needing an overall refactoring) is that we have very few tests for the Globus functionality, and I am at a bit of a loss as to how to create such tests. I may go ahead and open a separate issue for that effort.

Suggestions on how to test this:

Globus functionality can be tested on dataverse-internal. There is one NESE storage volume there (labeled NESEtapeDemo) that is identical to the one in prod., except that it is connected to a NESE volume reserved specifically for testing. The instructions for prod. end users should work exactly the same way on dataverse-internal.

The actual test would involve successfully uploading a large enough number of files in a single batch to trigger the new optimization. This would be 50+ files by default; alternatively, the :GlobusBatchLookupSize setting could be set to some lower number for this test.
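
For planning the test, the threshold described above can be sketched as follows. This is only an illustration in Python, not the code in this PR: the function name `uses_batch_lookup` is hypothetical, and reading "50+" as an inclusive comparison (`>=`) is an assumption. The actual check is in the Java code and may differ at the boundary.

```python
# Illustrative sketch only; the real logic is in the Java code of this PR.
# Default value of :GlobusBatchLookupSize, as stated in the PR description.
DEFAULT_BATCH_LOOKUP_SIZE = 50


def uses_batch_lookup(file_count, batch_lookup_size=DEFAULT_BATCH_LOOKUP_SIZE):
    """Return True if an upload of file_count files should trigger the
    batch file size lookup.

    Assumption: the threshold is inclusive, i.e. "50+ files" means
    file_count >= 50. Verify against the actual implementation.
    """
    return file_count >= batch_lookup_size


if __name__ == "__main__":
    # With the default setting, a 10-file upload takes the old path.
    print(uses_batch_lookup(10))
    # With the setting lowered to 5 for the test, the same upload
    # takes the new batch path.
    print(uses_batch_lookup(10, batch_lookup_size=5))
```

Because the exact boundary behavior is assumed here, a test upload that is comfortably above the configured threshold (rather than exactly at it) is the safer choice.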

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

github-actions[bot] commented 1 day ago

:package: Pushed preview images as

ghcr.io/gdcc/dataverse:10977-globus-filesize-lookup
ghcr.io/gdcc/configbaker:10977-globus-filesize-lookup

:ship: See on GHCR. Use the images by referencing their full names as printed above, and mind the registry name.