Core package to analyze gravitational-wave data, find signals, and study their parameters. This package was used in the first direct detection of gravitational waves (GW150914), and is used in the ongoing analysis of LIGO/Virgo data.
Currently the resolve_url function uses os.stat to check whether a file is already present in the resolved location.
However, this check does not work for copies of files, as the CAM times will be different.
This change compares a hash of the files instead, computed by default over only the first 1e7 bytes. This makes it more likely that the conditions required to invoke the no-op are met.
The hash is limited to the first 1e7 bytes because that seems to be safe, and it avoids loading entire files such as the HDF_TRIGGER_MERGE files.
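As a rough illustration of the truncated-hash comparison described above, here is a minimal sketch. The helper names (partial_file_hash, files_match), the choice of SHA-256, and the chunked read are assumptions made for this example and are not taken from the actual resolve_url code.

```python
import hashlib


def partial_file_hash(path, max_bytes=10_000_000, chunk_size=1_000_000):
    """Return a SHA-256 hex digest of at most the first max_bytes of a file.

    Hypothetical helper: the name, the defaults and the hash algorithm
    are illustrative assumptions.
    """
    digest = hashlib.sha256()
    remaining = max_bytes
    with open(path, "rb") as handle:
        while remaining > 0:
            # Read in chunks so a large file is never held in memory whole.
            chunk = handle.read(min(chunk_size, remaining))
            if not chunk:
                break  # reached end of file before hitting the byte limit
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def files_match(path_a, path_b, max_bytes=10_000_000):
    """Compare two files by the hash of their leading bytes only."""
    return (partial_file_hash(path_a, max_bytes)
            == partial_file_hash(path_b, max_bytes))
```

One consequence of truncating the hash is that two files which differ only beyond the first max_bytes compare as equal, so the limit is a trade-off between read cost and certainty.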
Standard information about the request
This is a bug fix to an efficiency saving.
This change affects all code areas which use workflow generation.
Motivation
The resolve_url function was intended to take a no-op shortcut when the local file already exists, but that shortcut was not being used. This change allows it to take effect.
Contents
Replace the os.stat() check for whether two files are the same with a comparison of a reduced hash of the files.
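The effect on the copy step can be sketched as follows. This is an illustrative stand-in, not the resolve_url implementation: copy_unless_present and _head_digest are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import os
import shutil


def _head_digest(path, max_bytes):
    """Digest of the first max_bytes of a file (hypothetical helper)."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read(max_bytes)).digest()


def copy_unless_present(src, dest, max_bytes=10_000_000):
    """Copy src to dest unless dest already holds matching content.

    Returns True if a copy was made and False if the no-op was taken.
    """
    if os.path.isfile(dest) and (
        _head_digest(src, max_bytes) == _head_digest(dest, max_bytes)
    ):
        # The leading bytes match, so treat dest as already resolved.
        return False
    shutil.copy(src, dest)
    return True
```

Because the comparison looks at content rather than file metadata, a file that was copied into place earlier is recognised on the next call and is not copied again.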
Testing performed
Ran the upload prep minifollowup creation script when the file had already been copied to the cwd. It did not attempt to copy the file again.
[x] The author of this pull request confirms they will adhere to the code of conduct
This change follows the style guidelines (see e.g. PEP8) and has been proposed using the contribution guidelines.