Closed djjuhasz closed 2 years ago
@djjuhasz Based on the paths expected in the tests, it looks like we don't always want just the basename for files in an AIP, but the full relative path from the root directory (which we're currently assuming is always %transferDirectory%objects
, but may actually be %transferDirectory%data
if I'm understanding correctly).
It might be worth looking at os.path.relpath
or check explicitly for the full strings we want to remove and if either is found, splice them from the beginning of the string.
@tw4l I've updated the code to return the relative paths from the root directory.
@tw4l third time is the charm? 😆
@tw4l nevermind, the unit tests failed in Python 2.7 😢
@djjuhasz It looks likeunittest.TestCase.assertLogs
is new in Python 3.4, which would explain the failing test for 2.7: https://docs.python.org/3/library/unittest.html?highlight=unittest#unittest.TestCase.assertLogs. You could change the test case to verify that the function returns None
for an invalid path?
@tw4l four to go? 😜
@djjuhasz LGTM!
No need to change this, but looking at the tests again it occurred to me that you might be interested in pytest's parametrize functionality.
It enables combining tests like these and testing different combinations of inputs and outputs, so that these three tests:
def test_get_original_relpath_objects_dir(self):
path = "%transferDirectory%objects/folder1/file5.txt"
assert create_dip.get_original_relpath(path) == "folder1/file5.txt"
def test_get_original_relpath_data_dir(self):
path = "%transferDirectory%data/folder1/file5.txt"
assert create_dip.get_original_relpath(path) == "folder1/file5.txt"
def test_get_original_relpath_warn_invalid_prefix(self):
path = "%transferDirectory%datas/folder1/file5.txt"
assert create_dip.get_original_relpath(path) is None
can become, e.g.:
import pytest
...
@pytest.mark.parametrize(
"path,expected_return",
[
# %transferDirectory%objects prefix
(
"%transferDirectory%objects/folder1/file5.txt",
"folder1/file5.txt"
),
# %transferDirectory%data prefix
(
"%transferDirectory%data/folder1/file5.txt",
"folder1/file5.txt"
),
# Invalid prefix
(
"%transferDirectory%datas/folder1/file5.txt",
None
),
],
)
def test_get_original_relpath(path, expected_return):
assert create_dip.get_original_relpath(path) == expected_return
2ee73fa9cc43ab0b6d6395bc4aa8173215935c36 and e5a9f8077dfa19c87897d2f41610636d4faaf9b8
Use
os.path.basename()
to get file names from the<premis:originalName>
path.The previously assumed path of
%transferDirectory%objects
is not valid for BagIt transfers, which lead to truncating the first three characters of the filename in the DIP objects zip file.