samvera / hyrax

Hyrax is a Ruby on Rails Engine built by the Samvera community. Hyrax provides a foundation for creating many different digital repository applications.
http://hyrax.samvera.org/
Apache License 2.0

Wings: Round trip conversion of files does not maintain file relationships #4060

Closed elrayle closed 4 years ago

elrayle commented 4 years ago

Descriptive summary

When an ActiveFedora-based FileSet is converted to a Valkyrie resource and then back to an ActiveFedora-based FileSet, its files are not converted correctly.

Rationale

Round trip conversion should produce the same object.

Initial ActiveFedora based FileSet

Expected and Actual behavior for FileSet

NOTE: IDs are simplified to make this easier to read. Typical file IDs and URIs look like...

id  # "191dc5d4-66ea-4897-ab25-11fe7340bf8a/files/01ccf5d8-c136-48c5-af16-ecc513ca69f2"
uri # "http://127.0.0.1:8986/rest/test/19/1d/c5/d4/191dc5d4-66ea-4897-ab25-11fe7340bf8a/files/01ccf5d8-c136-48c5-af16-ecc513ca69f2" 

Results of method calls on fileset1

fileset1.class # FileSet
fileset1.id # abc12345
fileset1.original_file # <Hydra::PCDM::File uri="http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/xyz67890">
fileset1.filter_files_by_type(RDF::URI.new('http://pcdm.org/use#OriginalFile')) # [<Hydra::PCDM::File uri="http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/xyz67890">]
fileset1.original_file.id # "ab/c1/23/45/abc12345/files/xyz67890"
fileset1.original_file.uri # "http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/xyz67890"
fileset1.files # [#<Hydra::PCDM::File uri="http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/xyz67890" >]

Accessing the file through filter_files_by_type and through the convenience method original_file returns the same file, with the same URI and the same id. The id is a shortened version of the URI with the leading Fedora base URL (protocol, host, and base path) removed.
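The relationship between the id and the URI can be sketched in plain Ruby. This is only an illustration: `BASE_URL` and `id_from_uri` are hypothetical names, and the base URL value is copied from the simplified example URIs in this issue rather than read from any ActiveFedora configuration.

```ruby
# Hypothetical base URL, copied from the simplified example URIs in this issue.
BASE_URL = "http://127.0.0.1:8986/rest/test/"

# Hypothetical helper: derive the short id by stripping the base URL prefix.
def id_from_uri(uri, base_url = BASE_URL)
  uri.delete_prefix(base_url)
end

uri = "http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/xyz67890"
short_id = id_from_uri(uri)
```

With the simplified values used here, `short_id` matches the string reported by `fileset1.original_file.id`.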

NOTE: The FileSet does not respond to the following methods.

fileset1.original_file_id  # NoMethodError
fileset1.original_file_ids # NoMethodError
fileset1.file_ids # NoMethodError

Behavior when converting to a Resource

Expected behavior

resource.class # <Class:0x00007fedc24726a8>
resource.original_file_ids # [<Valkyrie::ID id="ab/c1/23/45/abc12345/files/xyz67890">]
resource.original_file_ids.first.to_s == fileset1.original_file.id # true
resource.file_ids # [<Valkyrie::ID id="ab/c1/23/45/abc12345/files/xyz67890">]
resource.original_file # NoMethodError

Actual behavior

resource.class # <Class:0x00007fedc24726a8>
resource.original_file_ids # [<Valkyrie::ID id="ab/c1/23/45/abc12345/files/xyz67890">]
resource.original_file_ids.first.to_s == fileset1.original_file.id # true
resource.file_ids # [<Valkyrie::ID id="ab/c1/23/45/abc12345/files/xyz67890">]
resource.original_file # <Hydra::PCDM::File uri="http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/DIFFERENT_ID">

The resource.original_file's URI is different from fileset1.original_file's URI. For the valkyrization process, it is probably undesirable for the original_file method to exist on the resource. The expected behavior shows this method raising NoMethodError.

The original_file method is created on the resource as part of the property-to-attribute translation process in the ModelTransformer.

Behavior when converting back to an ActiveFedora based FileSet

Expected behavior

rt_fileset.class # FileSet
rt_fileset.id # abc12345
rt_fileset.original_file # <Hydra::PCDM::File uri="http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/xyz67890">
rt_fileset.filter_files_by_type(RDF::URI.new('http://pcdm.org/use#OriginalFile')) # [<Hydra::PCDM::File uri="http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/xyz67890">]
rt_fileset.original_file.id # "ab/c1/23/45/abc12345/files/xyz67890"
rt_fileset.original_file.uri # "http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/xyz67890"
rt_fileset.files # [#<Hydra::PCDM::File uri="http://127.0.0.1:8986/rest/test/ab/c1/23/45/abc12345/files/xyz67890" >]
rt_fileset.original_file.id == fileset1.original_file.id # true

Actual behavior

rt_fileset.class # FileSet
rt_fileset.id # abc12345
rt_fileset.original_file # nil
rt_fileset.filter_files_by_type(RDF::URI.new('http://pcdm.org/use#OriginalFile')) # nil
rt_fileset.original_file.id # NoMethodError (because original_file is nil)
rt_fileset.original_file.uri # NoMethodError (because original_file is nil)
rt_fileset.files # nil
rt_fileset.original_file.id == fileset1.original_file.id # NoMethodError (because rt_fileset.original_file is nil)
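Because the actual round-tripped FileSet returns nil for original_file, a direct comparison of ids raises NoMethodError. A nil-safe version of the same check might look like the sketch below. The `FakeFile` and `FakeFileSet` structs and the `same_original_file?` helper are hypothetical stand-ins for illustration; they are not part of Hyrax, Wings, or Hydra::PCDM.

```ruby
# Stand-ins for illustration only; the real objects would be a FileSet
# and a Hydra::PCDM::File.
FakeFile = Struct.new(:id)
FakeFileSet = Struct.new(:id, :original_file)

# Hypothetical check: true only when both objects expose an original
# file and the two file ids are equal. Safe navigation avoids the
# NoMethodError seen when original_file is nil.
def same_original_file?(before, after)
  before_id = before.original_file&.id
  after_id = after.original_file&.id
  !before_id.nil? && before_id == after_id
end

file_id = "ab/c1/23/45/abc12345/files/xyz67890"
fileset1 = FakeFileSet.new("abc12345", FakeFile.new(file_id))
expected_rt = FakeFileSet.new("abc12345", FakeFile.new(file_id)) # expected behavior
actual_rt = FakeFileSet.new("abc12345", nil)                     # actual behavior

expected_result = same_original_file?(fileset1, expected_rt)
actual_result = same_original_file?(fileset1, actual_rt)
```

A check of this shape reports the lost file relationship as a false result instead of failing with an exception partway through the comparison.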

Steps to reproduce the behavior

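fileset1 = FileSet.new
binary = StringIO.new("hey")
# Attach the binary as the FileSet's original file
Hydra::Works::AddFileToFileSet.call(fileset1, binary, :original_file)
# Convert to a Valkyrie resource, then back to an ActiveFedora FileSet
resource = fileset1.valkyrie_resource
rt_fileset = Wings::ActiveFedoraConverter.new(resource: resource).convert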

Related work

PR #4055 Convert file ids in resource fileset to pcdm files in AF fileset

elrayle commented 4 years ago

Fixed by #4055 - closing