samvera / hyrax

Hyrax is a Ruby on Rails Engine built by the Samvera community. Hyrax provides a foundation for creating many different digital repository applications.
http://hyrax.samvera.org/
Apache License 2.0

Create Hyrax::FileSet resource #4081

Closed elrayle closed 1 year ago

elrayle commented 5 years ago

Descriptive summary

Create a resource class that filesets will extend.

Analysis

Hydra::Works::FileSetBehavior

Hydra::Works::FileSet has multiple includes brought in via Hydra::Works::FileSetBehavior:

      include Hydra::PCDM::ObjectBehavior
      include Hydra::Works::ContainedFiles
      include Hydra::Works::Derivatives
      include Hydra::Works::MimeTypes
      include Hydra::Works::VersionedContent

Hydra::PCDM::ObjectBehavior

Defines the files relationship. This should be converted to a Valkyrie attribute that is an array of Valkyrie IDs.

Defines methods:

More Info Needed: Need to determine if these methods can become services.

Hydra::Works::ContainedFiles

This defines relationships to files. They can be translated to Valkyrie attributes.

| FROM | TO |
| --- | --- |
| original_file | original_file_id |
| thumbnail | thumbnail_id |
| extracted_text | extracted_text_id |
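
To make the FROM/TO mapping concrete, here is a minimal plain-Ruby sketch of a resource that holds IDs instead of relationships. It does not use Valkyrie itself: `FileSetSketch`, `RELATIONSHIP_TO_ATTRIBUTE`, and the `file_ids` attribute name are assumptions for illustration only, and a real Valkyrie resource would declare typed attributes rather than plain accessors.

```ruby
# Hypothetical mapping from the Hydra::Works relationship names to the
# proposed ID attributes (taken from the FROM/TO list above).
RELATIONSHIP_TO_ATTRIBUTE = {
  original_file: :original_file_id,
  thumbnail: :thumbnail_id,
  extracted_text: :extracted_text_id
}.freeze

# Plain-Ruby stand-in for a resource; NOT the real Hyrax::FileSet.
class FileSetSketch
  # file_ids is an assumed name for the "array of IDs" that would replace
  # the files relationship from Hydra::PCDM::ObjectBehavior.
  attr_reader :file_ids, :original_file_id, :thumbnail_id, :extracted_text_id

  def initialize(file_ids: [], original_file_id: nil, thumbnail_id: nil, extracted_text_id: nil)
    @file_ids = file_ids
    @original_file_id = original_file_id
    @thumbnail_id = thumbnail_id
    @extracted_text_id = extracted_text_id
  end
end

# The resource stores only identifiers; loading the file itself would be
# the job of a separate query or service.
file_set = FileSetSketch.new(file_ids: ["file-1", "file-2"], original_file_id: "file-1")
```

The point of the sketch is that the model carries no file-loading behavior of its own; each former relationship becomes a simple ID value.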

Hydra::Works::Derivatives

The only method added by this is #create_derivatives.

Hyrax has an equivalent service, Hyrax::FileSetDerivativesService, that defines the same method. It is called via delegation in Hyrax::FileSet::Derivatives, which is included in FileSet through the Hyrax::FileSetBehavior module. That module is in turn included in the generated FileSet installed in new apps.

Hydra::Works::FileSetBehavior is included earlier than Hyrax::FileSet::Derivatives, so the #create_derivatives defined in Hyrax overrides the one defined in Hydra::Works.
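
This override follows from Ruby's method lookup: a module included later sits ahead of earlier includes in the ancestor chain. A minimal sketch, using hypothetical stand-in modules with flat includes rather than the real Hydra::Works and Hyrax code:

```ruby
# Stand-in for Hydra::Works::Derivatives (hypothetical, simplified).
module WorksDerivativesSketch
  def create_derivatives
    :hydra_works
  end
end

# Stand-in for Hyrax::FileSet::Derivatives (hypothetical, simplified).
module HyraxDerivativesSketch
  def create_derivatives
    :hyrax
  end
end

class DemoFileSet
  include WorksDerivativesSketch  # included first
  include HyraxDerivativesSketch  # included later, so it is found first
end

# The later include appears earlier in the lookup chain.
lookup_chain = DemoFileSet.ancestors.first(3)
result = DemoFileSet.new.create_derivatives
```

Here `lookup_chain` is `[DemoFileSet, HyraxDerivativesSketch, WorksDerivativesSketch]` and `result` is `:hyrax`, which is why removing the earlier include would not change which method runs.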

Analysis: It should be ok to drop this include. See Hyrax::FileSet::Derivatives below for more information on derivatives.

Hydra::Works::MimeTypes

Hydra::Works::MimeTypes defines methods like pdf?, video?, etc. I looked at creating a service for this functionality so it can be moved out of the FileSet model. This is adequate for uses of pdf? within Hyrax.

FileSetPresenter delegates these same methods to :solr_document, which is passed into the presenter as curation_concern from the controller. These methods are defined on the solr document by including Hydra::Works::MimeTypes in Hyrax::SolrDocumentBehavior.

There is work toward removing the need for Hydra::Works::MimeTypes in...

branch: mimetype_service
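
As a rough illustration of what moving these predicates out of the model could look like, here is a minimal sketch of a stateless service. `MimeTypeServiceSketch` and its MIME type checks are assumptions for illustration; they are not the contents of the mimetype_service branch or the lists used by Hydra::Works::MimeTypes.

```ruby
# Hypothetical service: answers MIME type questions from a string, so
# neither the FileSet model nor the solr document needs the mixin.
module MimeTypeServiceSketch
  # Illustrative list only; the real list may differ.
  PDF_TYPES = ["application/pdf"].freeze

  def self.pdf?(mime_type)
    PDF_TYPES.include?(mime_type)
  end

  def self.video?(mime_type)
    # to_s guards against nil when no MIME type has been recorded.
    mime_type.to_s.start_with?("video/")
  end
end

is_pdf = MimeTypeServiceSketch.pdf?("application/pdf")
is_video = MimeTypeServiceSketch.video?("video/mp4")
```

Because the service takes the MIME type as an argument, the same call would work whether the value comes from a model or from a solr document.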

Hydra::Works::VersionedContent

TBD

Hyrax::FileSetBehavior

Hyrax::FileSetBehavior brings in a large number of extensions. These are similar to extensions brought into works through Hyrax::WorkBehavior.

    include Hyrax::WithEvents                   # also in works
    include Hydra::Works::FileSetBehavior
    include Hyrax::VirusCheck
    include Hyrax::FileSet::Characterization
    include Hydra::WithDepositor                # also in works
    include Serializers                         # also in works
    include Hyrax::Noid                         # also in works
    include Hyrax::FileSet::Derivatives
    include Permissions                         # also in works
    include Hyrax::FileSet::Indexing
    include Hyrax::FileSet::BelongsToWorks
    include Hyrax::FileSet::Querying
    include HumanReadableType                   # also in works
    include CoreMetadata                        # also in works
    include Hyrax::BasicMetadata
    include Naming                              # also in works
    include Hydra::AccessControls::Embargoable  # also in works
    include GlobalID::Identification            # also in works

The following includes are in Hyrax::WorkBehavior but not in Hyrax::FileSetBehavior. They are listed to show how these work-specific behaviors were addressed in the creation of the Hyrax::Work resource.

    include Hydra::Works::WorkBehavior
    include HasRepresentative
    include HasRendering
    include WithFileSets
    include InAdminSet
    include NestedWorks
    include Suppressible
    include ProxyDeposit
    include Works::Metadata
    include Hyrax::CollectionNesting

Need to determine which, if any, of the work behavior includes were handled in PR #4070, which implemented the Hyrax::Work resource.

Related Work

PR #4070 Implement a Valkyrie-native Hyrax::Work model

no-reply commented 5 years ago

Hydra::PCDM::ObjectBehavior

It seems best to just avoid the non-attribute parts of this unless and until some Hyrax code needs to use them. At that point, the best pattern for getting equivalent behavior can emerge for the given case.

Hydra::Works::ContainedFiles

:+1:

Hydra::Works::Derivatives

:+1:

Hydra::Works::MimeTypes

> Hydra::Works::MimeTypes defines methods like pdf?, video?, etc. I looked at creating a service for this functionality so it can be moved out of the FileSet model. This is adequate for uses of pdf? within Hyrax.

An external service seems like a massive improvement in this case.

Hydra::Works::VersionedContent

This seems like the fun one, but I'd hope that getting this done should require no work on the actual FileSet model. I suspect versioning will look a bit like Permissions/ACLs, with a small DSL of its own(?).

Hyrax::FileSetBehavior

    include Hyrax::WithEvents                   # also in works
    include Hydra::Works::FileSetBehavior
    include Hyrax::VirusCheck
    include Hyrax::FileSet::Characterization
    include Hydra::WithDepositor                # also in works
    include Serializers                         # also in works
    include Hyrax::Noid                         # also in works
    include Hyrax::FileSet::Derivatives
    include Permissions                         # also in works
    include Hyrax::FileSet::Indexing
    include Hyrax::FileSet::BelongsToWorks
    include Hyrax::FileSet::Querying
    include HumanReadableType                   # also in works
    include CoreMetadata                        # also in works
    include Hyrax::BasicMetadata
    include Naming                              # also in works
    include Hydra::AccessControls::Embargoable  # also in works
    include GlobalID::Identification            # also in works

Many of these are open questions, but these have clear solutions:

Taking a quick hack at the rest:

no-reply commented 1 year ago

i think this was initially done in a7037611554e0c188f.

i wonder if an appropriate remaining scope for this ticket is to update inline documentation (as rendered at https://www.rubydoc.info/github/samvera/hyrax/Hyrax/FileSet) and maybe get some info about FileSet handling and File attachment into https://github.com/samvera/hyrax/wiki/Hyrax-Valkyrie-Usage-Guide ?

no-reply commented 1 year ago

can this one be marked ready with the proposed documentation scope?

tpendragon commented 1 year ago

Proposed Success Criteria

Create a new ticket to update documentation for Hyrax::FileSet explaining how it's used and document FileSet handling/File Attachment info in the Hyrax Valkyrie Usage Guide.

The goal of the latter documentation is to ensure a Hyrax developer can read it and have an understanding of how FileSets get created and persisted in Valkyrie, including how files are attached to them.

Then close this ticket (sorry, there's so much good stuff here and I think if we don't make a new ticket it means rewriting the whole description/title of this one and I'd hate to lose the history.)