Islandora / documentation

Contains islandora's documentation and main issue queue.
MIT License
104 stars 71 forks source link

Add more granular fields for taxonomy terms #888

Closed dannylamb closed 5 years ago

dannylamb commented 6 years ago

Right now we're using field_tags which is supplied by Drupal core. We need to split our fields and taxonomy vocabularies apart to prevent some bad scenarios and give a better/less confusing UX.

We need a field_model for what an object represents (e.g. Image, Video, Audio) or media's usage (e.g. Preservation Master, Service File). I would even consider applying the external URIs for these terms as rdf:types instead of schema:additionalTypes. That could be done in the alter we have for jsonld.

We'll also need a field_access for terms used with the permissions_by_term module. We should use the field_permissions module to lock down access to the field for non-administrators.

The remaining terms that we just use as flags for the context module need a field_behaviour (or some such... I'm not sure what the best name for this is).

We should also split about the terms into separate vocabs. One vocab for object types, one for media usage, one for access control, and one for the "behaviour" terms. Once those are split apart, we can restrict the fields to their appropriate vocabs and just offer a dropdown instead of an autocomplete.

mjordan commented 6 years ago

@dannylamb when you say "We need a field_model for what an object represents (e.g. Image, Video, Audio) or media's usage (e.g. Preservation Master, Service File)", do you mean we need both? I think it will be very useful to split these out into separate vocabularies.

For the "behavior" tags, are you referring to things like use Openseadragon? Are these always associated with "viewer" or Field Formatter behavior as discussed in #811?

dannylamb commented 6 years ago

@mjordan I was thinking separate vocabularies. One for "what type of digital object this node represents?" and one for "what is the use of this media?". The field, on the other hand, could be shared (not that two separate fields for clarity would be terrible).

For behaviour tags, I am indeed referring to things like Openseadragon, which affect the inner workings of the system but aren't really a subject contained in the source material. This includes, but is certainly not limited to, alternate viewers.

mjordan commented 6 years ago

Cool. So should we be looking for suitable Linked Data vocabularies to use in these... Drupal vocabularies?

dannylamb commented 6 years ago

Yeah totally. I've got a mish-mash of the COAR controlled vocabulary and dcmi in there for Models. And am straight up making up the Openseadragon uri (just pointing to its homepage). Actual librarian input is welcome/badly needed.

If anyone wants to change what's up there, you can issue a PR to change this csv. New vocabularies would require a new feature export as well, but just changing URIs could be done by simply changing the csv.

mjordan commented 6 years ago

I'll add Transcript and ExtractedText (borrowed from PCDM) later today.

rosiel commented 6 years ago

Proposal (feel free to counter-suggest!):

"Islandora Object Type":

"Islandora Media Category":

"Islandora Access Control"

mjordan commented 6 years ago

@rosiel can I suggest "Private (Authenticated users only)" since it's more consistent with Drupal's UI language?

DiegoPino commented 6 years ago

Should access control be managed directly via Drupal via no islandora intervention. The desired Drupal way? I mean Whatever we do it will require overriding Drupal´s access control anyway

rosiel commented 6 years ago

@DiegoPino sure, it can. My suggestion was to "ship with" a default taxonomy and set of rules so that out of the box, Islandora has private and public objects. Since it's all Drupal, you should be able to go in and change it (or remove it) to your repo admin heart's content.

DiegoPino commented 6 years ago

And now i see permissions_by_term module in place.... see? confused as hell.

mjordan commented 6 years ago

@DiegoPino some of this is covered in #823 and #826.

DiegoPino commented 6 years ago

@mjordan totally, thanks. I'm just reading the whole tax. term thing. Will remove my huge comment here, just add confusion to the discussion sorry @rosiel . Also saw the video on the Permissions by term thing and i feel shipping these ones is complicated.

Since terms need to be associated to existing roles/users. When you ship this tax terms users don´t exist yet. So only admin is there and has access to anything by default.

So what really controls the access (and by doing so defines the "label that makes sense for the taxonomy" is the actual drupal role, with its permissions, like editing content, deleting, etc.

I feel if we fix the term to something that is not reflected in the actual permission we will confuse people (some people like me) Other option would be to ship roles, like metadata-editor, media grunt, etc?

Seems like the one term that could be set by default is anonymous one (public) without having to depend on other roles/users to be present

mjordan commented 6 years ago

When you ship this tax terms users don´t exist yet. So only admin is there and has access to anything by default.

Yes, isn't that exactly how Drupal roles work? You add/remove users as needed.

We've been using taxonomies for access control on our D6 institutional repository for over 6 years and no one has found it overly confusing.

DiegoPino commented 6 years ago

@mjordan Good for you and your team, very glad to hear no one had issue with using this, gives me a lot of hope!

@rosiel your feedback here is super appreciated since you have seen this working in production. I do have questions and don't want to take over this thread, promise this is the last post here

I just spend some time playing with this and now agree with all tax. terms rosie++ suggests (sorry was just confused meant no harm), but still have issues with access: Will replicate some steps here using "permissions_by_term" module, please let me know if this is how it is done because i see many ways and i fear i'm not that smart using this.

1) Give a new drupal role named "metadata-manager" access to create/edit permissions for all the content types listed (but not delete) 2) Create a new vocabulary named "Islandora Access Control" 3) Create a new term there named "Private but can't delete" and set permissions for that term so the 'metadata-manager' role can see/use it 4) go to structure/content types, select article, add a new entity reference field, select taxonomy, select new vocabulary "Islandora Access Control", remove auto create option, allow single one term only (because not sure if mixing two tags will result in optimistic or pesimistic enforcement) 5) assign a new user to that role, log in as that role 6) Create new article, set "Private but can't delete" in the new field for this content type. I can see it after save (good!) 7) log out. 8) anonymous does not even know the terms exists nor can see and find the new article (good) 9) Admin goes and says, hey! "metadata-manager" user group should now be able to delete stuff.. so goes to user-> permissions and sets delete permissions for that role.

10) but wait. (because admin == me so confused).. tax term is named "Private but can't delete", so i need to change the term label? to match the new permissions so my users can understand their new power?. yes! And i can (good!) 11) But then admin thinks, mmm, maybe it should be named "only you will be able to edit anything and delete too, but no one else". Is the admin right? or should the admin just be worried about world peace? And let write/edit be managed in a different way? other Vocabulary?

12) So admin goes for "Private, full control for metadata users",

13) now admin adds a new content type. Admin needs to update role permissions right? Or, second option, change the term label so people using this know that using that term will not allow editing/creating/deleting for that newly created content type? That is where i start getting confused. By the many clicks and going backs and forth 😟

Because whole thing is the label and needs to be expressive enough to match what the real permission does and permissions tend to be segmented by CRUD in repositories. So private says too much or too little. So maybe for someone as complicated as i am would prefer to split access v/s edit v/s purge and deal with them as separate permissions because also that user comes from 7.x where that is possible.

So maybe having roles and a full set of tax terms to mimic same combinatory possibilities shipped (because all is just a symlink to the roles or at least i understand it like that) could be helpful?

I remember @rosiel did a nice matrix some time ago with all access combinatories. How can that be mapped to tax terms.

And here is where i drop the ball but thanks for reading. Too complicated for the end of the day!

dmoses commented 6 years ago

I'm not sure I'm going down the right path here, but I would like to suggest the following types for "Islandora Object Type":

dannylamb commented 6 years ago

@dmoses So that's going to be more of our "Agent" content type. From a 7.x MODS this'll pretty much directly translate to a mods name element. One of the outputs of the upcoming sprint is going to be something that very much resembles what you're asking for.

dannylamb commented 6 years ago

@DiegoPino We're not trying to generically solve this for everyone. Just provide a basic vocab and some terms so people have a place to start and don't have to go through the full process like you just ran through. Fine grained access control is going to vary wildly from institution to institution and I don't think we can get away with prescribing too much. We'd just be stepping on people's toes at that point.

Which is a good segway into saying that no matter what consensus decides to ship as a feature, any other modules/means of access control will still be available for use. I'm sure someone will raise their hand for Organic Groups eventually, and we don't want to alienate them.

ajs6f commented 6 years ago

I'm sure someone will raise their hand for Organic Groups eventually

That would be @Smithsonian. Of course we are in no way asking for that now or anytime soon.

DiegoPino commented 6 years ago

@dannylamb sounds good, thanks for clarifying. @ajs6f organic groups are like your bat-signal 👍 ! I agree OG are cool. I'm ok with whatever means less maintenance and more UI/UX consistency

dannylamb commented 6 years ago

@ajs6f You probably won't have to ask. You should be able to just download and enable the module.

ajs6f commented 6 years ago

@dannylamb That makes me feel like :gift_heart:.

rosiel commented 6 years ago

@DiegoPino I (finally, sorry) had the chance to go through your step-by-step. Thank you!!! For actually implementing it! I know we've been saying for a while "We'll use Taxonomy Access Control" but the best-practices for implementation might need some thinking through (and documentation, as you did!)

I wouldn't necessarily use taxonomy terms to define the level of access (like you said, the label can become misleading if it's tied to how much access you have). Here's my user story:

  1. I want to replicate a multisite with two namespaces, upei: and unbsj: . I go into the Islandora Access Control vocabulary and add two terms, 'upei' and 'unbsj'.

  2. I create roles, upei_admin, unbsj_admin, and give them full access to the objects with "their" term.

  3. If I want multi-level access, i could make upei_content_creator, unbsj_user, etc, and give those roles selective access on "their" taxonomy terms.

It might(?) be possible to also have a 'private' or 'public' tag (don't need both, i suppose) that controls whether anonymous (logged out users) can view a thing, orthogonal to the user access controls. Or a "draft" status that can be part of a replacement for Simple Workflow. But that remains to be seen, and would require some additional setup. But we won't have to code much (which is good, right?)

We'll definitely have to see how multiple tags/roles interact, and document it, so that we can create some sensible "use cases" to illustrate, (and that work as intended).

whikloj commented 6 years ago

The discussion on this ticket went all over the place. Fine places, interesting places, but I'm trying to understand the work here and I think it is to add 2 new fields to islandora_demo for model and access?

dannylamb commented 6 years ago

@whikloj That was the original premise, and I think still needs to be done. @rosiel's initial proposal to break up the vocabularies would be good to throw in, too. The vocabulary for media will have overlap with https://github.com/Islandora-CLAW/CLAW/issues/854, and possibly resolve it if you toss in a few more entries for all the 7.x DSIDs.

whikloj commented 6 years ago

Ok, I can take this.

I'm going to add 3 new fields to the default Repository Item content-type.

  1. field_model which will store the "type" of resource.
  2. field_access which will store the ...something... (term to set permissions by)
  3. field_display or field_render to store the way a resource is displayed.

The above will be part of islandora_demo.

Then I will separate the existing taxonomy terms into separate vocabularies. This will be in islandora proper.

dannylamb commented 6 years ago

Then I will separate the existing taxonomy terms into separate vocabularies. This will be in islandora proper.

The config for the vocabularies will go in islandora_core_feature, and you'll want to add/update those terms in the migration csv so they get slurped in on install.

whikloj commented 6 years ago

@mjordan @rosiel Regarding field_display and eventually field_access. Should these terms include an external URI as well? We stuck the OpenSeadragon URL for it, but I'm not sure if that is useful. A URI for the model makes sense, but the rest seems more in the display layer and less related to the preservation. Unless we are thinking about how to generate a CLAW site from a Fedora repository.

mjordan commented 6 years ago

What would those external URIs look like? Do we need to find an ontology(ies) that we want to use?

mjordan commented 6 years ago

@whikloj not for access for sure. If we delegate access control to a front-end system like Drupal (which we're doing), moving to another front end will require an entire rearchitecting of the access control system anyway. So we wouldn't want resource-level access control info in our repository. Plus who is allowed (as defined by the font end's roles/permissions/etc.) to see/do what changes over time. More trouble that it's worth.

whikloj commented 6 years ago

Ok I'm dropping this one and running for the hills of Java and a Fedora sprint next week.

We have hard coded field names into so much of the machinery that I think we need to just stop making it seem like you can just make any content-type into something that works with CLAW.

You need to use field_media_of and field_tags and your Taxonomy terms all must have an external uri (in a field called field_external_uri).

I started to pull this apart but then everything stopped working. By everything I mean the derivative creation/linking.

So, to recap. Easy to use but less flexible, or harder to use and more flexible.

I will say that I am sooooooo frustrated with this issue that I am wishing to pull Drupal's core out of its chest and let its watch its final few processes.

Here is my work up-to failure. Islandora Islandora Demo

dannylamb commented 6 years ago

@whikloj Judging from the diff you're like 90% there. I can suit up and take on the async debugging to get this over the finish line.

And FWIW I try to describe what fields are required in the README: https://github.com/Islandora-CLAW/islandora#content-modeling , but I admittedly don't have the best words.

whikloj commented 6 years ago

@dannylamb I was totally not trying to blame you.

I was first of all super-frustrated and I feel we went from the very restrictive Islandora 1.x and tried to go super open for people. Which is a laudable goal, but causes its own troubles.

I think the text you have is good, but we either need to find programatic ways (like I was trying) to check all fields, or have a way to configure the field you are using, or explicitly state that all media types need a X field and all content types need a Y field and if you don't do that, you will have trouble.

dannylamb commented 6 years ago

@whikloj I managed to sort it out picking up from where you left it. I only changed two things:

  1. I ditched the field_external_uri_single field and just made field_external_uri single valued instead.
  2. I'm very explicitly using field_media_of and field_media_usage when adding/updating a Media for a Node. So we'll need to document that more clearly than what's currently in the README.

Once I did that, everything started working. It's gonna be a bear to test this one, because I had to disable/re-import a couple of features at different times, and had to re-run migrations to make sure the tags go their external uris, etc... I'll see if I can figure out the simplest test instructions possible for this.

PR pending once I double check the status of tests.

seth-shaw-unlv commented 6 years ago

Just tested Islandora-Devops/claw-playbook/pull/79 and it looks good.

The one change I would revert back is the content browser on the Media form. I do like the concept of having the media browser, but the auto-complete field pre-populates with the relevant node when initiated from the Node's media tab while it doesn't with the content browser. This results in more work for the user in the usual case rather than less.

dannylamb commented 6 years ago

@seth-shaw-unlv Yeah, I stumbled head first into that during the demo I gave. prepopulate doesn't play well content_browser. That also means it probably won't work when adding a member either. I'll confirm, and if so, will revert altogether. We'll have to sort out if it's worth having custom code or if we can live without content_browser for those two fields.

Natkeeran commented 6 years ago

@dannylamb Tested this. Mostly works as expected. Some UI improvement may be needed.

Did not test "Access Control". Not sure how to test this.

Adding member of required 4 clicks. If we can somehow specify content type for the content_browser, that might work. Otherwise, the user experience is bit lacking.

We can live without the content browser to populate the member of for the media. Because, user won't have to usually change that.

Identifier is grouped under System. Identifier may be a metadata field.

I assume general field group fields are literal values and related field group fields are uris. Nevertheless, a cataloguer might find the positioning of these elements odd. When describing a book, one would input the title, author, publication information, subject etc. A cataloguer or meta-data librarian may need to review this. Note that this is simply re arranging the display form order, so not a blocker.

dannylamb commented 6 years ago

@Natkeeran @seth-shaw-unlv

I've updated the issue-888 branches of islandora_core_feature and islandora_demo_feature to revert back to autocompletes for field_media_of and field_member_of. I also moved stuff around based on @Natkeeran's feedback and made the fieldsets collabsible (default to open).

If you want to check them out, you should be able to just pull the changes down from github and re-import the features.

I'm hoping these changes will make this more meaningful for folks. I'm making my best guess here, but it's really feedback from librarians that's going to dictate things at the end of the day.

dannylamb commented 6 years ago

Ok, and I made PRs for everything, instead of just relying on the installer to pull in branches.

So, gonna have to do this in a specific order, since I had to resort to hackery to make this testable. Basically, if folks are ready to merge, i'll have to go in and change the composer.json for each feature/module to point back to 8.x-1.x branches instead of issue-888. Then we can merge. Then I'll update https://github.com/Islandora-Devops/claw-playbook/pull/79 to reference 8.x-1.x branches again, and we can merge that.

Natkeeran commented 6 years ago

@dannylamb

Seems like you have changed the Subject field to a simple autocomplete of the Subject taxonomy. Works for now. We can wait for the MIG to provide further guidance. Few other details will also need MIG input.

I am assuming Access Control term do not have any behavioural impact right now.

Separating the tags improves the authoring experience for sure.

We can merge this.

dannylamb commented 6 years ago

@Natkeeran @seth-shaw-unlv All clear for me to update branches in the various composer.json files to point back to 8.x-1.x?

dannylamb commented 6 years ago

@Natkeeran And yes, I've left the Access Control field open for now. I could have put in some dummy tags like "Admin Only" or something, but that can already be done with the normal user/role based permissions. If we have a common scenario that requires more expressive authz controls, we can throw that in later.

Natkeeran commented 6 years ago

@dannylamb I am good to go.

dannylamb commented 6 years ago

@Natkeeran I'm adding Travis to islandora_demo https://github.com/Islandora-CLAW/islandora_demo/pull/4

After that I'll switch the branches back and ping you.

dannylamb commented 6 years ago

@Natkeeran Thanks for hanging tight while I ride this roller coaster. I had to make some changes to get tests to pass, and couldn't resist the urge to clean up something sooner rather than later. I moved the vocab and field storage for the access control taxonomy terms to islandora_demo so that we're completely decoupled from AuthZ schemes. This way someone who wants to use og isn't pinned down by permissions_by_term.

I certainly didn't intend to change functionality, just to clean up dependencies. Travis is green and I'm confident these pulls are good to go, but I'd feel better if someone spun up another box just to confirm. The changes were significant enough to warrant it, let's say. Can I get one last set of eyes on this from an @Islandora-CLAW/committers?

dannylamb commented 6 years ago

FYI I'm testing with https://github.com/Islandora-Devops/claw-playbook/pull/79/commits/28b3e6561f0b9fd94be83d7a7af49052f0688d29

The latest commit on that PR is pointing at 8.x-1.x, which is what we want to merge in. But to test, you need that commit to pull from issue-888 branches. I'll still have to update https://github.com/Islandora-CLAW/islandora_demo/pull/4 one last time once this gets committer approval.