kitodo / kitodo-production

Kitodo.Production is a workflow management tool for mass digitization and is part of the Kitodo Digital Library Suite.
http://www.kitodo.org/software/kitodoproduction/
GNU General Public License v3.0

Subordinate processes cannot be linked to superordinate process in another project #5626

Closed andre-hohmann closed 1 week ago

andre-hohmann commented 1 year ago

Describe the bug If a subordinate process (for example: volume) is created and the superordinate process exists, but in another project, the two processes are not linked during the import. Instead, a second superordinate process is created if the import depth is 2. If the process is linked manually, the following message occurs:

To Reproduce Steps to reproduce the behavior:

  1. Search an existing process of a volume and copy the Identifier - for example the PPN
  2. Import the metadata of the process in a different project than the one of the superordinate process - import depth is 2
  3. See the message with the number of imported processes - it will show: "2 processes retrieved from K10Plus-SLUB-PICA"
  4. Try to link the subordinate process with the existing superordinate process - the message "The search completed, but nothing was found. You can only find processes that have the same project and rule set." is shown

Expected behavior It should be possible to link subordinate processes of different projects with the same superordinate process, regardless of its project affiliation. This has worked in earlier versions.

Screenshots FehlerUeberordnung01

Release KITODO.PRODUCTION Version 3.6.0-SNAPSHOT


BartChris commented 1 year ago

I am wondering whether processes of the two projects must have the same ruleset in order for that to work. @matthias-ronge made a comment on that here: https://github.com/kitodo/kitodo-production/issues/4941#issuecomment-1025616071.

Or more precisely: The two processes probably still have to share the same process template (which defines the ruleset), even if relations to processes in other projects are allowed. So your issue is about being less strict on the project association while still keeping the ruleset check in place, right?

andre-hohmann commented 1 year ago

Yes, the ruleset must be the same, but relations between processes of different projects with the same ruleset should be possible.

In my examples, the projects even have the same process template "Standard". It seems to me that https://github.com/kitodo/kitodo-production/issues/4941#issuecomment-1025616071 describes the creation of a process via the parent process (periodical, newspaper, ...). I want to create processes for volumes of multivolume works, which can be imported from K10plus. But in the end, the rules should be the same.

I can find migrated processes of volumes, which belong to different projects (with the same ruleset), but are linked to the same superordinate process. I am also quite sure, that this was possible in earlier versions. We are working with the current master and maybe there is a new mistake. Edit: I was just informed, that we do not use the current master, but the one from 2023-03-01.

BartChris commented 1 year ago

This would affect two constellations, as far as I can see. 1) Manual linking to a parent process in the linking tab; we could remove the filter for the project here:

https://github.com/kitodo/kitodo-production/blob/766c5eb043fbe804f3bf2d6c1c7ee1869978c6f7/Kitodo/src/main/java/org/kitodo/production/services/data/ProcessService.java#L800

The comment there does not describe the current implementation, because only processes in the same project can be linked:

"Searches for linkable processes based on user input. A process can be linked if it has the same rule set, belongs to the same client, and the topmost element of the logical outline below the selected parent element is an allowed child. For the latter, the data file must be read at the moment. This will be aborted after a timeout so that the user gets an answer (which may be incomplete) in finite time"

What is described in the comment is already implemented when searching for children of a parent process. When searching for child processes the project is not taken into account:

https://github.com/kitodo/kitodo-production/blob/766c5eb043fbe804f3bf2d6c1c7ee1869978c6f7/Kitodo/src/main/java/org/kitodo/production/services/data/ProcessService.java#L763
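The difference between the two searches can be pictured as one filter with an optional same-project condition. The following is only a minimal sketch and not the code of `ProcessService`: the class `ProcessInfo`, its fields, and the flag `sameProjectOnly` are hypothetical names chosen for illustration, and the check of the logical structure mentioned in the code comment is left out.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical, simplified view of a process (not Kitodo's own Process bean). */
final class ProcessInfo {
    final String title;
    final int projectId;
    final int rulesetId;
    final int clientId;

    ProcessInfo(String title, int projectId, int rulesetId, int clientId) {
        this.title = title;
        this.projectId = projectId;
        this.rulesetId = rulesetId;
        this.clientId = clientId;
    }
}

final class LinkableProcessFilter {
    /**
     * Returns the candidates that may be linked with {@code current}.
     * Ruleset and client must always match; the project only has to
     * match when {@code sameProjectOnly} is true.
     */
    static List<ProcessInfo> linkable(ProcessInfo current, List<ProcessInfo> candidates,
            boolean sameProjectOnly) {
        return candidates.stream()
                .filter(candidate -> candidate.rulesetId == current.rulesetId)
                .filter(candidate -> candidate.clientId == current.clientId)
                .filter(candidate -> !sameProjectOnly || candidate.projectId == current.projectId)
                .collect(Collectors.toList());
    }
}
```

With `sameProjectOnly` set to `true` the sketch behaves like the parent search described here; with `false` it behaves like the child search and like the wording of the code comment.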

2) Automatic loading of the parent process during the import

https://github.com/kitodo/kitodo-production/blob/99234936ff5e73a45b4de907f2c3c8f3c4382416/Kitodo/src/main/java/org/kitodo/production/services/data/ImportService.java#L938

We could get rid of the project filter there, too.

The question is: what happens if there are multiple parent processes (with the same identifier in the source metadata system) in different projects? Given the way the automatic linking during import is implemented right now, there is no way to control which of those parent processes the child processes would be linked to once the same-project criterion is removed. Maybe that is also the reason for the current behavior. I am also not sure what the effect would be if a process is linked to a parent process in a project to which a specific user (who has access to the project of the child process) does not have access.

Maybe we can implement a mechanism which follows this rule: a match in the current project is given preference. So if there is already a parent process in the current project, the child process is linked to that. If not it can also be linked to one parent process in another project. What do you think?
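A minimal sketch of this preference rule, under the assumption that the import has already collected all parent processes with the matching identifier. The names `ParentCandidate` and `ParentSelector` are hypothetical and do not exist in Kitodo.Production.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical parent process found for an identifier during import. */
final class ParentCandidate {
    final int processId;
    final int projectId;

    ParentCandidate(int processId, int projectId) {
        this.processId = processId;
        this.projectId = projectId;
    }
}

final class ParentSelector {
    /**
     * Prefers a parent in the current project. If there is none, falls back
     * to the first parent found in any other project. Returns an empty
     * Optional if no parent exists at all.
     */
    static Optional<ParentCandidate> choose(List<ParentCandidate> matches, int currentProjectId) {
        Optional<ParentCandidate> sameProject = matches.stream()
                .filter(match -> match.projectId == currentProjectId)
                .findFirst();
        if (sameProject.isPresent()) {
            return sameProject;
        }
        return matches.stream().findFirst();
    }
}
```

The fallback simply takes the first match, so the result for several parents in other projects depends on the order in which the search returns them.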

andre-hohmann commented 1 year ago

@BartChris : Thanks a lot for your investigation!

The behavior should generally be discussed in a larger group. Otherwise, I expect constant adjustments, each of which causes irritation. Before further mechanisms are implemented, it should be checked which use cases need to be considered and which use cases aim at parent processes for each project. If multiple parent processes (with the same identifier of the source metadata system) are desired, it should be checked whether templates with different rulesets can be created. In that case, new parent processes would be created.

In my opinion, it is not necessary to create several parent processes (with the same identifier of the source metadata system) in different projects. On the contrary, it is usually desired to "merge" the respective volumes in the presentation. Furthermore: In our presentation, the processes with the same id would be overwritten, if the URL, prefixes, ... are not adjusted. In the SLUB Dresden, we will have severe problems, if parent processes must be created for each project. Thus i would vote for removing the project filter.

I also want to point out that migrated processes from multiple projects are linked to one parent process. From my point of view a uniform procedure should be applied.

andre-hohmann commented 1 year ago

@BartChris : Thanks again for your feedback.

i have created a discussion for this issue and asked via the mailing list for feedback:

Unfortunately, there is no discussion and i ask myself, if the issue is relevant for the users of Kitodo.Production.

"The question is: what happens if there are multiple parent processes (with the same identifier in the source metadata system) in different projects? Given the way the automatic linking during import is implemented right now, there is no way to control which of those parent processes the child processes would be linked to once the same-project criterion is removed. Maybe that is also the reason for the current behavior."

In my opinion, it would be possible to search in the tab "Title record link" via "Search process title" for all parent processes with the same process title/identifier - if there are several ones.

"I am also not sure what the effect would be if a process is linked to a parent process which is in a project to which a specific user (who has access to the project of the child process) does not have access."

As discussed in #5635 i think, this could be resolved by the organisation of workflows and project management/admissions.

"Maybe we can implement a mechanism which follows this rule: a match in the current project is given preference. So if there is already a parent process in the current project, the child process is linked to that. If not it can also be linked to one parent process in another project. What do you think?"

That might be difficult in detail. What happens, if only one parent process is available? Should it be offered to create a new one? I am afraid, that this becomes quite complicated once you start to create the concept - and i am still not sure, if this is really needed.

@solth : Would you accept the change if the filter for project-id (https://github.com/kitodo/kitodo-production/issues/5626#issuecomment-1511672980) is removed? If not, we would create an SLUB-internal solution if necessary. I don't know how we can come to a conclusion otherwise.

BartChris commented 1 year ago

@andre-hohmann Thanks a lot for your thoughts on that. I agree with your assessments and have two further additions to make. The constellation described by @matthias-ronge in the discussion https://github.com/kitodo/kitodo-production/discussions/5635#discussioncomment-5720268 will probably not appear, at least not while creating the link between child and parent. In my opinion it should be addressed (as you suggested as well) by requiring that the user has access to all projects to whose processes he or she wants to create links.

In fact, this is how the system works right now. My description above was not complete. Right now the system does not only check whether a possible parent process is in the CURRENT project of the child process. Two other things are also always taken into account:

  1. Every search query for a parent process is automatically augmented by the condition that the process is inside a project to which the user has access.
  2. Every search query for a parent process is automatically augmented by the condition that the process belongs to the client the user is currently logged in to. This makes linking to processes of other clients impossible.

This happens here: https://github.com/kitodo/kitodo-production/blob/9826e41483c3ebb398925d9de8fe07fa07af2f95/Kitodo/src/main/java/org/kitodo/production/services/data/base/ProjectSearchService.java#L77 and here https://github.com/kitodo/kitodo-production/blob/9826e41483c3ebb398925d9de8fe07fa07af2f95/Kitodo/src/main/java/org/kitodo/production/services/data/base/ClientSearchService.java#L128

So every search request automatically requires the parent process to be in the current user's client AND list of projects of that user.

So I think it is currently impossible for a user to link to a process to which he or she has no access. I would therefore say that we should remove the condition that the parent process is in the CURRENT project, but keep the condition that the user has access to the project where the process resides, and keep the search limited to the current client. That should address the problem with the permissions: if a user cannot link to a parent process, he or she needs to get access to the project where the parent process is located. Links can only be created to processes of the same client (the one the user is logged in to).
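The conditions discussed here can be written as composable predicates. This is an illustrative sketch only: the real conditions are added to search queries in `ProjectSearchService` and `ClientSearchService`, whereas the classes `SearchUser`, `IndexedProcess` and `ParentSearchConditions` below are hypothetical stand-ins that filter in memory.

```java
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical user with the current client and the assigned projects. */
final class SearchUser {
    final int clientId;
    final Set<Integer> projectIds;

    SearchUser(int clientId, Set<Integer> projectIds) {
        this.clientId = clientId;
        this.projectIds = projectIds;
    }
}

/** Hypothetical search hit for a possible parent process. */
final class IndexedProcess {
    final int clientId;
    final int projectId;

    IndexedProcess(int clientId, int projectId) {
        this.clientId = clientId;
        this.projectId = projectId;
    }
}

final class ParentSearchConditions {
    static Predicate<IndexedProcess> inUsersClient(SearchUser user) {
        return process -> process.clientId == user.clientId;
    }

    static Predicate<IndexedProcess> inUsersProjects(SearchUser user) {
        return process -> user.projectIds.contains(process.projectId);
    }

    static Predicate<IndexedProcess> inProject(int projectId) {
        return process -> process.projectId == projectId;
    }

    /** Behavior as described above: client, user's projects and CURRENT project. */
    static Predicate<IndexedProcess> currentBehavior(SearchUser user, int currentProjectId) {
        return inUsersClient(user).and(inUsersProjects(user)).and(inProject(currentProjectId));
    }

    /** Proposal: drop only the CURRENT project condition. */
    static Predicate<IndexedProcess> proposedBehavior(SearchUser user) {
        return inUsersClient(user).and(inUsersProjects(user));
    }
}
```

In this sketch a parent in another project of the same user passes `proposedBehavior` but not `currentBehavior`, while parents in foreign projects or other clients are rejected by both.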

BartChris commented 1 year ago

"Maybe we can implement a mechanism which follows this rule: a match in the current project is given preference. So if there is already a parent process in the current project, the child process is linked to that. If not it can also be linked to one parent process in another project. What do you think?"

"That might be difficult in detail. What happens, if only one parent process is available? Should it be offered to create a new one? I am afraid, that this becomes quite complicated once you start to create the concept - and i am still not sure, if this is really needed."

I think I can go with the following solution: The child process gets linked to whatever parent it finds (taking the above considerations into account). It might link to a parent in the wrong project. If that happens, the user has to correct the wrong association manually. That should always be possible.

Edit: To further elaborate on my proposal: No, I would not create new processes. Whenever a process is found, the system would automatically link to it. The only rule would be that processes in the same project have priority, just to have a clear rule for the scenario of multiple possible parent processes instead of a random choice. But whether or not we implement such a rule, I think it is always the responsibility of the institution to keep order here, so I would say we can also let the system pick the first parent process it finds and automatically link to that.

BartChris commented 1 year ago

There is probably one caveat. The way the search queries are constructed right now, a user without permissions for the project of a parent probably cannot establish a relation to that parent process, because the system will not find it. So the question is whether we should, for the purpose of linking to parents, enable the user to also search through processes from projects to which he or she does not have access. This would only be for constructing the link; editing the parent would be a different thing. (For the latter I agree with the points you made here: https://github.com/kitodo/kitodo-production/discussions/5635#discussioncomment-5727442)

andre-hohmann commented 1 year ago

@BartChris : Thanks a lot for the investigation! I hope i understand the concept and i try to answer the different aspects.

First, I can agree with your proposal to check first the parent processes of the same project of the child process and then parent processes of other projects (https://github.com/kitodo/kitodo-production/issues/5626#issuecomment-1534203324). We will hopefully have only one parent process with the same identifier - therefore it should not affect us in the SLUB Dresden.

Second, I would support the following options:

  1. Eliminate the filter for the project id during creation of the processes. Goal: If a parent process with the same id exists in another project, the child process is linked to the existing parent process, instead of creating a “duplicate” one in the current project.
  2. Eliminate the condition that the user can only search for processes of projects for which he has permissions. Goal: It must be possible to link processes during their creation – even if the user does not have the permissions for the project of the parent process. Otherwise unwanted duplicates may be created because existing processes are not found.
  3. Keep the condition that the user cannot edit the process if he does not have the permissions for the project. Goal: The project restrictions should not be completely eliminated - usually, there is a reason for them.
  4. Keep the condition that the user can only search for processes of the client in which he is logged in. Goal: There should be no linking between processes of different clients.

If we agree on that, the question remains, who decides, if or how the changes are applied.

solth commented 1 year ago

"@solth : Would you accept the change if the filter for project-id (#5626 (comment)) is removed? If not, we would create an SLUB-internal solution if necessary. I don't know how we can come to a conclusion otherwise."

@andre-hohmann sorry for my late response. I agree that the restriction to parent processes in the same project should be removed to avoid creating duplicate processes within the same client. I do not remember why that restriction was added in the first place.

I also agree that the restriction to processes within the same client should not be touched. Data from individual clients should ideally be totally separated from each other.

I am just not sure about the following two points you made:

  • "Eliminate the condition that the user can only search for processes of projects for which he has permissions. Goal: It must be possible to link processes during their creation – even if the user does not have the permissions for the project of the parent process. Otherwise unwanted duplicates may be created because existing processes are not found."

In my opinion, the restriction to processes of projects to which the current user has access should not be removed, since this would undermine the current permission system and would probably produce hard-to-find errors later on. Technically, it also contradicts your next point:

  • "Keep the condition that the user cannot edit the process if he does not have the permissions for the project. Goal: The project restrictions should not be completely eliminated - usually, there is a reason for them."

This won't work if we really remove the restriction to projects to which the user has access, because linking processes will change the parent process - specifically, it will add a METS pointer to the child process, hence changing the parent process' data.
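The point that linking is a write operation on the parent can be illustrated with a small sketch. It assumes a strongly simplified model in which a list of child ids stands in for the METS pointers in the parent's metadata file; `ParentRecord` and `LinkingCheck` are hypothetical names, not Kitodo classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical parent process; the list stands in for its METS pointers. */
final class ParentRecord {
    final int projectId;
    final List<Integer> childPointers = new ArrayList<>();

    ParentRecord(int projectId) {
        this.projectId = projectId;
    }
}

final class LinkingCheck {
    /**
     * Links a child to the parent. Because this modifies the parent's data,
     * the sketch treats it like an edit and refuses it when the user has no
     * access to the parent's project.
     *
     * @return true if the pointer was added, false if the link was refused
     */
    static boolean link(ParentRecord parent, int childId, Set<Integer> userProjectIds) {
        if (!userProjectIds.contains(parent.projectId)) {
            return false;
        }
        parent.childPointers.add(childId);
        return true;
    }
}
```

Finding a parent without being allowed to edit it would therefore not be enough in this model: the link itself already changes the parent.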

Therefore I would opt for keeping the strict limitation to projects to which the user has access. In my opinion, making sure that a user has sufficient access to all projects for which he might create digitization processes is the responsibility of the organisation, not the software. Weakening the permission system or working around it is not the right way to solve this, I think.

The best - and probably most expensive - way to solve this, at least for manual linking, would be to display the potential parent processes of inaccessible projects in the "title link" tab, but make them non-selectable, with a mark and a tooltip explaining why they cannot be linked by the user in the current state (e.g. he does not have access to the corresponding parent process' project). Then the process creation could be canceled, the administrator could add the user to the project in question according to the requirements, and the user could then create the new process with a valid link to the parent in another project.
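A rough sketch of how such a list could be prepared, assuming the search returns parents from all projects of the client. All names (`FoundParent`, `ParentOption`, `TitleLinkOptions`) and the tooltip text are hypothetical; the actual user interface code is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical search hit: a possible parent and its project. */
final class FoundParent {
    final String title;
    final int projectId;

    FoundParent(String title, int projectId) {
        this.title = title;
        this.projectId = projectId;
    }
}

/** Hypothetical row for the "title link" tab. */
final class ParentOption {
    final String title;
    final boolean selectable;
    final String tooltip;

    ParentOption(String title, boolean selectable, String tooltip) {
        this.title = title;
        this.selectable = selectable;
        this.tooltip = tooltip;
    }
}

final class TitleLinkOptions {
    /** Shows every found parent, but only parents in accessible projects are selectable. */
    static List<ParentOption> build(List<FoundParent> found, Set<Integer> userProjectIds) {
        List<ParentOption> options = new ArrayList<>();
        for (FoundParent parent : found) {
            boolean accessible = userProjectIds.contains(parent.projectId);
            String tooltip = accessible ? ""
                    : "You do not have access to the project of this process.";
            options.add(new ParentOption(parent.title, accessible, tooltip));
        }
        return options;
    }
}
```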

For automatic linking during hierarchical import, this would probably have to be extended to warn the user that a potential parent process exists in another, inaccessible project. This should ideally give the user the option to cancel the import, so that the required permission adjustments described above can be performed, or to deliberately create a new, duplicate process without changes to the project permission structure.
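The import case could follow a small decision table. The sketch below is only one possible reading of this suggestion; the enum `Outcome` and the class `ImportParentCheck` are hypothetical and not part of `ImportService`.

```java
final class ImportParentCheck {
    /** Possible results of the parent lookup during a hierarchical import. */
    enum Outcome {
        /** A parent exists in an accessible project: link the child to it. */
        LINK_TO_EXISTING,
        /** A parent exists only in an inaccessible project: let the user cancel or create a duplicate. */
        WARN_USER,
        /** No parent exists in the client: import a new parent as before. */
        CREATE_NEW_PARENT
    }

    static Outcome decide(boolean parentExists, boolean userHasAccess) {
        if (!parentExists) {
            return Outcome.CREATE_NEW_PARENT;
        }
        return userHasAccess ? Outcome.LINK_TO_EXISTING : Outcome.WARN_USER;
    }
}
```

After `WARN_USER` the user would either cancel the import and ask for access, or confirm that a duplicate parent is created in the current project.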

andre-hohmann commented 1 year ago

@solth : thanks for your extensive reply. We should discuss in:

Sorry for closing the discussion too hastily.