forTEXT / catma

Computer Assisted Text Markup and Analysis
https://www.catma.de
GNU General Public License v3.0

git submodule usage #303

Closed mpetris closed 1 year ago

mpetris commented 2 years ago

The CATMA Project structure, a GitLab group with one git project per resource, integrated via git submodules and a git container project, has proven to be a major performance bottleneck. It leads to a large number of git projects that have to be handled when working with a single CATMA project, and neither GitLab nor git submodules seem to be optimized for the required workflows at that scale.

The reasons for choosing this structure were:

  1. the ability to apply permissions on resource granularity
  2. GitLab-supported forks of resources across CATMA projects

The goal is therefore to change the CATMA Project structure to a single git project with the resources as ordinary folders. This single git project will then be responsible for orchestrating the resources, making the use of git submodules obsolete.

This way we will lose the per-resource GitLab permissions. To compensate for that, we will embrace GitLab's branch features, i.e. protected branches, branch push permissions and merge requests, and we will introduce an optional weak ownership of resources.

Forks haven't been used much since they were introduced in CATMA 6, and in particular the ability to merge changes back into the parent of a fork has never been used. Copying Tagsets, Collections and Documents from one CATMA Project to another will still be important, but with the new approach sketched out above it can't and won't be based on a GitLab fork. Instead, it will be implemented as a simple copy with a fresh history. The HEAD commit hash and the name of the source git repository will be recorded in the metadata of the resulting new resource in the target git repository.
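To make the copy-with-fresh-history idea concrete, here is a minimal sketch in Python. It is not CATMA's implementation: the function names, the `copy_source.json` file name and the `sourceRepository` / `sourceCommit` keys are all hypothetical and only illustrate the kind of provenance record described above.

```python
import json
import shutil
import subprocess
from pathlib import Path


def head_commit(repo: Path) -> str:
    """Return the HEAD commit hash of the git repository at `repo`."""
    result = subprocess.run(
        ["git", "-C", str(repo), "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def copy_resource(source_repo: Path, resource_path: str, target_repo: Path,
                  source_commit: str,
                  metadata_name: str = "copy_source.json") -> Path:
    """Copy one resource folder into another project without its history.

    Only the files are copied, so the resource starts with a fresh history
    once it is committed in the target repository. Where it came from is
    kept in a small metadata file (hypothetical name and keys).
    """
    src = source_repo / resource_path
    dst = target_repo / resource_path
    # Skip any .git entry so that no git state travels with the copy.
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(".git"))
    provenance = {
        "sourceRepository": source_repo.name,
        "sourceCommit": source_commit,
    }
    (dst / metadata_name).write_text(json.dumps(provenance, indent=2))
    return dst
```

A caller would pass `head_commit(source_repo)` as `source_commit` and then add and commit the new folder in the target repository as usual. Unlike a fork, nothing here allows changes to be merged back into the source, which matches the observation that this ability was never used.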

maltem-za commented 1 year ago

Most changes outlined above have now been released with 7.0.0. The page "What's New and Changed in CATMA 7" summarizes the impacts on synchronization and on roles and permissions, while the page "Upcoming Changes to the Backend Storage Mechanisms and Data Structures" was created for those who work with raw CATMA data.

Notes on some of the above points:

Copying resources between projects (replacement for forking) has not been implemented, except for some foundational aspects.