- `helm-chart`: helm charts for publishing with chartpress
- `generators`: a set of common use scalacheck generators
- `tiny-types`: a module containing tooling for Tiny Types
- `graph-commons`: common classes for all the services
- `renku-model`: defines both production and testing Renku metadata model
- `webhook-service`: a microservice managing GitLab hooks and incoming external events
- `event-log`: a microservice responsible for events management
- `commit-event-service`: a microservice synchronizing commit events between KG and GitLab
- `triples-generator`: a microservice generating, transforming and taking care of data in the Triples Store
- `token-repository`: a microservice managing projects' Access Tokens
- `acceptance-tests`: acceptance tests for the services

The tests can be run with:
sbt clean test && sbt "project acceptance-tests" test
Depending on the global configuration of your sbt installation, you might need to set SBT_OPTS to avoid an OutOfMemory exception.
If such an error is raised, try setting the variable as follows:
export SBT_OPTS="-Xmx2G -Xss5M"
Renku Graph was built with code readability and maintainability as core values. We believe that high coding standards can:
Hence, we are trying to find and then follow good patterns in naming and code organization at the method, class, package and module level. The following list is a work in progress and is meant to be improved continuously.
- `camelCase` notation is used everywhere in Scala code;
- `class`es should rather have noun names;
- `class` names should rather not comprise more than three words;
- `def`s should be verbs;
- `def`s should rather be very short (a `def` with more than 10 lines should be an exception);
- `class`es should rather be short;
- `class`es and `def`s should have a single purpose;
- `if`s or any other structures should not exceed three levels of nesting; preferably, only one level should be used;
- `implicit`s and Context Bounds should/may be used extensively but wisely;
- the `show` String Interpolator should be the first choice over `s` and `toString`;
- code formatting follows the `.scalafmt.conf` file.

The standard release process is done manually. Multiple repositories take part in the process. The renku project contains the helm charts for deploying to Kubernetes and the acceptance tests. The terraform-renku project contains deployment descriptions for all environments.
- renku-graph project:
  - branch off the `origin/release` branch, name it `prep-<version>`;
  - merge development into it (`git merge origin/development`).
- renku project:
  - `CHANGELOG.rst`
  - `helm-chart/renku/Chart.yaml`
  - `renku-graph` afterwards.
- terraform-renku:
- Cleanup (renku-graph):
  - merge the `release` branch back into development;
  - `development` branch

In the case of hotfixes, changes to the relevant commit/tag need to be made and pushed to a special branch whose name follows the `hotfix-<major>.<minor>` pattern. Once the fix is pushed, CI will test the change with other Renku services. Tagging has to be done manually.
This section describes the flow of events starting from a commit on GitLab until the data is stored in the Triples Store. Solid lines represent an event being sent and dotted lines represent non-event-like data (a request or a response).
The assumption is that the Project already exists in GitLab.
sequenceDiagram
participant UI
participant WebhookService
participant GitLab
participant TokenRepository
participant EventLog
UI ->> WebhookService: POST /projects/:id/webhooks
activate WebhookService
WebhookService ->> GitLab: Create a KG webhook
WebhookService ->> TokenRepository: PUT /projects/:id/tokens
WebhookService ->> EventLog: sends COMMIT_SYNC_REQUEST
WebhookService ->> UI: 200/201
deactivate WebhookService
The assumption is that a Renku Webhook has been created for a Project and GitLab sends a Push Event for that Project.
sequenceDiagram
participant GitLab
participant WebhookService
participant EventLog
GitLab ->> WebhookService: POST /webhooks/events
WebhookService ->> EventLog: sends COMMIT_SYNC_REQUEST
This flow traverses the commit history for a Project in GitLab until it finds a commit EventLog knows about.
sequenceDiagram
participant EventLog
participant CommitEventService
participant TokenRepository
participant GitLab
EventLog ->> CommitEventService: sends COMMIT_SYNC
activate CommitEventService
CommitEventService ->> TokenRepository: fetches access token
CommitEventService ->> GitLab: finds commits which are not in EventLog
CommitEventService ->> EventLog: sends CREATION for all commits that are not in EventLog
CommitEventService ->> EventLog: sends EVENTS_STATUS_CHANGE (to: AWAITING_DELETION) for all commits that are in EventLog but not in GitLab
CommitEventService ->> EventLog: sends GLOBAL_COMMIT_SYNC_REQUEST if at least one AWAITING_DELETION or CREATION was found
deactivate CommitEventService
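The comparison at the heart of this flow can be sketched in Scala as a pair of set differences. This is only an illustration under stated assumptions: `CommitSyncSketch`, `SyncPlan`, `plan` and `needsGlobalSync` are hypothetical names, commit IDs are modelled as plain strings, and the sketch compares two complete sets instead of walking the history until a known commit is found, as the flow itself does.

```scala
// Hypothetical names throughout; not taken from the renku-graph code base.
object CommitSyncSketch {

  // The commits for which CREATION and AWAITING_DELETION would be sent.
  final case class SyncPlan(creations: Set[String], awaitingDeletion: Set[String])

  def plan(gitLabCommits: Set[String], eventLogCommits: Set[String]): SyncPlan =
    SyncPlan(
      creations        = gitLabCommits.diff(eventLogCommits), // in GitLab, unknown to EventLog
      awaitingDeletion = eventLogCommits.diff(gitLabCommits)  // in EventLog, gone from GitLab
    )

  // GLOBAL_COMMIT_SYNC_REQUEST follows when at least one difference was found.
  def needsGlobalSync(plan: SyncPlan): Boolean =
    plan.creations.nonEmpty || plan.awaitingDeletion.nonEmpty
}
```

With GitLab holding `c1, c2, c3` and EventLog holding `c2, c3, c4`, the plan would contain `c1` as a creation and `c4` as awaiting deletion.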
This flow traverses the whole commit history of a Project and finds out:
- which commits are in GitLab but not in `EventLog`;
- which commits are in `EventLog` but not in GitLab.

This process is scheduled to be triggered at a minimum rate of once per week per project and at a maximum rate of once per hour per project. The commit history traversal only begins when the number of commits on GitLab and on the `EventLog` do not match and the most recent commit on GitLab is different from the most recent commit on the `EventLog`.
sequenceDiagram
participant EventLog
participant CommitEventService
participant TokenRepository
participant GitLab
EventLog ->> CommitEventService: sends GLOBAL_COMMIT_SYNC
activate CommitEventService
CommitEventService ->> GitLab: finds out the last commit ID and the total number of commits
loop if the last commit ID or the total number of commits do not match with EventLog state find all the differences
CommitEventService ->> TokenRepository: fetches access token
CommitEventService ->> GitLab: get all commits
CommitEventService ->> EventLog: get all commits
CommitEventService ->> EventLog: sends CREATION for all commits that are not in EventLog
CommitEventService ->> EventLog: sends EVENTS_STATUS_CHANGE (to: AWAITING_DELETION) for all commits that are in EventLog but not in GitLab
end
deactivate CommitEventService
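The check that guards the full traversal can be sketched as a small predicate. The names `GlobalCommitSyncSketch`, `CommitStats` and `shouldTraverse` are hypothetical. Note that the prose says the traversal starts when the commit counts differ and the latest commits differ, while the loop label in the diagram says "or"; the sketch follows the prose, and replacing `&&` with `||` would give the other reading.

```scala
// Hypothetical names; a sketch of the pre-check, not the actual implementation.
object GlobalCommitSyncSketch {

  // Summary of the commit history as seen by one side (GitLab or EventLog).
  final case class CommitStats(latestCommitId: Option[String], commitCount: Long)

  // Assumption: both the count and the latest commit have to differ.
  def shouldTraverse(gitLab: CommitStats, eventLog: CommitStats): Boolean =
    gitLab.commitCount != eventLog.commitCount &&
      gitLab.latestCommitId != eventLog.latestCommitId
}
```

Checking two cheap values first avoids fetching all commits from both sides when nothing has changed.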
The assumption is that the latest Commit Event for a Project in EventLog is in status 'NEW'.
sequenceDiagram
participant EventLog
participant TriplesGenerator
participant TokenRepository
participant GitLab
participant CLI
participant TriplesStore
EventLog ->> TriplesGenerator: sends AWAITING_GENERATION
activate TriplesGenerator
TriplesGenerator ->> TokenRepository: fetches access token
TriplesGenerator ->> GitLab: clones the project
TriplesGenerator ->> CLI: renku migrate
TriplesGenerator ->> CLI: renku graph export
TriplesGenerator ->> EventLog: sends EVENTS_STATUS_CHANGE (to: TRIPLES_GENERATED) with the graph as payload
deactivate TriplesGenerator
EventLog ->> TriplesGenerator: sends TRIPLES_GENERATED
activate TriplesGenerator
TriplesGenerator ->> TokenRepository: fetches access token
TriplesGenerator ->> GitLab: calls several APIs in the Transformation process
TriplesGenerator ->> TriplesStore: executes update queries and uploads project metadata
TriplesGenerator ->> EventLog: sends EVENTS_STATUS_CHANGE (to: TRIPLES_STORE)
deactivate TriplesGenerator
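The status changes in this flow form a short chain, which can be sketched as a tiny state machine. Only the three statuses named in the diagram are modelled; `EventStatusSketch` and `next` are hypothetical names, and failure or recovery statuses are deliberately left out.

```scala
// Hypothetical model of the happy path only: NEW -> TRIPLES_GENERATED -> TRIPLES_STORE.
object EventStatusSketch {

  sealed trait EventStatus
  case object New              extends EventStatus
  case object TriplesGenerated extends EventStatus
  case object TriplesStore     extends EventStatus

  // The status an EVENTS_STATUS_CHANGE moves the event to, if any.
  def next(status: EventStatus): Option[EventStatus] = status match {
    case New              => Some(TriplesGenerated) // the graph was exported by the CLI
    case TriplesGenerated => Some(TriplesStore)     // the metadata was uploaded
    case TriplesStore     => None                   // nothing left to do in this flow
  }
}
```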
The assumption is that a `git reset --hard` or `git rebase` was done on the Project.
sequenceDiagram
participant EventLog
participant TriplesGenerator
participant TokenRepository
participant GitLab
participant TriplesStore
EventLog ->> TriplesGenerator: sends CLEAN_UP_REQUEST
activate TriplesGenerator
TriplesGenerator ->> TokenRepository: fetches access token
TriplesGenerator ->> TriplesStore: remove the data of a Project
TriplesGenerator ->> EventLog: sends EVENTS_STATUS_CHANGE (to: NEW) for all the events of a single Project
deactivate TriplesGenerator
activate EventLog
EventLog ->> EventLog: remove all events in status AWAITING_DELETION and DELETING
loop if there are no events left for the Project
EventLog ->> EventLog: remove the Project
EventLog ->> TokenRepository: remove the Project token
EventLog ->> GitLab: remove the Project WebHook
end
EventLog ->> EventLog: change status of all Project events to NEW
EventLog ->> TriplesGenerator: sends AWAITING_GENERATION
deactivate EventLog
The assumption is that there is no Commit Event in TRIPLES_STORE status for a Project.
sequenceDiagram
participant EventLog
participant TriplesGenerator
participant TokenRepository
participant GitLab
participant TriplesStore
EventLog ->> TriplesGenerator: sends ADD_MIN_PROJECT_INFO
activate TriplesGenerator
TriplesGenerator ->> TokenRepository: fetches access token
TriplesGenerator ->> GitLab: calls several APIs in the Transformation process
TriplesGenerator ->> TriplesStore: executes update queries and uploads project metadata
TriplesGenerator ->> EventLog: sends EVENTS_STATUS_CHANGE (to: TRIPLES_STORE)
deactivate TriplesGenerator
This event is sent periodically to sync authorization data between GitLab and the Triples Store.
sequenceDiagram
participant EventLog
participant TriplesGenerator
participant TokenRepository
participant GitLab
participant TriplesStore
EventLog ->> TriplesGenerator: sends MEMBER_SYNC
activate TriplesGenerator
TriplesGenerator ->> TokenRepository: fetches access token
TriplesGenerator ->> GitLab: calls the Project users and Project members APIs
TriplesGenerator ->> TriplesStore: project members
deactivate TriplesGenerator
This event is sent periodically to sync Project data between GitLab, EventLog and the Triples Store.
sequenceDiagram
participant EventLog
participant CommitEventService
participant TriplesGenerator
participant TokenRepository
participant GitLab
participant TriplesStore
EventLog ->> EventLog: sends PROJECT_SYNC
activate EventLog
EventLog ->> TokenRepository: fetches access token
EventLog ->> GitLab: calls the Project Details
loop if the project slug is NOT the same in EventLog and GitLab
EventLog ->> CommitEventService: sends COMMIT_SYNC for the new slug
EventLog ->> TriplesGenerator: sends CLEAN_UP_REQUEST for the old slug
end
EventLog ->> TriplesGenerator: sends SYNC_REPO_METADATA
activate TriplesGenerator
TriplesGenerator ->> GitLab: fetches project metadata
TriplesGenerator ->> TriplesStore: fetches project metadata
TriplesGenerator ->> EventLog: fetches the payload of the latest project event
TriplesGenerator ->> TriplesStore: sends update queries if values need updating (not for visibility changes)
TriplesGenerator ->> EventLog: sends RedoProjectTransformation (only when visibility changes)
deactivate TriplesGenerator
deactivate EventLog
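The branching on the project slug can be sketched as a function from the two slugs to the events that would be sent. All names here (`ProjectSyncSketch`, `eventsFor`, the event case classes) are hypothetical, and it is an assumption of this sketch that SYNC_REPO_METADATA is sent for the slug currently known to GitLab.

```scala
// Hypothetical event model; only the slug-related decision is sketched.
object ProjectSyncSketch {

  sealed trait Event
  final case class CommitSync(slug: String)       extends Event
  final case class CleanUpRequest(slug: String)   extends Event
  final case class SyncRepoMetadata(slug: String) extends Event

  def eventsFor(eventLogSlug: String, gitLabSlug: String): List[Event] = {
    // Extra events are sent only when the slug changed in GitLab.
    val onSlugChange: List[Event] =
      if (eventLogSlug != gitLabSlug)
        List(CommitSync(gitLabSlug), CleanUpRequest(eventLogSlug))
      else Nil
    // SYNC_REPO_METADATA is sent in either case.
    onSlugChange :+ SyncRepoMetadata(gitLabSlug)
  }
}
```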
This event category detects Commit Events that got stale.
sequenceDiagram
participant EventLog
participant TriplesGenerator
loop finds out events that are marked as under processing but the process was interrupted
activate EventLog
EventLog ->> TriplesGenerator: verifies if an instance with the given URL and identifier exists
EventLog ->> EventLog: sends ZOMBIE_CHASING
deactivate EventLog
EventLog ->> EventLog: sends EVENTS_STATUS_CHANGE (to: NEW | TRIPLES_GENERATED)
end
Once an event is marked as AwaitingDeletion, it is automatically picked up by our process and a CleanUp event is created. This event triggers the removal of the project from the Triples Store. The clean-up of a project can be either the removal of the project with all its events and entities (if the project was removed from GitLab) or the re-provisioning of the project (if there are events which are not AwaitingDeletion).
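The choice between the two clean-up outcomes can be sketched as follows. The names `CleanUpSketch`, `outcome`, `RemoveProject` and `ReProvision` are hypothetical, and `Other` is a stand-in for every status that is not removed during clean-up.

```scala
// Hypothetical names; a sketch of the decision only, with no I/O.
object CleanUpSketch {

  sealed trait EventStatus
  case object AwaitingDeletion extends EventStatus
  case object Deleting         extends EventStatus
  case object Other            extends EventStatus // any status that survives the clean-up

  sealed trait CleanUpOutcome
  case object RemoveProject extends CleanUpOutcome // no events left for the project
  case object ReProvision   extends CleanUpOutcome // remaining events are processed again

  def outcome(statuses: List[EventStatus]): CleanUpOutcome = {
    // Events in AWAITING_DELETION and DELETING are dropped first.
    val remaining = statuses.filterNot(s => s == AwaitingDeletion || s == Deleting)
    if (remaining.isEmpty) RemoveProject else ReProvision
  }
}
```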
The removal of project triples happens in two steps:
- updating the links that point to the project;
- removing the project and its dependent entities.
Links are updated so that no islands are created in our graph. An example would be a hierarchy of forked projects:
project1 <-- project2 <-- project3
If we wanted to remove project2 we would have to re-link project3 to project1.
project1 <-- project3
The update of the links would also be applied to the Dataset entities, which could be imported from other Datasets (similar to a fork for a project).
After the re-linking, the project and all its dependent entities can be removed. These entities will be removed only if they are not used in another project.
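The re-linking step for forks can be sketched with a plain map from each fork to its parent. `ForkRelinkSketch`, `ForkLinks` and `remove` are hypothetical names, project identifiers are plain strings, and the sketch covers only the link update, not the removal of entities or the Dataset case.

```scala
// Hypothetical in-memory model: each entry maps a fork to the project it was forked from.
object ForkRelinkSketch {

  type ForkLinks = Map[String, String]

  def remove(project: String, links: ForkLinks): ForkLinks = {
    val parent = links.get(project)
    // Forks of the removed project versus links that are not affected.
    val (orphans, untouched) =
      (links - project).partition { case (_, forkedFrom) => forkedFrom == project }
    val relinked = parent match {
      case Some(grandparent) => orphans.map { case (fork, _) => fork -> grandparent }
      case None              => Map.empty[String, String] // forks of a root become roots
    }
    untouched ++ relinked
  }
}
```

For the hierarchy above, removing `project2` from `Map("project2" -> "project1", "project3" -> "project2")` leaves `project3` linked to `project1`.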