Proposal

Currently, Mork handles the identification and deletion of inactive users in edX databases. However, other FUN applications (Ashley, Joanie, etc.) also contain user data that needs to be deleted or anonymized. These applications can run in different namespaces. Each application should be responsible for deleting its own data, but Mork should be the source that tells each application which users need to be deleted.

To do so, three architectures were considered.

1. Message Queue architecture

Mork publishes the list of users to be deleted to a message broker (Kafka, RabbitMQ).

Not retained because:

- no need for real-time processing
- added complexity for deployment and maintenance

2. Central datastore architecture

Mork stores the list of users to be deleted in Redis. Other applications read and pop entries from it.

Not retained because:

- an application can run in a different namespace than Redis
- hard to keep logs of what each application has done

3. Central API service (in my opinion what we should do)

Mork exposes endpoints to query users that need to be deleted. Each application periodically checks and processes its own data deletion.

I think this is the best solution because:

- there is a clear separation of responsibilities
- works across namespaces
- easy to keep logs
- no additional infrastructure (message broker or Redis) needed

Proposed implementation

Each application runs a daily cronjob that pulls the list of users to be deleted from Mork, then confirms which users it has deleted by updating their status. Mork exposes the following endpoints:

List users to be deleted

Endpoint: GET /api/v1/users
Description: Retrieves a list of users to be deleted
Query Parameters:
- status (required): one of to_delete, deleted
- limit (optional): Number of results per page (default: 100)
- offset (optional): Offset for pagination (default: 0)
Response: 200 OK
```
{
  "users": [
    {
      "id": "id123",
      "username": "user123",
      "email": "user123@example.com",
      ...
    }
  ],
  "total": 1500,
  "limit": 100,
  "offset": 0
}
```
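
As an illustration of how an application might walk through the paginated list, here is a minimal Python sketch. It assumes the response shape shown above; `MORK_URL`, `http_fetch_page` and `iter_users` are hypothetical names, and authentication is left out because the proposal does not specify it.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

MORK_URL = "https://mork.example.com"  # hypothetical base URL


def http_fetch_page(status: str, limit: int, offset: int) -> dict:
    """Fetch one page from GET /api/v1/users (authentication omitted)."""
    query = urllib.parse.urlencode(
        {"status": status, "limit": limit, "offset": offset}
    )
    with urllib.request.urlopen(f"{MORK_URL}/api/v1/users?{query}") as resp:
        return json.load(resp)


def iter_users(
    fetch_page: Callable[[str, int, int], dict] = http_fetch_page,
    status: str = "to_delete",
    limit: int = 100,
) -> Iterator[dict]:
    """Yield every user with the given status, one page at a time."""
    offset = 0
    while True:
        page = fetch_page(status, limit, offset)
        yield from page["users"]
        offset += limit
        # Stop once the offset passes the reported total, or on an
        # empty page in case the total changes between requests.
        if offset >= page["total"] or not page["users"]:
            break
```

With offset-based pagination, it is probably safer to collect the full list before confirming any deletion: if statuses change while pages are still being fetched, the `to_delete` result set shrinks and later offsets may skip users.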

Update user deletion status

Endpoint: PATCH /api/v1/users/{user_id}/status
Description: Updates the deletion status of a user for a specific app
Request Body:
```
{
"status": "deleted"
}
```

Response: 200 OK
```
{
"user_id": "id123",
"status": "deleted",
"timestamp": "2024-10-21T14:30:00Z"
}
```
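
For reference, a client could build this request with the Python standard library roughly as follows. This is only a sketch: `MORK_URL` and `build_status_request` are hypothetical names, and no authentication header is set because the proposal does not define one.

```python
import json
import urllib.parse
import urllib.request

MORK_URL = "https://mork.example.com"  # hypothetical base URL


def build_status_request(
    user_id: str, status: str = "deleted"
) -> urllib.request.Request:
    """Build the PATCH request for /api/v1/users/{user_id}/status."""
    body = json.dumps({"status": status}).encode("utf-8")
    # Percent-encode the id so unusual characters cannot alter the path.
    safe_id = urllib.parse.quote(user_id, safe="")
    return urllib.request.Request(
        url=f"{MORK_URL}/api/v1/users/{safe_id}/status",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )
```

Passing the result to `urllib.request.urlopen` would send it, for example `urllib.request.urlopen(build_status_request("id123"))`.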

Batch update user deletion status

Endpoint: PATCH /api/v1/users/status
Description: Updates the deletion status of multiple users for specific apps

Request Body:
```
{
"updates": [
  {
    "user_id": "id123",
    "status": "deleted"
  },
  {
    "user_id": "id456",
    "status": "deleted"
  }
]
}
```
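
To show how the pieces could fit together in an application's daily job, here is a hedged Python sketch. All names (`batch_update_bodies`, `run_deletion_job`, `list_users`, `delete_locally`, `send_batch`) are hypothetical, and `batch_size` is an assumption since the proposal does not state a maximum number of updates per request. The three callables stand in for the Mork calls and for the application's own deletion logic.

```python
from typing import Callable, Iterable, Iterator


def batch_update_bodies(
    user_ids: list[str], status: str = "deleted", batch_size: int = 100
) -> Iterator[dict]:
    """Yield request bodies for PATCH /api/v1/users/status."""
    for start in range(0, len(user_ids), batch_size):
        chunk = user_ids[start : start + batch_size]
        yield {"updates": [{"user_id": uid, "status": status} for uid in chunk]}


def run_deletion_job(
    list_users: Callable[[], Iterable[dict]],
    delete_locally: Callable[[dict], bool],
    send_batch: Callable[[dict], None],
    batch_size: int = 100,
) -> list[str]:
    """Pull flagged users, delete them locally, then report back to Mork.

    - list_users yields the users Mork lists with status "to_delete"
    - delete_locally removes one user's data, returning True on success
    - send_batch sends one batch update body to Mork
    """
    # Only users whose local deletion succeeded are confirmed; the others
    # are not reported, so their status in Mork is left unchanged.
    deleted_ids = [user["id"] for user in list_users() if delete_locally(user)]
    for body in batch_update_bodies(deleted_ids, batch_size=batch_size):
        send_batch(body)
    return deleted_ids
```

Confirming only after all local deletions have run keeps the listing stable while it is being read, at the cost of redoing work if the job crashes before the confirmation step.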