codalab / codabench

Codabench is a flexible, easy-to-use and reproducible benchmarking platform. Check our paper at Patterns Cell Press https://hubs.li/Q01fwRWB0
Apache License 2.0

Storage and clean up #713

Open Didayolo opened 2 years ago

Didayolo commented 2 years ago

Implementing a storage quota and effectively managing capacity could be crucial for enhancing Codabench's scalability and ensuring long-term service availability in production.

Features

Deleting files from MinIO

@OhMaley: Add a button to the admin analytics page that manually starts a script removing all orphan files.
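As a rough illustration of what such a script would compute, here is a minimal sketch of the orphan detection step. It assumes an "orphan" is an object key present in the bucket but no longer referenced by any database record. The function name and the example keys are hypothetical; listing the bucket and querying the database are deliberately left out.

```python
from typing import Iterable, Set


def find_orphan_keys(stored_keys: Iterable[str], referenced_keys: Iterable[str]) -> Set[str]:
    """Return keys that exist in storage but are referenced by no database record.

    Assumption: both inputs use the same key format (e.g. the object path in the bucket).
    """
    return set(stored_keys) - set(referenced_keys)


# Hypothetical example data: keys listed from the bucket vs. keys still used in the database.
stored = ["dataset/a.zip", "submission/b.zip", "submission/c.zip"]
referenced = ["dataset/a.zip", "submission/c.zip"]
orphans = find_orphan_keys(stored, referenced)
print(sorted(orphans))
```

Keeping the detection separate from the deletion would let the admin page show what is about to be removed before anything is actually deleted.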

Quota

We need to:

Light/archive mode after competition end

Option to clean up competitions once they are completed. For instance, it would keep only the leaderboard submissions.
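A minimal sketch of that selection rule, assuming each submission carries a flag telling whether it is on the leaderboard. The `Submission` class and its fields are hypothetical stand-ins, not the actual Codabench models.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Submission:
    # Hypothetical fields, for illustration only.
    id: int
    on_leaderboard: bool
    size_bytes: int


def select_for_cleanup(submissions: List[Submission]) -> List[Submission]:
    """Return the submissions that could be removed once a competition is completed."""
    return [s for s in submissions if not s.on_leaderboard]


subs = [Submission(1, True, 100), Submission(2, False, 250), Submission(3, False, 50)]
removable = select_for_cleanup(subs)
freed_bytes = sum(s.size_bytes for s in removable)
print([s.id for s in removable], freed_bytes)
```

Computing the freed size up front could also feed the monitoring and statistics item.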

Automatic cleaning

Monitoring and statistics

Per benchmark submission size limit

Migrate this CodaLab feature to Codabench. This may be less important once the quota feature is implemented.
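For reference, a per-benchmark limit could be checked along these lines. This is only a sketch: the function name, the choice of megabytes (1024 * 1024 bytes) as the unit, and the convention that `None` means "no limit" are all assumptions.

```python
from typing import Optional

BYTES_PER_MB = 1024 * 1024  # assumption: the limit is expressed in MiB


def is_within_size_limit(file_size_bytes: int, limit_mb: Optional[float]) -> bool:
    """Return True if a submission of the given size is allowed for the benchmark.

    A limit of None is treated as "no limit configured" (assumption).
    """
    if limit_mb is None:
        return True
    return file_size_bytes <= limit_mb * BYTES_PER_MB


print(is_within_size_limit(5 * BYTES_PER_MB, 5))
print(is_within_size_limit(5 * BYTES_PER_MB + 1, 5))
```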

Interface

@ihsaan-ullah

Improve resources interface

The interface for managing datasets, submissions and tasks could be improved. Improving the resources interface would also help participants manage their storage space.

As shown in the screenshot below, we can manage and remove submissions from the "resources" interface. However, we do not track the benchmark on which each submission was made, so it is hard to know what we are removing.

Screenshot 2023-04-29 at 01 41 54

Some issues

ihsaan-ullah commented 1 year ago

@Didayolo

Public datasets should have their own tab

Is this only for public datasets (now datasets and programs) or also for submissions?

Right now we have these tabs:

  1. Submissions (shows my submissions and public submissions)
  2. Datasets and Programs (shows my datasets and public datasets) Note: datasets are everything except submissions
  3. Tasks

Didayolo commented 1 year ago

@ihsaan-ullah

I think my idea was about public datasets that you don't own. They should be either clearly indicated or placed on a separate tab.

EDIT: oh, and the problem is that it could become cluttered with public datasets in the future when we have many users. As submissions can also be made public, maybe separate tabs are not the right way to go. I see that in the "Tasks" tab, there is a "Show public tasks" option. Maybe we need the same option for datasets and submissions, so we can show/hide the public ones.
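A small sketch of how such a show/hide option could filter a resource list, assuming each resource knows its owner and whether it is public. The `Resource` class and the `show_public` parameter are hypothetical names, not the existing Codabench code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    # Hypothetical fields, for illustration only.
    name: str
    owner: str
    is_public: bool


def visible_resources(resources: List[Resource], user: str, show_public: bool) -> List[Resource]:
    """Always show the user's own resources; show others' public ones only on request."""
    return [r for r in resources if r.owner == user or (show_public and r.is_public)]


resources = [
    Resource("a", "alice", False),
    Resource("b", "bob", True),
    Resource("c", "bob", False),
]
print([r.name for r in visible_resources(resources, "alice", show_public=False)])
print([r.name for r in visible_resources(resources, "alice", show_public=True)])
```

With the option off by default, the list stays focused on the user's own resources no matter how many public ones exist.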

Didayolo commented 1 year ago

Side remark on the resources interface: usernames could be clickable links to the profiles (when clicking on details in any tab):

Screenshot 2023-06-01 at 12 57 07

In the leaderboard, the clickable link is made in this way:

<a href="{submission.slug_url}">{ submission.owner }</a>

https://github.com/codalab/codabench/blob/ce98a1f49c9233119a2997721113a668f29f60af/src/static/riot/competitions/detail/leaderboards.tag#LL59C57-L59C113

ihsaan-ullah commented 1 year ago

Side remark on the resources interface: usernames could be clickable links to the profiles (when clicking on details in any tab):

Added this as a TODO in the interface TODO list.

Didayolo commented 1 year ago

I added the same TODO in the "job status" issue #744

ihsaan-ullah commented 1 year ago

  • [x] Have a user interface to manage and delete submissions
  • Can we confirm that the submissions are in the "resources" interface even if you are not an organizer of the benchmark?
  • Actually, submissions can't be removed if they are part of a benchmark (which makes sense... this needs discussion).
  • [x] Be able to delete tasks when deleting a benchmark (Task deletion #810)

@Didayolo I think these two issues under the Quota heading are solved by #918 (see the solved issues under Interface).

Didayolo commented 1 year ago

EDIT: it is not the case.

Didayolo commented 1 year ago

Actually, submissions can't be removed if they are part of a benchmark (which makes sense... this needs discussion).

About this point: