Open ihsaan-ullah opened 1 year ago
Some remarks:
The goal of this feature was primarily convenience. Indeed, most of the organizers who tried Codabench reported this problem: if you just want to update one program or one dataset, you need to create a task from scratch, select all programs and data, and then assign it to every benchmark / phase that needs it. This process takes time and can be the source of many mistakes.
So, I do think we need to be able to edit tasks. Instead of only allowing edits to tasks that are not validated, we could say that editing a validated task un-validates it.
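To make the "editing un-validates the task" idea concrete, here is a minimal sketch in plain Python. It is not the Codabench Django model: the `Task` class, its fields, the `EDITABLE_FIELDS` list, and the `update` / `mark_validated` helpers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Assumed list of fields an organizer may edit; not taken from the Codabench code.
EDITABLE_FIELDS = ("name", "scoring_program", "ingestion_program",
                   "input_data", "reference_data")


@dataclass
class Task:
    """Hypothetical stand-in for a task, not the real Django model."""
    name: str
    scoring_program: str
    ingestion_program: str = ""
    input_data: str = ""
    reference_data: str = ""
    validated: bool = False

    def mark_validated(self) -> None:
        """Called when something (e.g. a successful submission) validates the task."""
        self.validated = True

    def update(self, **changes) -> bool:
        """Apply edits; return True if anything actually changed.

        Any real change resets `validated`, so the task has to be
        validated again before it is trusted.
        """
        changed = False
        for key, value in changes.items():
            if key not in EDITABLE_FIELDS:
                raise ValueError(f"{key!r} is not an editable field")
            if getattr(self, key) != value:
                setattr(self, key, value)
                changed = True
        if changed:
            self.validated = False
        return changed
```

With this shape, re-saving a task with identical values keeps its validation status, while swapping the scoring program clears it.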
I think it is a cool idea to validate the tasks, typically, when a submission was successful.
However, running the solution to validate the task may be problematic. Indeed, to run this validation, you'll need some parameters that are external to the task itself: the queue, the docker image, the time limit, etc. If you need to specify all this to run the validation, then it is complex, and does not guarantee that the task will work in any benchmark configuration anyway.
So, I think tasks should be validated by a submission on a specific benchmark (with its own setup of queue, docker, etc.). OR we could move these settings from the benchmark models to the task models. This could make sense too, and offer interesting possibilities such as different docker images for each task, etc.
We need to check the purpose of solutions:
It looks like the only purpose of a solution is to validate a task. If that is the case, and it is decided that task validation should be done with submissions, then solutions need another purpose to be worth keeping.
Solutions, if considered mini starting kits, are still of little use because users can now add starting kits.
Notion of locking / unlocking a task?
The idea: if a task is public, or used by benchmarks (or by submissions?), then it would not be editable.
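The locking rule above could be sketched as a single predicate. This is only an illustration under assumptions: `TaskUsage` and `is_locked` are hypothetical names, and since it is still an open question whether submissions should lock a task, that part is left behind a flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskUsage:
    """Hypothetical summary of how a task is used; not a Codabench class."""
    is_public: bool = False
    benchmark_count: int = 0
    submission_count: int = 0


def is_locked(usage: TaskUsage, lock_on_submissions: bool = False) -> bool:
    """Return True if the task should not be editable.

    A task is locked when it is public or attached to at least one benchmark.
    Locking on existing submissions is optional because that rule is undecided.
    """
    if usage.is_public or usage.benchmark_count > 0:
        return True
    return lock_on_submissions and usage.submission_count > 0
```

A private task that is not attached to any benchmark stays editable; making it public or attaching it to a benchmark locks it.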
From Ihsan:
If you remove a task from a competition and it is not used in another competition, you can delete the datasets used in this task. Is this supposed to happen?
After deleting all the datasets/programs, you have an empty task.
In my opinion, we need to keep the edit task feature as it is really convenient in many scenarios. If there is a drastic change in the programs, it is the responsibility of the organizers to re-run the submissions.
A compromise: we could add a warning to submissions after a task has been edited, showing something like `Warning: this submission was computed on an older setting` when the cursor is on it.
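One possible way to decide when to show that warning is to compare timestamps. The sketch below assumes the task stores when it was last edited and the submission stores when it was scored; both timestamps and the `stale_warning` helper are hypothetical, not existing Codabench fields.

```python
from datetime import datetime, timezone
from typing import Optional

STALE_WARNING = "Warning: this submission was computed on an older setting"


def stale_warning(task_updated_at: datetime,
                  submission_scored_at: datetime) -> Optional[str]:
    """Return the warning text if the task was edited after the submission ran."""
    if task_updated_at > submission_scored_at:
        return STALE_WARNING
    return None


# Usage: a submission scored in January, task edited in March.
scored = datetime(2024, 1, 10, tzinfo=timezone.utc)
edited = datetime(2024, 3, 2, tzinfo=timezone.utc)
message = stale_warning(task_updated_at=edited, submission_scored_at=scored)
```

Organizers would still be responsible for re-running submissions; the warning only flags scores that predate the latest edit.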
The way the task validation status is returned is weird. The task serializer always returns `False` because the task has no field named `validated`. The Task model has a field named `_validated`, which is computed in a weird way, and from the comment in the code it looks like it is not complete.
Task model:
https://github.com/codalab/codabench/blob/de55ef3fc3642194d194fb79ab61aad6a3e25d6d/src/apps/tasks/models.py#L30
For now, the task validation tests always pass because both the tests and the validation itself are more or less hard-coded. Both should be fixed.
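To illustrate the naming mismatch in isolation, here is a plain-Python sketch. It does not reproduce the actual serializer or model code: the classes and `serialize_validated` are hypothetical, and the `getattr` fallback is only an assumption about how a lookup on a missing name could end up always yielding `False`.

```python
class TaskWithoutProperty:
    """Stores the status under `_validated` only, as described above."""

    def __init__(self, validated: bool) -> None:
        self._validated = validated


class TaskWithProperty(TaskWithoutProperty):
    """Same storage, but exposes it under the public name `validated`."""

    @property
    def validated(self) -> bool:
        return self._validated


def serialize_validated(task) -> bool:
    # Assumption: the lookup falls back to False when `validated` is missing.
    return getattr(task, "validated", False)
```

In this sketch a validated `TaskWithoutProperty` still serializes as `False`, while the version with the property reports the stored value. Whichever fix is chosen, a test that checks both a validated and a non-validated task would stop the validation tests from passing trivially.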
A `Task` in Codabench is considered atomic. A discussion is needed for the following:
Some discussion below:
Task Validation
A task is considered `validated` when one of the following happens:
After either of the above is done, i.e. the `Task` is validated, this task cannot be updated anymore. If the task still needs to be updated after that, this should be discussed and planned properly. Some suggestions below:
Task Update
A task can be updated only if it is not validated. How can a task be modified:
More:
Requests:
Several people requested the discussed features. They should be invited to the discussion:
Related issues and PRs:
#742
#1070
#1071
#1072