Cloud-CV / EvalAI

:cloud: :rocket: :bar_chart: :chart_with_upwards_trend: Evaluating state of the art in AI
https://eval.ai

Remove AWS infrastructure for code upload challenge if disapproved #4377

Open MinhThieu145 opened 5 months ago

MinhThieu145 commented 5 months ago

This pull request removes the AWS infrastructure for the code upload challenge if the challenge is disapproved by the admin.

codecov-commenter commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 69.30%. Comparing base (96968d6) to head (6c8c31e). Report is 1110 commits behind head on master.

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           master    #4377      +/-   ##
==========================================
- Coverage   72.93%   69.30%   -3.63%
==========================================
  Files          83       20      -63
  Lines        5368     3574    -1794
==========================================
- Hits         3915     2477    -1438
+ Misses       1453     1097     -356
```

[see 64 files with indirect coverage changes](https://app.codecov.io/gh/Cloud-CV/EvalAI/pull/4377/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None)

------

[Continue to review full report in Codecov by Sentry](https://app.codecov.io/gh/Cloud-CV/EvalAI/pull/4377?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None).

> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://app.codecov.io/gh/Cloud-CV/EvalAI/pull/4377?dropdown=coverage&src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None). Last update [be3c597...6c8c31e](https://app.codecov.io/gh/Cloud-CV/EvalAI/pull/4377?dropdown=coverage&src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None).

gchhablani commented 5 months ago

Here is the expectation:

See this method: https://github.com/Cloud-CV/EvalAI/blob/da7e0680c11a8e60c335c821902211a06ed581de/apps/challenges/aws_utils.py#L1361C1-L1475C15

Now, for the delete-infra flow, suppose it starts with deletion of the IAM roles and then relays to deletion of EFS. Here is what you would write:

@app.task
def destroy_eks_cluster(challenge):
    """
    Destroy EKS cluster. Starts with deletion of EKS and Nodegroup roles, and relays to deletion of EFS.

    Arguments:
        challenge {str} -- JSON-serialized challenge object (deserialized below)
    """
    from .models import ChallengeEvaluationCluster
    from .serializers import ChallengeEvaluationClusterSerializer
    from .utils import get_aws_credentials_for_challenge

    for obj in serializers.deserialize("json", challenge):
        challenge_obj = obj.object
    challenge_aws_keys = get_aws_credentials_for_challenge(challenge_obj.pk)
    client = get_boto3_client("iam", challenge_aws_keys)
    eks_role_arn = ...
    try:    
        <LOGIC FOR EKS CLUSTER ROLE DELETION>
    except ClientError as e:
        logger.exception(e)
        return
    waiter = client.get_waiter("role_deleted") # TODO: Find correct argument
    waiter.wait(<TODO>)

    node_group_role_name = "evalai-code-upload-nodegroup-role-{}".format(
        environment_suffix
    )
    node_group_arn_role = ...
    try:    
        <LOGIC FOR EKS NODEGROUP ROLE DELETION>
    except ClientError as e:
        logger.exception(e)
        return
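    # As with the EKS role above, wait for the role to be deleted;
    # "role_exists" waits for existence, which is the opposite of what is needed here.
    waiter = client.get_waiter("role_deleted") # TODO: Find correct argument
    waiter.wait(<TODO>)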

    # Delete custom ECR all access policy attached to node_group_role
    ecr_all_access_policy_arn = ...
    try:
        <LOGIC TO DELETE CUSTOM ECR ALL ACCESS POLICY>
    except ClientError as e:
        logger.exception(e)
        return
    # Remove these details from the evaluation cluster on backend
    try:
        challenge_evaluation_cluster = ChallengeEvaluationCluster.objects.get(
            challenge=challenge_obj
        )
        serializer = ChallengeEvaluationClusterSerializer(
            challenge_evaluation_cluster,
            data={
                "eks_arn_role": '',
                "node_group_arn_role": '',
                "ecr_all_access_policy_arn": '',
            },
            partial=True,
        )
        if serializer.is_valid():
            serializer.save()
        # Delete efs
        delete_efs.delay(challenge)
    except Exception as e:
        logger.exception(e)
        return
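
The role-deletion placeholders and the `role_deleted` TODO above could be filled in roughly as follows. This is only a sketch under stated assumptions, not the implementation from the PR: as far as I know, boto3's IAM client has no built-in waiter for role deletion (its waiters only cover existence), so the wait is done by polling `get_role` until it fails with `NoSuchEntity`. The helper names `detach_and_delete_role` and `wait_for_role_deletion` are hypothetical, and only managed policies on the first result page are detached (inline policies, instance profiles, and pagination are not handled).

```python
import time


def _aws_error_code(exc):
    """Return the AWS error code carried by a botocore ClientError-like exception, or None."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code")


def detach_and_delete_role(iam_client, role_name):
    """Detach the managed policies attached to a role, then delete the role.

    Assumption: the role has no inline policies or instance profiles, and
    its attached policies fit on a single result page.
    """
    attached = iam_client.list_attached_role_policies(RoleName=role_name)
    for policy in attached.get("AttachedPolicies", []):
        iam_client.detach_role_policy(
            RoleName=role_name, PolicyArn=policy["PolicyArn"]
        )
    iam_client.delete_role(RoleName=role_name)


def wait_for_role_deletion(
    iam_client, role_name, delay=2.0, max_attempts=20, sleep=time.sleep
):
    """Poll get_role until the role is gone.

    Returns True once get_role fails with NoSuchEntity, or False if the role
    is still visible after max_attempts polls. Any other error is re-raised.
    """
    for _ in range(max_attempts):
        try:
            iam_client.get_role(RoleName=role_name)
        except Exception as exc:  # botocore raises ClientError here
            if _aws_error_code(exc) == "NoSuchEntity":
                return True
            raise
        sleep(delay)
    return False
```

Inside the task, `detach_and_delete_role(client, role_name)` could stand in for a role-deletion placeholder, followed by `wait_for_role_deletion(client, role_name)` before relaying to `delete_efs.delay(challenge)`. Here `role_name` is whatever role name the creation step used.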