NASA-PDS / nucleus

Nucleus is a software platform used to create workflows for the Planetary Data System (PDS).
https://nasa-pds.github.io/nucleus
Apache License 2.0

Display the logs of ECS Tasks in Airflow UI #77

Closed ramesh-maddegoda closed 10 months ago

ramesh-maddegoda commented 11 months ago

At the moment, the logs of ECS tasks are not displayed in the Airflow UI. To read them, you have to check the CloudWatch logs or the ECS task logs.

The ECS task logs should be displayed in the Airflow UI.

tloubrieu-jpl commented 10 months ago

Not seeing the ECS logs in the docker logs is related to the number of log lines allowed (currently only 10) and to other configuration settings, which @ramesh-maddegoda tested.

@ramesh-maddegoda will document how to set these configuration options in the DAGs.

ramesh-maddegoda commented 10 months ago

Updated the tutorial DAG as an example and also updated the tutorial with details.

Pull request: https://github.com/NASA-PDS/nucleus/pull/78/commits/a6ef5ea83b25d1ce6bc3760c73d47923dc7a3a54

Example:


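       # Imports needed by this snippet. The ECSOperator import path is an
       # assumption based on the Amazon provider package for Airflow; newer
       # provider versions name the operator EcsRunTaskOperator.
       from datetime import timedelta
       from airflow.providers.amazon.aws.operators.ecs import ECSOperator

       # dag, ECS_CLUSTER_NAME, ECS_LAUNCH_TYPE, ECS_SECURITY_GROUPS and
       # ECS_SUBNETS are defined elsewhere in the tutorial DAG.

       # PDS Validate Tool
       pds_validate = ECSOperator(
           task_id="Validate_Task",
           dag=dag,
           cluster=ECS_CLUSTER_NAME,
           task_definition="pds-validate-tutorial-task",
           launch_type=ECS_LAUNCH_TYPE,
           network_configuration={
               "awsvpcConfiguration": {
                   "securityGroups": ECS_SECURITY_GROUPS,
                   "subnets": ECS_SUBNETS,
               },
           },
           overrides={
               "containerOverrides": [],
           },
           # CloudWatch log group and stream prefix that the ECS task writes to
           awslogs_group="/pds/ecs/validate",
           awslogs_stream_prefix="ecs/validate",
           # How often to poll CloudWatch for new log lines
           awslogs_fetch_interval=timedelta(seconds=1),
           # Raise the limit on displayed log lines to 500
           number_logs_exception=500,
       )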
In the code above, number_logs_exception=500 raises the limit on displayed log lines, so that at most 500 lines of the related CloudWatch log are shown.
Make sure that the awslogs_group and awslogs_stream_prefix in the ECSOperator definition above
match the awslogs_group and awslogs_stream_prefix that the related ECS task uses to write its logs.
This is how you tell the ECSOperator where to find the logs to read.
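
To make the matching rule concrete, here is a minimal sketch of how the log stream name can be derived from the prefix. It assumes the usual awslogs log driver convention, in which an ECS task writes to a stream named prefix/container-name/task-id, so the awslogs_stream_prefix passed to the operator covers the prefix and container name and the task ID is appended to it. The helper names and the ARN are hypothetical and used only for illustration; they are not part of the Nucleus code base or of Airflow's public API.

```python
def ecs_task_id(task_arn: str) -> str:
    """Return the trailing task ID segment of an ECS task ARN."""
    return task_arn.rsplit("/", 1)[-1]


def expected_log_stream(awslogs_stream_prefix: str, task_arn: str) -> str:
    """Build the CloudWatch log stream name expected for a given task.

    Assumes the stream is named "<awslogs_stream_prefix>/<task id>".
    """
    return f"{awslogs_stream_prefix.rstrip('/')}/{ecs_task_id(task_arn)}"


# Hypothetical task ARN, for illustration only.
arn = "arn:aws:ecs:us-west-2:123456789012:task/example-cluster/0123456789abcdef"

# Prints: ecs/validate/0123456789abcdef
print(expected_log_stream("ecs/validate", arn))
```

If the stream name built this way does not exist in the /pds/ecs/validate log group, the prefix configured in the DAG most likely differs from the one in the ECS task definition.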