PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

Cannot visualise custom logs in the UI both server and cloud #3648

Closed simone-codeluppi closed 3 years ago

simone-codeluppi commented 3 years ago

Description

Hi, thanks a lot for the great product! I am porting my microscopy image analysis pipeline to Prefect, and I currently cannot see custom logs in the UI (the CloudFlowRunner logs are present). I am running Prefect Server on a local cluster and forward ports 8080 and 4200 over SSH to reach the UI and the Apollo server from my laptop. I also start an agent against a specific IP with prefect agent local start --api http://172.22.0.5:4200 (I take the IP from the terminal printout of the prefect server start command). With this setup I can run flows, but I cannot capture logs. I tested the following server settings in the config.toml file, but none of them fixed the issue:


# First attempt
[server]
 [server.ui]
 apollo_url="http://0.0.0.0:4200/graphql"

# Second attempt
[server]
 [server.ui]
 apollo_url="http://localhost:4200/graphql"

Expected Behavior

The logs generated by the task should be visible in the UI. If I understand how things work (please be patient, I am a biologist), the issue may be on my side and may be caused by picking the wrong IP when setting up the system. If that is the case, is there a way to predefine the IP on which the services start? If I am wrong, please let me know. Any help is appreciated!
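One way to narrow down the "which IP is correct" question is to check, from the machine where the agent or the browser runs, which addresses actually accept connections on the Apollo and UI ports. The following is a minimal sketch using only the Python standard library; the helper names can_connect and check_endpoints are hypothetical and not part of Prefect, and a successful TCP connection only shows that something is listening there, not that it is the Prefect API.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def check_endpoints(endpoints, timeout: float = 2.0) -> dict:
    """Map each (host, port) pair to whether it accepted a TCP connection."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host, port in endpoints
    }
```

For the setup described above, calling check_endpoints([("localhost", 4200), ("172.22.0.5", 4200), ("localhost", 8080)]) on both the laptop and the cluster node would show which of the candidate addresses is reachable from each side.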

Reproduction

import time

import prefect
from prefect import task, Flow, Parameter, flatten, unmapped
from prefect.engine.executors import DaskExecutor
from prefect.environments import LocalEnvironment

# MOCK TASK FUNCTION TO BUY TIME
@task(task_run_name=lambda **kwargs: f"testing-logger-writing-logs-{kwargs['x']}-suiname")
def wlog(x):
    logger = prefect.context.get("logger")
    logger.debug('i am debugging')
    # logger = prefect_logging_setup('test')
    logger.info('start sleep')
    time.sleep(20)
    logger.info('done sleep')

a = list(range(10))

# with Flow("test_running",schedule=schedule) as flow:
with Flow("logging-flow",environment=LocalEnvironment(DaskExecutor(address='tcp://193.10.16.58:18938'))) as flow:
    logger = prefect.utilities.logging.get_logger()
    logger.info('this log is generated in the flow')
    out_task = wlog.map(a)
    logger.info('done')
flow.register(project_name="test")

Environment

On-premises HPC managed by HTCondor. Dask cluster spun up using dask-jobqueue.

{
  "config_overrides": {
    "logging": {
      "level": true,
      "log_to_cloud": true
    }
  },
  "env_vars": [],
  "system_information": {
    "platform": "Linux-3.10.0-1062.18.1.el7.x86_64-x86_64-with-centos-7.8.2003-Core",
    "prefect_backend": "server",
    "prefect_version": "0.13.14",
    "python_version": "3.7.6"
  }
}

Output from the slack chat

Mariia Kerimova 2 hours ago Hi Simone! To have more insight into your issue, can you import context inside of your task instead of using the one at the top of the file and provide an update? So instead of

from prefect import task, context

@task
def a():
    logger = context.get("logger")

do

from prefect import task

@task
def a():
    from prefect import context
    logger = context.get("logger")

simone 1 hour ago Hi. Thanks for the help! I moved the context inside the task but i still cannot see the logs. :disappointed:

simone 1 hour ago If I also set log_to_cloud=true I end up getting the following error when submitting the flow:

CRITICAL - CloudHandler | Failed to write log with error: 400 Client Error: Bad Request for url: http://localhost:4200/graphql

This is likely caused by a poorly formatted GraphQL query or mutation. GraphQL sent:

query { mutation($input: write_run_logs_input!) { write_run_logs(input: $input) { success } } }

variables { {"input": {"logs": [
{"flow_run_id": null, "task_run_id": null, "timestamp": "2020-11-11T17:27:24.167740+00:00", "name": "prefect", "message": "this log is generated in the flow", "level": "INFO", "info": {"msg": "this log is generated in the flow", "levelno": 20, "pathname": "pysmFISH/TestLogger_flow.py", "filename": "TestLogger_flow.py", "module": "TestLogger_flow", "exc_info": null, "exc_text": null, "stack_info": null, "lineno": 40, "funcName": "", "msecs": 167.7396297454834, "relativeCreated": 3324.2108821868896, "thread": 139926669195072, "threadName": "MainThread", "processName": "MainProcess", "process": 1578312, "asctime": "2020-11-11 18:27:24+0100"}},
{"flow_run_id": null, "task_run_id": null, "timestamp": "2020-11-11T17:27:24.204183+00:00", "name": "prefect", "message": "done", "level": "INFO", "info": {"msg": "done", "levelno": 20, "pathname": "pysmFISH/TestLogger_flow.py", "filename": "TestLogger_flow.py", "module": "TestLogger_flow", "exc_info": null, "exc_text": null, "stack_info": null, "lineno": 42, "funcName": "", "msecs": 204.18262481689453, "relativeCreated": 3360.653877258301, "thread": 139926669195072, "threadName": "MainThread", "processName": "MainProcess", "process": 1578312, "asctime": "2020-11-11 18:27:24+0100"}}
]}}
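One detail that stands out in the payload is that both log entries carry null for flow_run_id and task_run_id, and both messages come from the log calls inside the with Flow block rather than from the task. The sketch below, using only the standard library, lists such entries from a variables payload. The function name logs_missing_run_ids is hypothetical, the sample is a trimmed copy of the payload above, and whether the missing IDs are what actually triggers the 400 response is an assumption, not something the thread confirms.

```python
import json


def logs_missing_run_ids(variables_json: str) -> list:
    """Return messages of log entries where both run IDs are null."""
    payload = json.loads(variables_json)
    return [
        entry["message"]
        for entry in payload["input"]["logs"]
        if entry.get("flow_run_id") is None and entry.get("task_run_id") is None
    ]


# Trimmed copy of the variables from the error above (only the fields used here).
sample = """
{"input": {"logs": [
  {"flow_run_id": null, "task_run_id": null,
   "message": "this log is generated in the flow", "level": "INFO"},
  {"flow_run_id": null, "task_run_id": null,
   "message": "done", "level": "INFO"}
]}}
"""

print(logs_missing_run_ids(sample))
# prints ['this log is generated in the flow', 'done']
```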

Mariia Kerimova 41 minutes ago The logs defined in the with Flow block will be displayed only when you are initializing the flow; you will not see them during flow execution. I'll try to figure out why you can't see logs from tasks.
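The distinction between code that runs while the flow is being defined and code that runs when the tasks execute can be illustrated with a toy example. This is a sketch in plain Python, not Prefect's API: ToyFlow is a hypothetical stand-in that only collects callables inside the with block and calls them later in run().

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("toy")


class ToyFlow:
    """Hypothetical stand-in for a flow: collects callables, runs them later."""

    def __init__(self, name):
        self.name = name
        self.tasks = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions

    def add(self, fn, *args):
        # Registering a task does not call it.
        self.tasks.append((fn, args))

    def run(self):
        return [fn(*args) for fn, args in self.tasks]


events = []


def wlog(x):
    events.append(f"run: task {x}")
    logger.info("task %s is running", x)
    return x


with ToyFlow("logging-flow") as flow:
    # This executes immediately, while the flow is being defined.
    events.append("build: inside with block")
    logger.info("this log is generated in the flow")
    for x in range(3):
        flow.add(wlog, x)

build_events = list(events)  # snapshot before any task has run
flow.run()                   # only now do the task log calls execute
```

After the with block, build_events holds only the definition-time entry; the task entries appear in events only once flow.run() is called, which mirrors why log calls placed directly in the with block show up when the flow is loaded and not during a run.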

simone 37 minutes ago correct, the ones initialised in the flow are visible when i load the flow.

Mariia Kerimova 14 minutes ago Hmm, don't see why you can't see the logs from the tasks, I would encourage you to open an issue in Github, this case should be investigated.

simone 12 minutes ago sounds good. I will

simone 12 minutes ago thanks a lot for the help!

simone-codeluppi commented 3 years ago

I ran the same code through Prefect Cloud, and the logs are missing there as well.

simone-codeluppi commented 3 years ago

I don't know what it means, but after killing dockerd, restarting it, and running the code again, the logs showed up. BTW: thanks for the great product!