Status: Closed (joppevos closed this issue 7 months ago)
🤖
Hello @joppevos,
I'm Dosu, a bot designed to assist with issues and questions related to the Cosmos repository. I can help you troubleshoot bugs, answer queries, and guide you on becoming a contributor. While we wait for a human maintainer, I'm here to provide immediate support.
I see you've posted an issue, and I'm currently reviewing it. I'll provide you with a thorough response soon. Hang tight!
🤖
Hello @joppevos!
I'm Dosu, a bot here to assist you with your issues, answer your questions, and guide you on your journey to becoming a contributor. While we wait for a human maintainer, please feel free to ask anything you need help with. I'm here to make your experience smoother!
Thank you for bringing this to our attention. The issue you're experiencing is due to the way the logs are currently formatted in the Cosmos codebase. The logs are formatted using Python's built-in logging module and Airflow's CustomTTYColoredFormatter, which doesn't preserve the newline characters from the original dbt output.
To address this, you could modify the get_logger function in the cosmos/log.py file to include newline characters in the log format. However, this might not be the best solution, as it could affect other parts of the logging system.
A better approach might be to modify the parse_output and extract_log_issues functions in the cosmos/dbt/parser/output.py file to preserve the newline characters from the dbt output. These functions are responsible for parsing the dbt output and extracting relevant information. By modifying them, you can ensure that the newline characters are preserved only in the dbt output and not in other parts of the logs.
Here's a rough idea of how you could modify the parse_output function:
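    import logging

    def parse_output(result: FullOutputSubprocessResult, keyword: str) -> str:
        output = result.output
        try:
            # Parsed count, kept from the existing function; unused in this sketch.
            num = int(output.split(f"{keyword}=")[1].split()[0])
        except (ValueError, IndexError):
            logging.error(
                f"Could not parse number of {keyword}s. Check your dbt/airflow version or if --quiet is not being used"
            )
        # This sketch returns the reformatted text rather than the count, so the
        # annotation is str; callers that expect an int would need to change.
        return output.replace('\x1b[0m', '\n')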
And the extract_log_issues function:
    import re
    from typing import List, Tuple

    def extract_log_issues(log_list: List[str]) -> Tuple[List[str], List[str]]:
        # clean_line is assumed to be the helper already defined in cosmos/dbt/parser/output.py.
        test_names = []
        test_results = []
        pattern1 = re.compile(r"\d{2}:\d{2}:\d{2}\s+Warning in test ([\w_]+).*")
        pattern2 = re.compile(r"\d{2}:\d{2}:\d{2}\s+(.*)")
        # Walk the log backwards until the "Finished running" summary line.
        for line_index, line in enumerate(reversed(log_list)):
            cleaned_line = clean_line(line)
            if "Finished running" in cleaned_line:
                break
            if "Warning in test" in cleaned_line:
                test_name = pattern1.sub(r"\1", cleaned_line)
                # The line after the warning (in original order) holds the test result.
                test_result = pattern2.sub(r"\1", clean_line(log_list[-(line_index + 1) + 1]))
                test_names.append(test_name)
                test_results.append(test_result.replace('\x1b[0m', '\n'))
        return test_names, test_results
Please note that this is just a rough idea and the actual implementation might be different. You would need to test these changes thoroughly to ensure they don't introduce any new issues.
I hope this helps! If you have any further questions, feel free to ask.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Hi @joppevos, thanks for raising this. We'd appreciate your contribution!
Hi, @joppevos,
I'm helping the Cosmos team manage their backlog and am marking this issue as stale. From what I understand, you raised this issue because the stderr from dbt is hard to read in Airflow logs when it is displayed on a single line. Dosu-bot provided a detailed response suggesting potential modifications to the cosmos/log.py and cosmos/dbt/parser/output.py files to preserve the newline characters from the dbt output. Tatiana also acknowledged the issue and expressed appreciation for potential contributions.
Could you please confirm if this issue is still relevant to the latest version of the Cosmos repository? If it is, please let the Cosmos team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.
Thank you!
The stderr from dbt is displayed on a single line in Airflow logs, which makes it difficult to read what is going on. We should restore the newlines so the output is displayed just as dbt shows it on the CLI.
This is part of the log that is outputted on the single line:
airflow.exceptions.AirflowException: ('dbt command failed. The command returned a non-zero exit code 1. Details: ', '\x1b[0m07:55:11 Running with dbt=1.5.0', '\x1b[0m07:55:14 Unable to do partial parsing because saved manifest not found. Starting full parse.', '\x1b[0m07:55:18 Found 3 models, 25 tests, 0 snapshots, 0 analyses, 596 macros, 0 operations, 0 seed files, 2 sources, 0 exposures, 0 metrics, 0 groups', '\x1b[0m07:55:18', "\x1b[0m07:55:18 Concurrency: 3 threads (target='prod')", '\x1b[0m07:55:18',
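The shape of this message (a parenthesised, comma-separated list of quoted strings) suggests the exception was raised with several positional arguments, in which case Python renders the whole argument tuple on one line. The sketch below reproduces that behaviour with a plain `Exception` subclass; `FakeAirflowException` is a hypothetical stand-in, and whether Cosmos raises the error this way is an assumption, not something confirmed here.

```python
class FakeAirflowException(Exception):
    """Hypothetical stand-in for airflow.exceptions.AirflowException."""


# Two lines copied from the log excerpt above.
dbt_lines = [
    "\x1b[0m07:55:11 Running with dbt=1.5.0",
    "\x1b[0m07:55:18",
]

# Several positional args: str() falls back to the repr of the args tuple,
# so everything lands on one line and escape codes show up as literal text.
as_tuple = str(FakeAirflowException("dbt command failed. Details: ", *dbt_lines))

# A single pre-joined string: str() returns it unchanged, newlines included.
as_text = str(FakeAirflowException("dbt command failed. Details:\n" + "\n".join(dbt_lines)))

print(as_tuple)
print(as_text)
```

If the assumption holds, joining the lines into one string before raising would be enough to get multi-line output.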
It looks like we can use \x1b or the commas to get back the newline structure.
[Screenshot of Airflow]
I am happy to give it a shot
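A minimal sketch of the idea above, assuming the dbt output is available as a list of strings (one per line, as in the excerpt): strip the ANSI colour codes and join the lines with newlines. `format_dbt_output` and `ANSI_SGR` are hypothetical names, not part of Cosmos.

```python
import re
from typing import Iterable

# Matches ANSI SGR (colour/style) sequences such as "\x1b[0m" or "\x1b[31m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def format_dbt_output(lines: Iterable[str]) -> str:
    """Hypothetical helper: remove colour codes and put each dbt line on its own row."""
    return "\n".join(ANSI_SGR.sub("", line) for line in lines)


# Lines copied from the log excerpt in this issue.
sample = [
    "\x1b[0m07:55:11 Running with dbt=1.5.0",
    "\x1b[0m07:55:18",
    "\x1b[0m07:55:18 Concurrency: 3 threads (target='prod')",
]

print(format_dbt_output(sample))
```

Splitting on the list items rather than on \x1b avoids breaking a line that happens to contain a colour code in the middle.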