apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

Seeing a bug with csv or text alerts on 3.0.0rc1, which were working fine on 2.1.0. - Same issue version 4.0.2 #30368

Open tamarinkeisari opened 1 month ago

tamarinkeisari commented 1 month ago

Seeing a bug with csv or text alerts on 3.0.0rc1, which were working fine on 2.1.0. PNG is working fine on 3.0.0rc1 (which rules out auth/selenium issues), but CSV and text are failing with:

A downstream exception occurred while generating a report: b056bed4-ff76-4e1d-aa0f-0fa7751d3542. Failed generating csv HTTP Error 400: BAD REQUEST
Traceback (most recent call last):
  File "/app/superset/reports/commands/execute.py", line 262, in _get_csv_data
    csv_data = get_chart_csv_data(chart_url=url, auth_cookies=auth_cookies)
  File "/app/superset/utils/csv.py", line 91, in get_chart_csv_data
    response = opener.open(chart_url)
  File "/usr/local/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/local/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: BAD REQUEST

Originally posted by @padbk in https://github.com/apache/superset/discussions/24581#discussioncomment-6425918

I'm encountering the same error in Superset version 4.0.2. Can someone please help?

dosubot[bot] commented 1 month ago

Hey there, @tamarinkeisari! 👋 I'm here to help you with any bugs, questions, or even becoming a contributor. Let's squash those bugs together!

I found a similar unsolved discussion regarding CSV alerts resulting in an HTTP 500 error, which might be related to your issue. You can check it out here [1].

The issue with CSV and text alerts failing with an HTTP 400 error in Superset versions 3.0.0rc1 and 4.0.2 is likely due to the _get_csv_data and _get_embedded_data methods returning None. This results in an "Unexpected missing csv file" or "Unexpected missing embedded data" error. Here is a relevant code snippet:

if (
    feature_flag_manager.is_feature_enabled("ALERTS_ATTACH_REPORTS")
    or self._report_schedule.type == ReportScheduleType.REPORT
):
    if self._report_schedule.report_format == ReportDataFormat.PNG:
        screenshot_data = self._get_screenshots()
        if not screenshot_data:
            error_text = "Unexpected missing screenshot"
    elif self._report_schedule.report_format == ReportDataFormat.PDF:
        pdf_data = self._get_pdf()
        if not pdf_data:
            error_text = "Unexpected missing pdf"
    elif (
        self._report_schedule.chart
        and self._report_schedule.report_format == ReportDataFormat.CSV
    ):
        csv_data = self._get_csv_data()
        if not csv_data:
            error_text = "Unexpected missing csv file"
    if error_text:
        return NotificationContent(
            name=self._report_schedule.name,
            text=error_text,
            header_data=header_data,
        )

if (
    self._report_schedule.chart
    and self._report_schedule.report_format == ReportDataFormat.TEXT
):
    embedded_data = self._get_embedded_data()

The error_text is set when csv_data or embedded_data is None, leading to the failure of the alert [2].

Additionally, the get_chart_csv_data function in superset/utils/csv.py is responsible for fetching the CSV data. Here is its implementation:

def get_chart_csv_data(
    chart_url: str, auth_cookies: Optional[dict[str, str]] = None
) -> Optional[bytes]:
    content = None
    if auth_cookies:
        opener = urllib.request.build_opener()
        cookie_str = ";".join([f"{key}={val}" for key, val in auth_cookies.items()])
        opener.addheaders.append(("Cookie", cookie_str))
        response = opener.open(chart_url)
        content = response.read()
        if response.getcode() != 200:
            raise URLError(response.getcode())
    if content:
        return content
    return None

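This function fetches the CSV data from the provided chart_url, but only when authentication cookies are supplied; otherwise it returns None. Its explicit status check raises a URLError, not an HTTPError, when the response code is not 200. In practice, for a 4xx or 5xx response, opener.open itself raises urllib.error.HTTPError before that check is reached, which matches the traceback above [3].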
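Because the traceback only reports "HTTP Error 400: BAD REQUEST", the server's explanation for the rejection is lost. The following is a minimal sketch, not Superset code, of how the body of the error response could be captured when debugging. The helper name fetch_with_cookies is hypothetical and introduced only for illustration; it relies on the fact that urllib.error.HTTPError is itself a readable response object.

```python
import urllib.error
import urllib.request
from typing import Optional


def fetch_with_cookies(
    url: str, auth_cookies: Optional[dict[str, str]] = None
) -> tuple[int, bytes]:
    """Hypothetical debugging helper (not part of Superset).

    Returns (status_code, body) for both successful and failed requests,
    so the body of a 400 response is kept instead of being discarded.
    """
    opener = urllib.request.build_opener()
    if auth_cookies:
        # Send all cookies in a single Cookie header.
        cookie_str = "; ".join(f"{key}={val}" for key, val in auth_cookies.items())
        opener.addheaders.append(("Cookie", cookie_str))
    try:
        with opener.open(url) as response:
            return response.getcode(), response.read()
    except urllib.error.HTTPError as ex:
        # opener.open raises HTTPError for 4xx/5xx responses; the exception
        # still exposes the status code and the response body.
        return ex.code, ex.read()
```

Calling such a helper with the chart URL and cookies from the failing report, and logging the returned body, may show which part of the request the chart data endpoint rejected.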

To gain more insight into why the HTTP 400 error occurs, you can use the additional logging and error handling introduced in newer versions of Superset. The handle_api_exception decorator and the set_app_error_handlers function log various exceptions in detail, including HTTPException, which covers HTTP 400 errors [4][5][6].


sfirke commented 1 month ago

Would you mind creating a new bug report that follows the template? That would make this a more actionable bug report. Right now all the info is from an old comment about old versions of Superset and that comment was addressed by a pull request already. So it would be helpful if you can describe steps to reproduce this problem in 4.0.2. With no new info, it's hard to do anything about this.

If you do so, you can close this bug report and comment with a link to the new one.