googleapis / python-bigquery-pandas

Google BigQuery connector for pandas
https://googleapis.dev/python/pandas-gbq/latest/index.html
BSD 3-Clause "New" or "Revised" License
447 stars 121 forks source link

tests.system.test_to_gbq: test_series_round_trip[load_parquet-input_series0-api_methods0] failed #501

Closed flaky-bot[bot] closed 1 year ago

flaky-bot[bot] commented 2 years ago

This test failed!

To configure my behavior, see the Flaky Bot documentation.

If I'm commenting on this issue too often, add the flakybot: quiet label and I will stop commenting.


commit: ae6ad6b38c5f2230820b7935080a64f499e1c11b buildURL: Build Status, Sponge status: failed

Test output
dataframe =        test_col     test_col2
0  1.428571e-01  1.428571e-01
1  4.406780e-01  4.406780e-01
2  1.051480e+00  1.051480e+0...4  1.857143e+00  1.857143e+00
5  2.718282e+00  2.718282e+00
6  3.141593e+00  3.141593e+00
7  2.098894e+43  2.098894e+43
destination_table = 'python_bigquery_pandas_tests_system_20220319131718_dafd1b.round_trip_829808'
project_id = 'precise-truck-742', chunksize = None, reauth = False
if_exists = 'fail', auth_local_webserver = True
table_schema = {'fields': [{'name': 'test_col', 'type': 'FLOAT'}, {'name': 'test_col2', 'type': 'FLOAT'}]}
location = None, progress_bar = True
credentials = 
api_method = 'load_parquet', verbose = None, private_key = None

    def to_gbq(
        dataframe,
        destination_table,
        project_id=None,
        chunksize=None,
        reauth=False,
        if_exists="fail",
        auth_local_webserver=True,
        table_schema=None,
        location=None,
        progress_bar=True,
        credentials=None,
        api_method: str = "default",
        verbose=None,
        private_key=None,
    ):
        """Write a DataFrame to a Google BigQuery table.

        The main method a user calls to export pandas DataFrame contents to
        Google BigQuery table.

        This method uses the Google Cloud client library to make requests to
        Google BigQuery, documented `here
        `__.

        See the :ref:`How to authenticate with Google BigQuery `
        guide for authentication instructions.

        Parameters
        ----------
        dataframe : pandas.DataFrame
            DataFrame to be written to a Google BigQuery table.
        destination_table : str
            Name of table to be written, in the form ``dataset.tablename`` or
            ``project.dataset.tablename``.
        project_id : str, optional
            Google Cloud Platform project ID. Optional when available from
            the environment.
        chunksize : int, optional
            Number of rows to be inserted in each chunk from the dataframe.
            Set to ``None`` to load the whole dataframe at once.
        reauth : bool, default False
            Force Google BigQuery to re-authenticate the user. This is useful
            if multiple accounts are used.
        if_exists : str, default 'fail'
            Behavior when the destination table exists. Value can be one of:

            ``'fail'``
                If table exists, do nothing.
            ``'replace'``
                If table exists, drop it, recreate it, and insert data.
            ``'append'``
                If table exists, insert data. Create if does not exist.
        auth_local_webserver : bool, default True
            Use the `local webserver flow
            `_
            instead of the `console flow
            `_
            when getting user credentials. Your code must run on the same machine
            as your web browser and your web browser can access your application
            via ``localhost:808X``.

            .. versionadded:: 0.2.0
        table_schema : list of dicts, optional
            List of BigQuery table fields to which according DataFrame
            columns conform to, e.g. ``[{'name': 'col1', 'type':
            'STRING'},...]``.  The ``type`` values must be BigQuery type names.

            - If ``table_schema`` is provided, it may contain all or a subset of
              DataFrame columns. If a subset is provided, the rest will be
              inferred from the DataFrame dtypes.  If ``table_schema`` contains
              columns not in the DataFrame, they'll be ignored.
            - If ``table_schema`` is **not** provided, it will be
              generated according to dtypes of DataFrame columns. See
              `Inferring the Table Schema
              `__.
              for a description of the schema inference.

            See `BigQuery API documentation on valid column names
            __.

            .. versionadded:: 0.3.1
        location : str, optional
            Location where the load job should run. See the `BigQuery locations
            documentation
            `__ for a
            list of available locations. The location must match that of the
            target dataset.

            .. versionadded:: 0.5.0
        progress_bar : bool, default True
            Use the library `tqdm` to show the progress bar for the upload,
            chunk by chunk.

            .. versionadded:: 0.5.0
        credentials : google.auth.credentials.Credentials, optional
            Credentials for accessing Google APIs. Use this parameter to override
            default credentials, such as to use Compute Engine
            :class:`google.auth.compute_engine.Credentials` or Service Account
            :class:`google.oauth2.service_account.Credentials` directly.

            .. versionadded:: 0.8.0
        api_method : str, optional
            API method used to upload DataFrame to BigQuery. One of "load_parquet",
            "load_csv". Default "load_parquet" if pandas is version 1.1.0+,
            otherwise "load_csv".

            .. versionadded:: 0.16.0
        verbose : bool, deprecated
            Deprecated in Pandas-GBQ 0.4.0. Use the `logging module
            to adjust verbosity instead
            `__.
        private_key : str, deprecated
            Deprecated in pandas-gbq version 0.8.0. Use the ``credentials``
            parameter and
            :func:`google.oauth2.service_account.Credentials.from_service_account_info`
            or
            :func:`google.oauth2.service_account.Credentials.from_service_account_file`
            instead.
        """

        _test_google_api_imports()

        from google.api_core import exceptions as google_exceptions
        from google.cloud import bigquery

        if verbose is not None and FEATURES.pandas_has_deprecated_verbose:
            warnings.warn(
                "verbose is deprecated and will be removed in "
                "a future version. Set logging level in order to vary "
                "verbosity",
                FutureWarning,
                stacklevel=1,
            )

        if api_method == "default":
            # Avoid using parquet if pandas doesn't support lossless conversions to
            # parquet timestamp. See: https://stackoverflow.com/a/69758676/101923
            if FEATURES.pandas_has_parquet_with_lossless_timestamp:
                api_method = "load_parquet"
            else:
                api_method = "load_csv"

        if chunksize is not None:
            if api_method == "load_parquet":
                warnings.warn(
                    "chunksize is ignored when using api_method='load_parquet'",
                    DeprecationWarning,
                    stacklevel=2,
                )
            else:
                warnings.warn(
                    "chunksize will be ignored when using api_method='load_csv' in a future version of pandas-gbq",
                    PendingDeprecationWarning,
                    stacklevel=2,
                )

        if if_exists not in ("fail", "replace", "append"):
            raise ValueError("'{0}' is not valid for if_exists".format(if_exists))

        if "." not in destination_table:
            raise NotFoundException(
                "Invalid Table Name. Should be of the form 'datasetId.tableId' or "
                "'projectId.datasetId.tableId'"
            )

        connector = GbqConnector(
            project_id,
            reauth=reauth,
            auth_local_webserver=auth_local_webserver,
            location=location,
            credentials=credentials,
            private_key=private_key,
        )
        bqclient = connector.client

        destination_table_ref = bigquery.table.TableReference.from_string(
            destination_table, default_project=connector.project_id
        )

        project_id_table = destination_table_ref.project
        dataset_id = destination_table_ref.dataset_id
        table_id = destination_table_ref.table_id

        default_schema = _generate_bq_schema(dataframe)
        if not table_schema:
            table_schema = default_schema
        else:
            table_schema = pandas_gbq.schema.update_schema(
                default_schema, dict(fields=table_schema)
            )

        # If table exists, check if_exists parameter
        try:
>           table = bqclient.get_table(destination_table_ref)

pandas_gbq/gbq.py:1149: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
table = TableReference(DatasetReference('precise-truck-742', 'python_bigquery_pandas_tests_system_20220319131718_dafd1b'), 'round_trip_829808')
retry = , timeout = None

    def get_table(
        self,
        table: Union[Table, TableReference, TableListItem, str],
        retry: retries.Retry = DEFAULT_RETRY,
        timeout: TimeoutType = DEFAULT_TIMEOUT,
    ) -> Table:
        """Fetch the table referenced by ``table``.

        Args:
            table (Union[ \
                google.cloud.bigquery.table.Table, \
                google.cloud.bigquery.table.TableReference, \
                google.cloud.bigquery.table.TableListItem, \
                str, \
            ]):
                A reference to the table to fetch from the BigQuery API.
                If a string is passed in, this method attempts to create a
                table reference from a string using
                :func:`google.cloud.bigquery.table.TableReference.from_string`.
            retry (Optional[google.api_core.retry.Retry]):
                How to retry the RPC.
            timeout (Optional[float]):
                The number of seconds to wait for the underlying HTTP transport
                before using ``retry``.

        Returns:
            google.cloud.bigquery.table.Table:
                A ``Table`` instance.
        """
        table_ref = _table_arg_to_table_ref(table, default_project=self.project)
        path = table_ref.path
        span_attributes = {"path": path}
>       api_response = self._call_api(
            retry,
            span_name="BigQuery.getTable",
            span_attributes=span_attributes,
            method="GET",
            path=path,
            timeout=timeout,
        )

.nox/system-3-9/lib/python3.9/site-packages/google/cloud/bigquery/client.py:1034: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
retry = 
span_name = 'BigQuery.getTable'
span_attributes = {'path': '/projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20220319131718_dafd1b/tables/round_trip_829808'}
job_ref = None, headers = None
kwargs = {'method': 'GET', 'path': '/projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20220319131718_dafd1b/tables/round_trip_829808', 'timeout': None}
call = .retry_wrapped_func at 0x7fdecd65b790>

    def _call_api(
        self,
        retry,
        span_name=None,
        span_attributes=None,
        job_ref=None,
        headers: Optional[Dict[str, str]] = None,
        **kwargs,
    ):
        kwargs = _add_server_timeout_header(headers, kwargs)
        call = functools.partial(self._connection.api_request, **kwargs)

        if retry:
            call = retry(call)

        if span_name is not None:
            with create_span(
                name=span_name, attributes=span_attributes, client=self, job_ref=job_ref
            ):
>               return call()

.nox/system-3-9/lib/python3.9/site-packages/google/cloud/bigquery/client.py:782: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (), kwargs = {}
target = functools.partial(functools.partial(

    @functools.wraps(func)
    def retry_wrapped_func(*args, **kwargs):
        """A wrapper that calls target function with retry."""
        target = functools.partial(func, *args, **kwargs)
        sleep_generator = exponential_sleep_generator(
            self._initial, self._maximum, multiplier=self._multiplier
        )
>       return retry_target(
            target,
            self._predicate,
            sleep_generator,
            self._deadline,
            on_error=on_error,
        )

.nox/system-3-9/lib/python3.9/site-packages/google/api_core/retry.py:283: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

target = functools.partial(functools.partial(
sleep_generator = 
deadline = 600.0, on_error = None

    def retry_target(target, predicate, sleep_generator, deadline, on_error=None):
        """Call a function and retry if it fails.

        This is the lowest-level retry helper. Generally, you'll use the
        higher-level retry helper :class:`Retry`.

        Args:
            target(Callable): The function to call and retry. This must be a
                nullary function - apply arguments with `functools.partial`.
            predicate (Callable[Exception]): A callable used to determine if an
                exception raised by the target should be considered retryable.
                It should return True to retry or False otherwise.
            sleep_generator (Iterable[float]): An infinite iterator that determines
                how long to sleep between retries.
            deadline (float): How long to keep retrying the target. The last sleep
                period is shortened as necessary, so that the last retry runs at
                ``deadline`` (and not considerably beyond it).
            on_error (Callable[Exception]): A function to call while processing a
                retryable exception.  Any error raised by this function will *not*
                be caught.

        Returns:
            Any: the return value of the target function.

        Raises:
            google.api_core.RetryError: If the deadline is exceeded while retrying.
            ValueError: If the sleep generator stops yielding values.
            Exception: If the target raises a method that isn't retryable.
        """
        if deadline is not None:
            deadline_datetime = datetime_helpers.utcnow() + datetime.timedelta(
                seconds=deadline
            )
        else:
            deadline_datetime = None

        last_exc = None

        for sleep in sleep_generator:
            try:
>               return target()

.nox/system-3-9/lib/python3.9/site-packages/google/api_core/retry.py:190: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
method = 'GET'
path = '/projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20220319131718_dafd1b/tables/round_trip_829808'
query_params = None, data = None, content_type = None, headers = None
api_base_url = None, api_version = None, expect_json = True
_target_object = None, timeout = None

    def api_request(
        self,
        method,
        path,
        query_params=None,
        data=None,
        content_type=None,
        headers=None,
        api_base_url=None,
        api_version=None,
        expect_json=True,
        _target_object=None,
        timeout=_DEFAULT_TIMEOUT,
    ):
        """Make a request over the HTTP transport to the API.

        You shouldn't need to use this method, but if you plan to
        interact with the API using these primitives, this is the
        correct one to use.

        :type method: str
        :param method: The HTTP method name (ie, ``GET``, ``POST``, etc).
                       Required.

        :type path: str
        :param path: The path to the resource (ie, ``'/b/bucket-name'``).
                     Required.

        :type query_params: dict or list
        :param query_params: A dictionary of keys and values (or list of
                             key-value pairs) to insert into the query
                             string of the URL.

        :type data: str
        :param data: The data to send as the body of the request. Default is
                     the empty string.

        :type content_type: str
        :param content_type: The proper MIME type of the data provided. Default
                             is None.

        :type headers: dict
        :param headers: extra HTTP headers to be sent with the request.

        :type api_base_url: str
        :param api_base_url: The base URL for the API endpoint.
                             Typically you won't have to provide this.
                             Default is the standard API base URL.

        :type api_version: str
        :param api_version: The version of the API to call.  Typically
                            you shouldn't provide this and instead use
                            the default for the library.  Default is the
                            latest API version supported by
                            google-cloud-python.

        :type expect_json: bool
        :param expect_json: If True, this method will try to parse the
                            response as JSON and raise an exception if
                            that cannot be done.  Default is True.

        :type _target_object: :class:`object`
        :param _target_object:
            (Optional) Protected argument to be used by library callers. This
            can allow custom behavior, for example, to defer an HTTP request
            and complete initialization of the object at a later time.

        :type timeout: float or tuple
        :param timeout: (optional) The amount of time, in seconds, to wait
            for the server response.

            Can also be passed as a tuple (connect_timeout, read_timeout).
            See :meth:`requests.Session.request` documentation for details.

        :raises ~google.cloud.exceptions.GoogleCloudError: if the response code
            is not 200 OK.
        :raises ValueError: if the response content type is not JSON.
        :rtype: dict or str
        :returns: The API response payload, either as a raw string or
                  a dictionary if the response is valid JSON.
        """
        url = self.build_api_url(
            path=path,
            query_params=query_params,
            api_base_url=api_base_url,
            api_version=api_version,
        )

        # Making the executive decision that any dictionary
        # data will be sent properly as JSON.
        if data and isinstance(data, dict):
            data = json.dumps(data)
            content_type = "application/json"

        response = self._make_request(
            method=method,
            url=url,
            data=data,
            content_type=content_type,
            headers=headers,
            target_object=_target_object,
            timeout=timeout,
        )

        if not 200 <= response.status_code < 300:
>           raise exceptions.from_http_response(response)
E           google.api_core.exceptions.NotFound: 404 GET https://bigquery.googleapis.com/bigquery/v2/projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20220319131718_dafd1b/tables/round_trip_829808?prettyPrint=false: Not found: Dataset precise-truck-742:python_bigquery_pandas_tests_system_20220319131718_dafd1b

.nox/system-3-9/lib/python3.9/site-packages/google/cloud/_http/__init__.py:480: NotFound

During handling of the above exception, another exception occurred:

self = 
table_id = 'round_trip_829808'
schema = {'fields': [{'name': 'test_col', 'type': 'FLOAT'}, {'name': 'test_col2', 'type': 'FLOAT'}]}

    def create(self, table_id, schema):
        """Create a table in Google BigQuery given a table and schema

        Parameters
        ----------
        table : str
            Name of table to be written
        schema : str
            Use the generate_bq_schema to generate your table schema from a
            dataframe.
        """
        from google.cloud.bigquery import DatasetReference
        from google.cloud.bigquery import Table
        from google.cloud.bigquery import TableReference

        if self.exists(table_id):
            raise TableCreationError("Table {0} already exists".format(table_id))

        if not _Dataset(self.project_id, credentials=self.credentials).exists(
            self.dataset_id
        ):
            _Dataset(
                self.project_id, credentials=self.credentials, location=self.location,
            ).create(self.dataset_id)

        table_ref = TableReference(
            DatasetReference(self.project_id, self.dataset_id), table_id
        )
        table = Table(table_ref)
        table.schema = pandas_gbq.schema.to_google_cloud_bigquery(schema)

        try:
>           self.client.create_table(table)

pandas_gbq/gbq.py:1322: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
table = Table(TableReference(DatasetReference('precise-truck-742', 'python_bigquery_pandas_tests_system_20220319131718_dafd1b'), 'round_trip_829808'))
exists_ok = False
retry = , timeout = None

    def create_table(
        self,
        table: Union[str, Table, TableReference, TableListItem],
        exists_ok: bool = False,
        retry: retries.Retry = DEFAULT_RETRY,
        timeout: TimeoutType = DEFAULT_TIMEOUT,
    ) -> Table:
        """API call:  create a table via a PUT request

        See
        https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/insert

        Args:
            table (Union[ \
                google.cloud.bigquery.table.Table, \
                google.cloud.bigquery.table.TableReference, \
                google.cloud.bigquery.table.TableListItem, \
                str, \
            ]):
                A :class:`~google.cloud.bigquery.table.Table` to create.
                If ``table`` is a reference, an empty table is created
                with the specified ID. The dataset that the table belongs to
                must already exist.
            exists_ok (Optional[bool]):
                Defaults to ``False``. If ``True``, ignore "already exists"
                errors when creating the table.
            retry (Optional[google.api_core.retry.Retry]):
                How to retry the RPC.
            timeout (Optional[float]):
                The number of seconds to wait for the underlying HTTP transport
                before using ``retry``.

        Returns:
            google.cloud.bigquery.table.Table:
                A new ``Table`` returned from the service.

        Raises:
            google.cloud.exceptions.Conflict:
                If the table already exists.
        """
        table = _table_arg_to_table(table, default_project=self.project)
        dataset_id = table.dataset_id
        path = "/projects/%s/datasets/%s/tables" % (table.project, dataset_id)
        data = table.to_api_repr()
        try:
            span_attributes = {"path": path, "dataset_id": dataset_id}
>           api_response = self._call_api(
                retry,
                span_name="BigQuery.createTable",
                span_attributes=span_attributes,
                method="POST",
                path=path,
                data=data,
                timeout=timeout,
            )

.nox/system-3-9/lib/python3.9/site-packages/google/cloud/bigquery/client.py:748: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
retry = 
span_name = 'BigQuery.createTable'
span_attributes = {'dataset_id': 'python_bigquery_pandas_tests_system_20220319131718_dafd1b', 'path': '/projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20220319131718_dafd1b/tables'}
job_ref = None, headers = None
kwargs = {'data': {'labels': {}, 'schema': {'fields': [{'mode': 'NULLABLE', 'name': 'test_col', 'type': 'FLOAT'}, {'mode': 'NUL...projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20220319131718_dafd1b/tables', 'timeout': None}
call = .retry_wrapped_func at 0x7fdecece7ca0>

    def _call_api(
        self,
        retry,
        span_name=None,
        span_attributes=None,
        job_ref=None,
        headers: Optional[Dict[str, str]] = None,
        **kwargs,
    ):
        kwargs = _add_server_timeout_header(headers, kwargs)
        call = functools.partial(self._connection.api_request, **kwargs)

        if retry:
            call = retry(call)

        if span_name is not None:
            with create_span(
                name=span_name, attributes=span_attributes, client=self, job_ref=job_ref
            ):
>               return call()

.nox/system-3-9/lib/python3.9/site-packages/google/cloud/bigquery/client.py:782: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (), kwargs = {}
target = functools.partial(functools.partial(

    @functools.wraps(func)
    def retry_wrapped_func(*args, **kwargs):
        """A wrapper that calls target function with retry."""
        target = functools.partial(func, *args, **kwargs)
        sleep_generator = exponential_sleep_generator(
            self._initial, self._maximum, multiplier=self._multiplier
        )
>       return retry_target(
            target,
            self._predicate,
            sleep_generator,
            self._deadline,
            on_error=on_error,
        )

.nox/system-3-9/lib/python3.9/site-packages/google/api_core/retry.py:283: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

target = functools.partial(functools.partial(
sleep_generator = 
deadline = 600.0, on_error = None

    def retry_target(target, predicate, sleep_generator, deadline, on_error=None):
        """Call a function and retry if it fails.

        This is the lowest-level retry helper. Generally, you'll use the
        higher-level retry helper :class:`Retry`.

        Args:
            target(Callable): The function to call and retry. This must be a
                nullary function - apply arguments with `functools.partial`.
            predicate (Callable[Exception]): A callable used to determine if an
                exception raised by the target should be considered retryable.
                It should return True to retry or False otherwise.
            sleep_generator (Iterable[float]): An infinite iterator that determines
                how long to sleep between retries.
            deadline (float): How long to keep retrying the target. The last sleep
                period is shortened as necessary, so that the last retry runs at
                ``deadline`` (and not considerably beyond it).
            on_error (Callable[Exception]): A function to call while processing a
                retryable exception.  Any error raised by this function will *not*
                be caught.

        Returns:
            Any: the return value of the target function.

        Raises:
            google.api_core.RetryError: If the deadline is exceeded while retrying.
            ValueError: If the sleep generator stops yielding values.
            Exception: If the target raises a method that isn't retryable.
        """
        if deadline is not None:
            deadline_datetime = datetime_helpers.utcnow() + datetime.timedelta(
                seconds=deadline
            )
        else:
            deadline_datetime = None

        last_exc = None

        for sleep in sleep_generator:
            try:
>               return target()

.nox/system-3-9/lib/python3.9/site-packages/google/api_core/retry.py:190: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
method = 'POST'
path = '/projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20220319131718_dafd1b/tables'
query_params = None
data = '{"tableReference": {"projectId": "precise-truck-742", "datasetId": "python_bigquery_pandas_tests_system_2022031913171...name": "test_col", "type": "FLOAT", "mode": "NULLABLE"}, {"name": "test_col2", "type": "FLOAT", "mode": "NULLABLE"}]}}'
content_type = 'application/json', headers = None, api_base_url = None
api_version = None, expect_json = True, _target_object = None, timeout = None

    def api_request(
        self,
        method,
        path,
        query_params=None,
        data=None,
        content_type=None,
        headers=None,
        api_base_url=None,
        api_version=None,
        expect_json=True,
        _target_object=None,
        timeout=_DEFAULT_TIMEOUT,
    ):
        """Make a request over the HTTP transport to the API.

        You shouldn't need to use this method, but if you plan to
        interact with the API using these primitives, this is the
        correct one to use.

        :type method: str
        :param method: The HTTP method name (ie, ``GET``, ``POST``, etc).
                       Required.

        :type path: str
        :param path: The path to the resource (ie, ``'/b/bucket-name'``).
                     Required.

        :type query_params: dict or list
        :param query_params: A dictionary of keys and values (or list of
                             key-value pairs) to insert into the query
                             string of the URL.

        :type data: str
        :param data: The data to send as the body of the request. Default is
                     the empty string.

        :type content_type: str
        :param content_type: The proper MIME type of the data provided. Default
                             is None.

        :type headers: dict
        :param headers: extra HTTP headers to be sent with the request.

        :type api_base_url: str
        :param api_base_url: The base URL for the API endpoint.
                             Typically you won't have to provide this.
                             Default is the standard API base URL.

        :type api_version: str
        :param api_version: The version of the API to call.  Typically
                            you shouldn't provide this and instead use
                            the default for the library.  Default is the
                            latest API version supported by
                            google-cloud-python.

        :type expect_json: bool
        :param expect_json: If True, this method will try to parse the
                            response as JSON and raise an exception if
                            that cannot be done.  Default is True.

        :type _target_object: :class:`object`
        :param _target_object:
            (Optional) Protected argument to be used by library callers. This
            can allow custom behavior, for example, to defer an HTTP request
            and complete initialization of the object at a later time.

        :type timeout: float or tuple
        :param timeout: (optional) The amount of time, in seconds, to wait
            for the server response.

            Can also be passed as a tuple (connect_timeout, read_timeout).
            See :meth:`requests.Session.request` documentation for details.

        :raises ~google.cloud.exceptions.GoogleCloudError: if the response code
            is not 200 OK.
        :raises ValueError: if the response content type is not JSON.
        :rtype: dict or str
        :returns: The API response payload, either as a raw string or
                  a dictionary if the response is valid JSON.
        """
        url = self.build_api_url(
            path=path,
            query_params=query_params,
            api_base_url=api_base_url,
            api_version=api_version,
        )

        # Making the executive decision that any dictionary
        # data will be sent properly as JSON.
        if data and isinstance(data, dict):
            data = json.dumps(data)
            content_type = "application/json"

        response = self._make_request(
            method=method,
            url=url,
            data=data,
            content_type=content_type,
            headers=headers,
            target_object=_target_object,
            timeout=timeout,
        )

        if not 200 <= response.status_code < 300:
>           raise exceptions.from_http_response(response)
E           google.api_core.exceptions.Forbidden: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20220319131718_dafd1b/tables?prettyPrint=false: Access Denied: Dataset precise-truck-742:python_bigquery_pandas_tests_system_20220319131718_dafd1b: Permission bigquery.tables.create denied on dataset precise-truck-742:python_bigquery_pandas_tests_system_20220319131718_dafd1b (or it may not exist).

.nox/system-3-9/lib/python3.9/site-packages/google/cloud/_http/__init__.py:480: Forbidden

During handling of the above exception, another exception occurred:

method_under_test = functools.partial(, project_id='precise-truck-742', credentials=)
random_dataset_id = 'python_bigquery_pandas_tests_system_20220319131718_dafd1b'
read_gbq = functools.partial(, project_id='precise-truck-742', credentials=)
input_series = 0    1.428571e-01
1    4.406780e-01
2    1.051480e+00
3    1.051530e+00
4    1.857143e+00
5    2.718282e+00
6    3.141593e+00
7    2.098894e+43
Name: test_col, dtype: float64
api_method = 'load_parquet', api_methods = {'load_csv', 'load_parquet'}

    @pytest.mark.parametrize(
        ["input_series", "api_methods"],
        [
            # Ensure that 64-bit floating point numbers are unchanged.
            # See: https://github.com/pydata/pandas-gbq/issues/326
            SeriesRoundTripTestCase(
                input_series=pandas.Series(
                    [
                        0.14285714285714285,
                        0.4406779661016949,
                        1.05148,
                        1.05153,
                        1.8571428571428572,
                        2.718281828459045,
                        3.141592653589793,
                        2.0988936657440586e43,
                    ],
                    name="test_col",
                ),
            ),
            SeriesRoundTripTestCase(
                input_series=pandas.Series(
                    [
                        "abc",
                        "defg",
                        # Ensure that unicode characters are encoded. See:
                        # https://github.com/googleapis/python-bigquery-pandas/issues/106
                        "信用卡",
                        "Skywalker™",
                        "hülle",
                    ],
                    name="test_col",
                ),
            ),
            SeriesRoundTripTestCase(
                input_series=pandas.Series(
                    [
                        "abc",
                        "defg",
                        # Ensure that empty strings are written as empty string,
                        # not NULL. See:
                        # https://github.com/googleapis/python-bigquery-pandas/issues/366
                        "",
                        None,
                    ],
                    name="empty_strings",
                ),
                # BigQuery CSV loader uses empty string as the "null marker" by
                # default. Potentially one could choose a rarely used character or
                # string as the null marker to disambiguate null from empty string,
                # but then that string couldn't be loaded.
                # TODO: Revist when custom load job configuration is supported.
                #       https://github.com/googleapis/python-bigquery-pandas/issues/425
                api_methods={"load_parquet"},
            ),
        ],
    )
    def test_series_round_trip(
        method_under_test,
        random_dataset_id,
        read_gbq,
        input_series,
        api_method,
        api_methods,
    ):
        if api_method not in api_methods:
            pytest.skip(f"{api_method} not supported.")
        table_id = f"{random_dataset_id}.round_trip_{random.randrange(1_000_000)}"
        input_series = input_series.sort_values().reset_index(drop=True)
        df = pandas.DataFrame(
            # Some errors only occur in multi-column dataframes. See:
            # https://github.com/googleapis/python-bigquery-pandas/issues/366
            {"test_col": input_series, "test_col2": input_series}
        )
>       method_under_test(df, table_id, api_method=api_method)

tests/system/test_to_gbq.py:110: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas_gbq/gbq.py:1157: in to_gbq
    table_connector.create(table_id, table_schema)
pandas_gbq/gbq.py:1324: in create
    self.process_http_error(ex)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ex = Forbidden('POST https://bigquery.googleapis.com/bigquery/v2/projects/precise-truck-742/datasets/python_bigquery_pandas... denied on dataset precise-truck-742:python_bigquery_pandas_tests_system_20220319131718_dafd1b (or it may not exist).')

    @staticmethod
    def process_http_error(ex):
        # See `BigQuery Troubleshooting Errors
        # `__

        if "cancelled" in ex.message:
            raise QueryTimeout("Reason: {0}".format(ex))

>       raise GenericGBQException("Reason: {0}".format(ex))
E       pandas_gbq.exceptions.GenericGBQException: Reason: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20220319131718_dafd1b/tables?prettyPrint=false: Access Denied: Dataset precise-truck-742:python_bigquery_pandas_tests_system_20220319131718_dafd1b: Permission bigquery.tables.create denied on dataset precise-truck-742:python_bigquery_pandas_tests_system_20220319131718_dafd1b (or it may not exist).

pandas_gbq/gbq.py:386: GenericGBQException
flaky-bot[bot] commented 2 years ago

Looks like this issue is flaky. :worried:

I'm going to leave this open and stop commenting.

A human should fix and close this.


When run at the same commit (ae6ad6b38c5f2230820b7935080a64f499e1c11b), this test passed in one build (Build Status, Sponge) and failed in another build (Build Status, Sponge).