aliyun / aliyun-odps-python-sdk

ODPS Python SDK and data analysis framework
http://pyodps.readthedocs.io
Apache License 2.0
434 stars 97 forks source link

"decimal.InvalidOperation": 写入到本地mysql 数据表失败 #227

Closed raindust closed 6 months ago

raindust commented 8 months ago

我写了如下的数据表用来将 max compute 上的数据导入到本地的一个 mysql 数据表中

def export_sql(sql, table_name, project, mysql_host, o: ODPS):
    with o.execute_sql(sql).open_reader() as reader:
        df = reader.to_pandas()

    engine_url = mysql_host + '/' + project
    engine = create_engine(engine_url)

    try:
        if not database_exists(engine.url):
            create_database(engine.url)
    except exc.SQLAlchemyError:
        print(
            "Database creation failed. Please check your connection settings.")
        return
    df.to_sql(table_name, con=engine, if_exists='replace', index=False)

在执行的时候会报如下的错误:

Traceback (most recent call last):
  File "/home/raindust_x/github/creda/max_compute_exporter/main.py", line 58, in <module>
    main()
  File "/home/raindust_x/github/creda/max_compute_exporter/main.py", line 50, in main
    export_table(table_name=table_name,
  File "/home/raindust_x/github/creda/max_compute_exporter/exporter.py", line 25, in export_table
    export_sql(sql, table_name, project, mysql_host, o)
  File "/home/raindust_x/github/creda/max_compute_exporter/exporter.py", line 8, in export_sql
    df = reader.to_pandas()
  File "/home/raindust_x/miniconda3/envs/creda/lib/python3.10/site-packages/odps/models/readers.py", line 166, in to_pandas
    return super(TunnelRecordReader, self).to_pandas(
  File "/home/raindust_x/miniconda3/envs/creda/lib/python3.10/site-packages/odps/readers.py", line 202, in to_pandas
    return self.to_result_frame(start=start, count=count, **kw).values
  File "/home/raindust_x/miniconda3/envs/creda/lib/python3.10/site-packages/odps/readers.py", line 178, in to_result_frame
    for offset, rec in zip(
  File "/home/raindust_x/miniconda3/envs/creda/lib/python3.10/site-packages/odps/models/readers.py", line 158, in read
    for record in self._retry_iter_reader(
  File "/home/raindust_x/miniconda3/envs/creda/lib/python3.10/site-packages/odps/models/readers.py", line 80, in _retry_iter_reader
    for rec in self._open_and_iter_reader(
  File "/home/raindust_x/miniconda3/envs/creda/lib/python3.10/site-packages/odps/models/readers.py", line 143, in _open_and_iter_reader
    for record in reader[::step]:
  File "/home/raindust_x/miniconda3/envs/creda/lib/python3.10/site-packages/odps/readers.py", line 115, in _iter
    record = next(self)
  File "odps/tunnel/io/reader_c.pyx", line 459, in odps.tunnel.io.reader_c.TunnelRecordReader.__next__
  File "odps/tunnel/io/reader_c.pyx", line 358, in odps.tunnel.io.reader_c.BaseTunnelRecordReader.read
  File "odps/tunnel/io/reader_c.pyx", line 413, in odps.tunnel.io.reader_c.BaseTunnelRecordReader.read
  File "odps/tunnel/io/reader_c.pyx", line 333, in odps.tunnel.io.reader_c.BaseTunnelRecordReader._set_decimal
  File "odps/tunnel/io/reader_c.pyx", line 298, in odps.tunnel.io.reader_c.BaseTunnelRecordReader._set_record_list_value
  File "odps/src/types_c.pyx", line 280, in odps.src.types_c.SchemaSnapshot.validate_value
  File "odps/src/types_c.pyx", line 154, in odps.src.types_c._validate_decimal
decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>]

分析 max compute 上的数据表,我认为是有一个类型为decimal(38,0)的字段在转换的时候报的错,一下是我从数据表中找到的一些样例数据:

1451520000000000000
483840000000000000
2903040000000000000
4838400000000000000
3386880000000000000
3386880000000000000
34836480000000000000
2903040000000000000
104509440000000000000
174182400000000000000
wjsi commented 6 months ago

Fixed in v0.11.6.