exasol / cloud-storage-extension

Exasol Cloud Storage Extension for accessing formatted data (Avro, ORC, and Parquet) on public cloud storage systems
MIT License

Bug in Exasol Cloud Extension when Importing S3 #227

Closed: zfmio closed this issue 1 year ago

zfmio commented 1 year ago

When using the Exasol Cloud Storage Extension to import S3 data into Exasol, I always get the same error message (see below).

None of my table's columns exceeds 200 characters, and the dataset on S3 doesn't contain any values longer than 200 characters. I was able to upload the same dataset through a JDBC connector from Spark without any issues, which is why I assume the problem is with the connector.

The error message is:

```
org.jkiss.dbeaver.model.sql.DBSQLException: SQL-Fehler [22002]: VM error: data exception - string data, right truncation: max length: 200, emitted: 201 (Session: )
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCStatementImpl.executeStatement(JDBCStatementImpl.java:133)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.executeStatement(SQLQueryJob.java:577)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.lambda$1(SQLQueryJob.java:486)
    at org.jkiss.dbeaver.model.exec.DBExecUtils.tryExecuteRecover(DBExecUtils.java:172)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.executeSingleQuery(SQLQueryJob.java:493)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.extractData(SQLQueryJob.java:894)
    at org.jkiss.dbeaver.ui.editors.sql.SQLEditor$QueryResultsContainer.readData(SQLEditor.java:3645)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.lambda$0(ResultSetJobDataRead.java:123)
    at org.jkiss.dbeaver.model.exec.DBExecUtils.tryExecuteRecover(DBExecUtils.java:172)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.run(ResultSetJobDataRead.java:121)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetViewer$ResultSetDataPumpJob.run(ResultSetViewer.java:4949)
    at org.jkiss.dbeaver.model.runtime.AbstractJob.run(AbstractJob.java:105)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)
Caused by: java.sql.SQLDataException: VM error: data exception - string data, right truncation: max length: 200, emitted: 201 (Session: 1747208403672170499)
    at com.exasol.jdbc.ExceptionFactory.createSQLException(ExceptionFactory.java:33)
    at com.exasol.jdbc.EXASQLException.getSQLExceptionIntern(EXASQLException.java:50)
    at com.exasol.jdbc.AbstractEXAStatement.execute(AbstractEXAStatement.java:478)
    at com.exasol.jdbc.EXAStatement.execute(EXAStatement.java:289)
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCStatementImpl.execute(JDBCStatementImpl.java:329)
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCStatementImpl.lambda$0(JDBCStatementImpl.java:131)
    at org.jkiss.dbeaver.utils.SecurityManagerUtils.wrapDriverActions(SecurityManagerUtils.java:94)
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCStatementImpl.executeStatement(JDBCStatementImpl.java:131)
    ... 12 more
```

The DDL for the table I'm trying to import into is:

```sql
CREATE TABLE Schema.A_table (
  Col1      VARCHAR(100) UTF8,
  Col2      VARCHAR(11) UTF8,
  Col3      DECIMAL(18,0),
  Col4      VARCHAR(8) UTF8,
  Date_Col1 DATE,
  Date_Col2 DATE,
  Col5      DECIMAL(18,0)
);
```
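For context, an import with this extension is typically issued in roughly the following shape (a sketch based on the extension's documented usage; the script schema `ETL.IMPORT_PATH`, the bucket path, endpoint, and connection name below are illustrative placeholders, not values from this report):

```sql
-- Illustrative IMPORT via the cloud-storage-extension UDF scripts.
-- BUCKET_PATH, S3_ENDPOINT, and CONNECTION_NAME must match your own setup.
IMPORT INTO Schema.A_table
FROM SCRIPT ETL.IMPORT_PATH WITH
  BUCKET_PATH     = 's3a://my-bucket/path/to/data/*.parquet'
  DATA_FORMAT     = 'PARQUET'
  S3_ENDPOINT     = 's3.eu-central-1.amazonaws.com'
  CONNECTION_NAME = 'S3_CONNECTION';
```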

Please let me know if you need any further info.

morazow commented 1 year ago

Hello @zfmio,

Thanks for the feedback!

We will look into the issue; it is tracked internally as EXA-42414.

morazow commented 1 year ago

The solution was to update the `filename` column size in the IMPORT_METADATA script:

```sql
CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...)
EMITS (filename VARCHAR(200), partition_index VARCHAR(100)) AS
  %scriptclass com.exasol.cloudetl.scriptclasses.ImportMetadata;
  %jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-<VERSION>.jar;
/
```

The fix is to increase the `filename` column size from `VARCHAR(200)` to a larger value, so that S3 file paths longer than 200 characters fit.
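For example, the adjusted script could look like this (a sketch; `VARCHAR(2000)` is an arbitrary larger size, and `<BUCKET>`/`<VERSION>` remain placeholders for your BucketFS path and extension version):

```sql
-- Same script as above, with only the filename size increased.
CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...)
EMITS (filename VARCHAR(2000), partition_index VARCHAR(100)) AS
  %scriptclass com.exasol.cloudetl.scriptclasses.ImportMetadata;
  %jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-<VERSION>.jar;
/
```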