datacleaner / DataCleaner

The premier open source Data Quality solution
GNU Lesser General Public License v3.0

Multithreaded inserting of rows into SQL Server fails #1201

Closed LosD closed 8 years ago

LosD commented 8 years ago

Jobs that insert a large number of rows (500,000+) into a SQL Server table fail when running in normal (multithreaded) mode, even when multiple connections are disabled.

LosD commented 8 years ago

This may be an issue with the jTDS driver. At least uPortal has had problems with it that required special settings to work around, and they recommend Microsoft's own driver instead.

At the very least, it is probably a good idea to compare the pros and cons of the two drivers.
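For anyone who wants to try the comparison, here is a minimal sketch of how the connection settings differ between the two drivers. The URL prefixes and driver class names are the documented ones for jTDS and the Microsoft JDBC driver. The class `DriverUrls`, its method names, and the host, port, and database values are hypothetical placeholders for illustration.

```java
public class DriverUrls {

    // Driver class names as published by each project.
    static final String JTDS_DRIVER = "net.sourceforge.jtds.jdbc.Driver";
    static final String MS_DRIVER = "com.microsoft.sqlserver.jdbc.SQLServerDriver";

    /** jTDS puts the database name in the path part of the URL. */
    static String jtdsUrl(String host, int port, String database) {
        return "jdbc:jtds:sqlserver://" + host + ":" + port + "/" + database;
    }

    /** The Microsoft driver takes the database name as a ';'-separated property. */
    static String microsoftUrl(String host, int port, String database) {
        return "jdbc:sqlserver://" + host + ":" + port + ";databaseName=" + database;
    }

    public static void main(String[] args) {
        // Placeholder values; substitute the real server details.
        System.out.println(jtdsUrl("dbhost", 1433, "mydb"));
        System.out.println(microsoftUrl("dbhost", 1433, "mydb"));
    }
}
```

Running the same failing job once with each URL (and the matching driver jar on the classpath) should show whether the problem is specific to jTDS.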

ClaudiaPHI commented 8 years ago

Multithreaded insertion into SQL Server fails if there is only one connection to the database.

Could not rollback transaction: rollback() should not be called while in auto-commit mode.

org.apache.metamodel.MetaModelException: Could not rollback transaction: rollback() should not be called while in auto-commit mode.
    at org.apache.metamodel.jdbc.JdbcUtils.wrapException(JdbcUtils.java:61)
    at org.apache.metamodel.jdbc.JdbcUpdateCallback.commitOrRollback(JdbcUpdateCallback.java:112)
    at org.apache.metamodel.jdbc.JdbcUpdateCallback.close(JdbcUpdateCallback.java:85)
    at org.apache.metamodel.jdbc.JdbcDataContext.executeUpdate(JdbcDataContext.java:850)
    at org.datacleaner.beans.writers.InsertIntoTableAnalyzer.run(InsertIntoTableAnalyzer.java:414)
    at org.datacleaner.beans.writers.InsertIntoTableAnalyzer.run(InsertIntoTableAnalyzer.java:76)
    at org.datacleaner.util.WriteBuffer.flushBuffer(WriteBuffer.java:84)
    at org.datacleaner.util.WriteBuffer.addToBuffer(WriteBuffer.java:60)
    at org.datacleaner.beans.writers.InsertIntoTableAnalyzer.run(InsertIntoTableAnalyzer.java:377)
    at org.datacleaner.job.runner.AnalyzerConsumer.consumeInternal(AnalyzerConsumer.java:71)
    at org.datacleaner.job.runner.AbstractRowProcessingConsumer.consume(AbstractRowProcessingConsumer.java:159)
    at org.datacleaner.job.runner.ConsumeRowHandlerDelegate.consume(ConsumeRowHandlerDelegate.java:64)
    at org.datacleaner.job.runner.ConsumeRowHandler.consumeRow(ConsumeRowHandler.java:146)
    at org.datacleaner.job.tasks.ConsumeRowTask.execute(ConsumeRowTask.java:51)
    at org.datacleaner.job.concurrent.TaskRunnable.run(TaskRunnable.java:61)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
org.apache.metamodel.MetaModelException: Could not get schema names: The statement is closed.
    at org.apache.metamodel.jdbc.JdbcUtils.wrapException(JdbcUtils.java:61)
    at org.apache.metamodel.jdbc.JdbcDataContext.getSchemaNamesInternal(JdbcDataContext.java:805)
    at org.apache.metamodel.AbstractDataContext.getSchemaNames(AbstractDataContext.java:110)
    at org.apache.metamodel.AbstractDataContext.getSchemaByName(AbstractDataContext.java:203)
    at org.datacleaner.connection.SchemaNavigator.getSchemaByName(SchemaNavigator.java:60)
    at org.datacleaner.connection.SchemaNavigator.convertToTable(SchemaNavigator.java:68)
    at org.datacleaner.connection.SchemaNavigator.convertToColumns(SchemaNavigator.java:106)
    at org.datacleaner.beans.writers.InsertIntoTableAnalyzer.run(InsertIntoTableAnalyzer.java:407)
    at org.datacleaner.beans.writers.InsertIntoTableAnalyzer.run(InsertIntoTableAnalyzer.java:76)
    at org.datacleaner.util.WriteBuffer.flushBuffer(WriteBuffer.java:84)
    at org.datacleaner.util.WriteBuffer.addToBuffer(WriteBuffer.java:60)
    at org.datacleaner.beans.writers.InsertIntoTableAnalyzer.run(InsertIntoTableAnalyzer.java:377)
    at org.datacleaner.job.runner.AnalyzerConsumer.consumeInternal(AnalyzerConsumer.java:71)
    at org.datacleaner.job.runner.AbstractRowProcessingConsumer.consume(AbstractRowProcessingConsumer.java:159)
    at org.datacleaner.job.runner.ConsumeRowHandlerDelegate.consume(ConsumeRowHandlerDelegate.java:64)
    at org.datacleaner.job.runner.ConsumeRowHandler.consumeRow(ConsumeRowHandler.java:146)
    at org.datacleaner.job.tasks.ConsumeRowTask.execute(ConsumeRowTask.java:51)
    at org.datacleaner.job.concurrent.TaskRunnable.run(TaskRunnable.java:61)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
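
The first exception above is what JDBC specifies for `Connection.rollback()` when the connection is in auto-commit mode. One way to end up there, offered as an assumption and not a confirmed diagnosis of this issue, is two writer tasks sharing a single connection: one task finishes and switches auto-commit back on while the other is still inside its transaction. The sketch below replays that interleaving in a single thread so that it is deterministic. `FakeConnection` is a hypothetical stand-in that models only the auto-commit flag; it is not the jTDS or MetaModel implementation, and the message text is copied from the report above.

```java
import java.sql.SQLException;

public class SharedConnectionSketch {

    /** Hypothetical stand-in for a JDBC connection; models only the auto-commit flag. */
    static final class FakeConnection {
        private boolean autoCommit = true;

        void setAutoCommit(boolean value) {
            autoCommit = value;
        }

        void commit() throws SQLException {
            if (autoCommit) {
                throw new SQLException("commit() should not be called while in auto-commit mode.");
            }
        }

        void rollback() throws SQLException {
            if (autoCommit) {
                throw new SQLException("rollback() should not be called while in auto-commit mode.");
            }
        }
    }

    /** Replays one unlucky interleaving of two writers (A and B) on one shared connection. */
    static String replayInterleaving() {
        FakeConnection shared = new FakeConnection();
        try {
            shared.setAutoCommit(false); // writer A starts its batch
            shared.setAutoCommit(false); // writer B starts its batch on the same connection
            shared.commit();             // writer A commits its batch
            shared.setAutoCommit(true);  // writer A restores auto-commit
            shared.rollback();           // writer B hits an error and tries to roll back
            return "ok";
        } catch (SQLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(replayInterleaving());
    }
}
```

If this is what happens, then either giving each writer its own connection or serializing the whole begin/insert/commit-or-rollback sequence on the shared connection would avoid the interleaving.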