darold / ora2pg

Ora2Pg is a free tool used to migrate an Oracle database to a PostgreSQL compatible schema. It connects your Oracle database, scan it automatically and extracts its structure or data, it then generates SQL scripts that you can load into PostgreSQL.
http://www.ora2pg.com/
GNU General Public License v3.0
978 stars 341 forks source link

[ORACLE]Slow export of BLOB and CLOB from #1781

Open DonShamsan opened 1 month ago

DonShamsan commented 1 month ago

VM specification:

vCPUs 4 RAM 16 GiB

Table details:

myTable size: 4 GB Blob.

Export duration: 8 hours.

Max rows per second reached was 75 Command: ora2pg -t COPY -o data.sql -b ./data -c ./config/ora2pg.conf

Invistigation:

  1. Run the query sent by ora2pg to oracle manually: it took milliseconds

  2. Network speed between ora2pg and the server test by iperf3:

    image
  3. Check iops on the vm during the export: max iops reached was 7%

  4. RAM and CPU usage less than 15%

I've also tried with the configuration mentioned in the benchmark (DATA_LIMIT=100, LONGREADLEN=170Mb) and kept changing the value of these parameters

List of other related parameters are listed in PERFORMANCES SECTION and DATA SECTION

Would please advise on how to speedup the export?

ora2pg.conf:

INPUT SECTION (Oracle connection or input file)

USER_GRANTS 1 DEBUG 1

SCHEMA SECTION

EXPORT_SCHEMA 0 SCHEMA mySchema CREATE_SCHEMA 1 COMPILE_SCHEMA 1 PG_SCHEMA mySchema ALLOW myTable NO_FUNCTION_METADATA 0

ENCODING SECTION

FORCE_PLSQL_ENCODING 0

EXPORT SECTION

TYPE TABLE DISABLE_COMMENT 0 NO_EXCLUDED_TABLE 0 NO_VIEW_ORDERING 0 EXTERNAL_TO_FDW 1 TRUNCATE_TABLE 0 USE_TABLESPACE 0 REORDERING_COLUMNS 0 DROP_IF_EXISTS 0 EXPORT_GTT 0

FULL TEXT SEARCH SECTION

CONTEXT_AS_TRGM 0 FTS_INDEX_ONLY 1 USE_UNACCENT 0 USE_LOWER_UNACCENT 0

DATA DIFF SECTION

DATADIFF 0 DATADIFF_UPDATE_BY_PKEY 0 DATADIFF_DEL_SUFFIX _del DATADIFF_UPD_SUFFIX _upd DATADIFF_INS_SUFFIX _ins DATADIFF_WORK_MEM 1024 MB DATADIFF_TEMP_BUFFERS 2048 MB

CONSTRAINT SECTION

KEEP_PKEY_NAMES 1 PKEY_IN_CREATE 1 FKEY_ADD_UPDATE never FKEY_DEFERRABLE 0 DEFER_FKEY 0 DROP_FKEY 1

TRIGGERS AND SEQUENCES SECTION

DISABLE_SEQUENCE 1 DISABLE_TRIGGERS 1

OBJECT MODIFICATION SECTION

PRESERVE_CASE 0 INDEXES_SUFFIX _idx INDEXES_RENAMING 1 USE_INDEX_OPCLASS 0 RENAME_PARTITION 0 DISABLE_PARTITION 0 WITH_OID 0 ORA_RESERVED_WORDS audit,comment,references USE_RESERVED_WORDS 0 DISABLE_UNLOGGED 1 DOUBLE_MAX_VARCHAR 0

OUTPUT SECTION (Control output to file or PostgreSQL database)

OUTPUT output.sql OUTPUT_DIR /path/to/project/blob BZIP2 FILE_PER_CONSTRAINT 1 FILE_PER_INDEX 1 FILE_PER_FKEYS 1 FILE_PER_TABLE 1 FILE_PER_FUNCTION 1 STOP_ON_ERROR 1 COPY_FREEZE 0 CREATE_OR_REPLACE 1 INSERT_ON_CONFLICT 0

TYPE SECTION (Control type behaviors and redefinitions)

PG_NUMERIC_TYPE 1 PG_INTEGER_TYPE 1 DEFAULT_NUMERIC numeric ENABLE_MICROSECOND 1 TO_NUMBER_CONVERSION numeric VARCHAR_TO_TEXT 1 FORCE_IDENTITY_BIGINT 1 TO_CHAR_NOTIMEZONE 1

GRANT SECTION (Control priviledge and owner export)

GEN_USER_PWD 0 FORCE_OWNER 0 FORCE_SECURITY_INVOKER 0

DATA SECTION (Control data export behaviors)

I've also tried with the configuration mentioned in the benchmark (DATA_LIMIT=100, LONGREADLEN=170Mb)

DATA_LIMIT 1000 -- tried with different numbers starting from 100 BLOB_LIMIT 1000 -- tried with different numbers starting from 100 CLOB_AS_BLOB 1 NOESCAPE 0 TRANSACTION committed STANDARD_CONFORMING_STRINGS 1 LONGREADLEN 1024Mb -- tried with different numbers starting from 170Mb USE_LOB_LOCATOR 0 LOB_CHUNK_SIZE 128Mb -- tried with different numbers starting from 5 Mb XML_PRETTY 0 LOG_ON_ERROR 1 TRIM_TYPE BOTH INTERNAL_DATE_MAX 49 FUNCTION_CHECK 1 ENABLE_BLOB_EXPORT 1 ENABLE_CLOB_EXPORT 1 DATA_EXPORT_ORDER name PSQL_RELATIVE_PATH 0 DATA_VALIDATION_ROWS 1000 DATA_VALIDATION_ORDERING 1 DATA_VALIDATION_ERROR 1

PERFORMANCES SECTION (Control export/import performances)

JOBS 5 -- Disbled as it's for data import, and the slowness is with data export. ORACLE_COPIES 4 DEFINED_PKEY myTable:ID DEFAULT_PARALLELISM_DEGREE 10 -- tried with different numbers starting from 0 PARALLEL_MIN_ROWS 50 DROP_INDEXES 1 SYNCHRONOUS_COMMIT 0

PLSQL SECTION (Control SQL and PL/SQL to PLPGSQL rewriting behaviors)

EXPORT_INVALID 0 PLSQL_PGSQL 0 NULL_EQUAL_EMPTY 0 EMPTY_LOB_NULL 1 PACKAGE_AS_SCHEMA 1 REWRITE_OUTER_JOIN 1 FUNCTION_STABLE 1 COMMENT_COMMIT_ROLLBACK 0 COMMENT_SAVEPOINT 0 USE_ORAFCE 0 AUTONOMOUS_TRANSACTION 1

ASSESSMENT SECTION

ESTIMATE_COST 0 COST_UNIT_VALUE 5 DUMP_AS_HTML 0 TOP_MAX 10 HUMAN_DAYS_LIMIT 5

POSTGRESQL FEATURE SECTION

PG_VERSION 15 BITMAP_AS_GIN 1 PG_BACKGROUND 0 PG_SUPPORTS_SUBSTR 1

SPATIAL SECTION (Control spatial geometry export)

AUTODETECT_SPATIAL_TYPE 1 CONVERT_SRID 1 DEFAULT_SRID 4326 GEOMETRY_EXTRACT_TYPE INTERNAL ST_SRID_FUNCTION ST_SRID ST_DIMENSION_FUNCTION ST_DIMENSION ST_ASBINARY_FUNCTION ST_ASBINARY ST_ASTEXT_FUNCTION ST_ASTEXT ST_GEOMETRYTYPE_FUNCTION ST_GEOMETRYTYPE FDW_IMPORT_SCHEMA ora2pg_fdw_import DROP_FOREIGN_SCHEMA 1

darold commented 1 month ago

Have you tried by enabling USE_LOB_LOCATOR? Also you are using ORACLE_COPY perhaps Ora2Pg is not able to find a unique key to split the data download from this table, can you run ora2Pg with the -d option and post the query used to download the data that will be printed?

DonShamsan commented 1 month ago

Have you tried by enabling USE_LOB_LOCATOR? Also you are using ORACLE_COPY perhaps Ora2Pg is not able to find a unique key to split the data download from this table, can you run ora2Pg with the -d option and post the query used to download the data that will be printed?

Thank you very much for the fast response!

Note the field is CLOB

From debug log

darold commented 1 month ago

I think you are not using Ora2Pg parallelism. What is the data type of column ID in Oracle?

darold commented 1 month ago

Are you running Ora2Pg on a Windows box?

DonShamsan commented 1 month ago

VM OS: Linux (RHEL 9.3).

What could be the wrong values in ora2pg.conf that prevent the use of oracle parallelism?

darold commented 1 month ago

the data type of your ID column

DonShamsan commented 1 month ago

Data type: NUMBER(20,0)