schemacrawler / SchemaCrawler

Free database schema discovery and comprehension tool
http://www.schemacrawler.com/
Other
1.62k stars 200 forks source link

Slow performance while retrieving foreign keys #10

Closed patric-r closed 9 years ago

patric-r commented 9 years ago

Hi,

SchemaCrawler seems to have a performance issue when retrieving foreign keys.

On our databases (oracle 11g), schemacrawler needs more than two hours to dump all tables on a very small database (650 MB in total; however many database objects exist):

select sum(BYTES/1024/1024) as TOTAL_MB from user_segments;
--returns 655 

select count(1) from all_cons_columns;
--returns 8633

select count(1) from all_constraints;
--returns 8541

select count(1) from tab;
--returns 1487

The following stacktrace shows where the most time is spent:

  java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at oracle.net.ns.Packet.receive(Packet.java:308)
        at oracle.net.ns.DataPacket.receive(DataPacket.java:106)
        at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:324)
        at oracle.net.ns.NetInputStream.read(NetInputStream.java:268)
        at oracle.net.ns.NetInputStream.read(NetInputStream.java:190)
        at oracle.net.ns.NetInputStream.read(NetInputStream.java:107)
        at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:124)
        at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:80)
        at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1137)
        at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:350)
        at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
        at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
        at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:208)
        at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:886)
        at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175)
        at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296)
        at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3613)
        at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3657)
        - locked <0x0000000706fbe278> (a oracle.jdbc.driver.T4CConnection)
        at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1495)
        at oracle.jdbc.OracleDatabaseMetaData.keys_query(OracleDatabaseMetaData.java:3731)
        at oracle.jdbc.OracleDatabaseMetaData.getExportedKeys(OracleDatabaseMetaData.java:3880)
        at schemacrawler.crawl.TableRetriever.retrieveForeignKeys(TableRetriever.java:173)
        at schemacrawler.crawl.SchemaCrawler.crawlTables(SchemaCrawler.java:443)
        at schemacrawler.crawl.SchemaCrawler.crawl(SchemaCrawler.java:553)
        at schemacrawler.tools.executable.BaseStagedExecutable.execute(BaseStagedExecutable.java:69)
        at schemacrawler.tools.commandline.SchemaCrawlerCommandLine.execute(SchemaCrawlerCommandLine.java:121)
        at schemacrawler.Main.main(Main.java:87)

The cause seems to be the "select all foreign keys"-query, which in my case takes 5 seconds to return. Because ~1500 tables need to be crawled, the select seems to be executed 1500 times so this will take approx. 1500 * 5 seconds = ~2 hours to complete.

A possible solution would be executing the "select all foreign keys"-query only once for all tables - not for each table.

schemacrawler commented 9 years ago

Interesting, since SchemaCrawler makes a JDBC call to retrieve foreign keys only for a given table at a time, not for all tables all the time. Would you be able to send me (sualeh@hotmail.com) a detailed log, using an additional command-line argument, -loglevel=all, so that I can see the Oracle trace? Thanks.

patric-r commented 9 years ago

I think you probably got me wrong: Yes, indeed schemacrawler seems to retrieve foreign keys only for a given table at a time. It does not fetch all foreign keys for each TableRetriever.retrieveForeignKeys() call.

But it's still an issue because TableRetriever.retrieveForeignKeys() takes 5 seconds to return in my case and because it's called 1500 times, foreign key retrieval takes 2 hours in total.

(FYI: I'm using the last java 7-enabled release of schemacrawler 12.06.03)

schemacrawler commented 9 years ago

Thanks, I will take a look. Just confirming: you do need all 1,500 tables in the output, right?

patric-r commented 9 years ago

Yes, that's right. The issue does also concern me if I access schemacrawler programmatically. Even if I only fetch ~100 tables, schemacrawler would need ~10 minutes to return.

Thanks for having a look.

patric-r commented 9 years ago

One additional remark: Most of the 1500 tables are empty in my case. The actual data dump finishes almost instantly. Only the foreign key definitions causes schemacrawler to slow down in magnitudes.

I'm using schemacrawler since version 6.4 and this has been an issue till the/my beginning, however seems to got worse with the recent major releases.

Nevertheless I'm very thankful for your tool. It is a great help.

schemacrawler commented 9 years ago

I will certainly take a look at it. I think in some earlier releases, I was doing exactly that - retrieving all foreign keys all at once. However, I plan to make fixes, if any, only on the latest version of SchemaCrawler.

schemacrawler commented 9 years ago

Patric, please try SchemaCrawler 14.03.03, and see if it is any faster for you. Please let me know. Thanks.

schemacrawler commented 9 years ago

Patric, is SchemaCrawler 14.03.03 working for you?

patric-r commented 9 years ago

@sualeh I'm sorry for the delay.

Using 14.03.03, things greatly improved: The time required for dumping all data of the above mentioned tiny schema reduced from more than 2 hours to ~5 minutes.

Foreign key retrieval is no longer a performance penalty. Thank you so much for your effort!

schemacrawler commented 9 years ago

Awesome! You can probably get some benefits by tuning the database too, such as updating metadata statistics.

patric-r commented 9 years ago

@sualeh While at the database on which I first saw this issue, schemacrawler still performs perfectly, I now have a similar issue on another database, which is more or less a clone of the original database. In this database, schemacrawler 14.03.03 still seems to suffer from a performance issue with foreign key retrieval, as the following stacktrace shows in which the tool seems to be stuck :

   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:170)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at oracle.net.ns.Packet.receive(Packet.java:308)
        at oracle.net.ns.DataPacket.receive(DataPacket.java:106)
        at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:324)
        at oracle.net.ns.NetInputStream.read(NetInputStream.java:268)
        at oracle.net.ns.NetInputStream.read(NetInputStream.java:190)
        at oracle.net.ns.NetInputStream.read(NetInputStream.java:107)
        at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:124)
        at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:80)
        at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1137)
        at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:350)
        at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
        at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
        at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:208)
        at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:886)
        at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175)
        at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296)
        at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3613)
        at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3657)
        - locked <0x00000006c6e27e68> (a oracle.jdbc.driver.T4CConnection)
        at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1495)
        at oracle.jdbc.OracleDatabaseMetaData.keys_query(OracleDatabaseMetaData.java:3731)
        at oracle.jdbc.OracleDatabaseMetaData.getExportedKeys(OracleDatabaseMetaData.java:3880)
        at schemacrawler.crawl.ForeignKeyRetriever.retrieveForeignKeysUsingDatabaseMetadata(ForeignKeyRetriever.java:258)
        at schemacrawler.crawl.ForeignKeyRetriever.retrieveForeignKeys(ForeignKeyRetriever.java:76)
        at schemacrawler.crawl.SchemaCrawler.crawlTables(SchemaCrawler.java:428)
        at schemacrawler.crawl.SchemaCrawler.crawl(SchemaCrawler.java:565)
        at schemacrawler.tools.executable.BaseStagedExecutable.execute(BaseStagedExecutable.java:75)
        at schemacrawler.tools.commandline.SchemaCrawlerCommandLine.execute(SchemaCrawlerCommandLine.java:127)
        at schemacrawler.Main.main(Main.java:88)

I'm not sure if this is a problem with schemacrawler at all (because since your fix it does not happen on all database), but maybe you can have another look.

Thanks so much for your effort.

schemacrawler commented 9 years ago

Patric, please make sure that you have the SchemaCrawler jar for Oracle support on your classpath. You can see the classpath that SchemaCrawler uses by turning on logging - use -loglevel=INFO as an additional command-line switch. Please email me directly at sualeh@hotmail.com