Closed shantstepanian closed 6 years ago
I tested against a large DB2 schema, and found that MetadataRetrievalStrategy.metadata_all
was faster. However, I can see that MetadataRetrievalStrategy.metadata
would be faster for subsets of tables. I was under the impression that this can be changed using the config file. I will have to see if it is possible to override using the config file or code. Let me write some test code for that.
It would probably be best to make this overrideable via code, as the right value to use depends on whether you are searching for all tables or just a subset. We ran into this issue for Oracle as well (using the data_dictionary MetadataRetrievalStrategy was slower for a small subset of tables on a large database).
As of now, we are fine as we are overriding the value appropriately when doing a query for a single table vs all tables, though having it done via the API would be helpful.
I have added a unit test to confirm that these values can be overridden in the config file, and also updated the default config files with examples.
Environment:
SchemaCrawler version 14.16.04-java7 on Java 7 Running on Linux against DB2 10.5 and JDBC driver version 3.4.65 - 9.1
Please explain the issue briefly
Crawl for DB2 table metadata for a single table takes a long time as we are querying for all columns no matter what.
It is going via the TableColumnRetriever.retrieveColumnsFromMetadataForAllTables as we specify metadata_all for the default DB2 value (see here, this looks to be the only implementation using metadata_all).
My query is using the table inclusion rule for tables, so using metadata should be fine. In my local testing, I can just set the value to MetadataRetrievalStrategy.metadata, and it works fine.
Any reason why metadata_all was specified for DB2 in this case? I can work around this locally and submit a fix later, but I'd like to ensure that I didn't miss anything obvious