schemacrawler / SchemaCrawler

Free database schema discovery and comprehension tool
http://www.schemacrawler.com/
Other
1.58k stars 200 forks source link

MixedCharacters in TableName results in Pattern match failure #1545

Closed talha31093 closed 1 month ago

talha31093 commented 2 months ago

Description

Table Not Found because of extra quotes in cases where table name has mixed case Keep schema as null Create a table with mixed case like EngagementsTasks Now try to fetch the table with filter "EngagementsTasks" it will not return anything

It is similar to these issues - https://github.com/schemacrawler/SchemaCrawler/issues/1053#issuecomment-1551716919 As it turns out, mixing lowercase and uppercase characters also triggers the quotes. This is exactly the problem. https://github.com/schemacrawler/SchemaCrawler/issues/1130 https://github.com/schemacrawler/SchemaCrawler/issues/1460

How to Reproduce

final Catalog catalog = SchemaCrawlerUtility.getCatalog(scDatabaseConnectionSource,
schemaCrawlerOptionsBuilder("EngagementsTasks"));
private static SchemaCrawlerOptions schemaCrawlerOptionsBuilder(String tableNamePatternForPage) {
LimitOptions limitOptions = schemaCrawlerLimitOptionsBuilder(tableNamePatternForPage).toOptions();
LoadOptions loadOptions = schemaCrawlerLoadOptionsBuilder().toOptions();
return SchemaCrawlerOptionsBuilder.newSchemaCrawlerOptions().withLimitOptions(limitOptions).withLoadOptions(
loadOptions);
}
private static LimitOptionsBuilder schemaCrawlerLimitOptionsBuilder(String tableNamePatternForPage) {
final LimitOptionsBuilder limitOptionsBuilder = LimitOptionsBuilder.builder();
limitOptionsBuilder.tableTypes(Arrays.asList("TABLE"));
if (tableNamePatternForPage != null) {
limitOptionsBuilder.includeTables(Pattern.compile(tableNamePatternForPage, Pattern.CASE_INSENSITIVE));
}
}
private static LoadOptionsBuilder schemaCrawlerLoadOptionsBuilder() {
return LoadOptionsBuilder.builder().withSchemaInfoLevel(getSchemaInfoLevel());
}
private static SchemaInfoLevel getSchemaInfoLevel() {
System.out.println("IS_FIELD_LIST_REQUESTED: " + true);
return SchemaInfoLevelBuilder.builder()
.fromOptions(SchemaInfoLevelBuilder.minimum())
.setRetrieveTables(true)
.setRetrieveViewInformation(true)
.setRetrieveTableColumns(true)
.setRetrieveColumnDataTypes(true)
.setRetrieveDatabaseInfo(true)

  .setRetrieveAdditionalColumnAttributes(true)
  .setRetrieveAdditionalColumnMetadata(true)
  .setRetrieveAdditionalTableAttributes(true)

  .setRetrieveAdditionalDatabaseInfo(false)
  .setRetrieveDatabaseUsers(false)
  .setRetrievePrimaryKeys(true)
  .setRetrieveForeignKeys(false)
  .setRetrieveIndexes(false) //get indexes along with FK
  .setRetrieveIndexInformation(false)
  .setRetrieveRoutineInformation(false)
  .setRetrieveRoutines(false)
  .setRetrieveRoutineParameters(false)
  .setRetrieveSequenceInformation(false)
  .setRetrieveServerInfo(false)
  .setRetrieveSynonymInformation(false)
  .setRetrieveTableColumnPrivileges(false)
  .setRetrieveTableConstraintDefinitions(false)
  .setRetrieveTableConstraintInformation(false)
  .setRetrieveTablePrivileges(false)
  .setRetrieveTriggerInformation(false)
  .setRetrieveUserDefinedColumnDataTypes(false)
  .toOptions();

Just doing this - new MutableTable(schema, tableName) it returns ""EngagementsTasks"" instead of "EngagementsTasks" and hence, the filter condition tableFilter.test(table) fails when the inclusion rule is set using limitOptionsBuilder.includeTables(Pattern.compile("EngagementsTasks", Pattern.CASE_INSENSITIVE));

I'm using the default schema retrieval options with Identifiers STANDARD

Workarounds that works are:

  1. Hardcode quoteMixedCaseIdentifiers as false
    
    Identifiers(final IdentifiersBuilder builder) {
    if (builder.isIdentifierQuoteStringSet()) {
      identifierQuoteString = builder.identifierQuoteString;
    } else {
      // SQL standard and JDBC default is double quotes
      identifierQuoteString = "\"";
    }
    identifierQuotingStrategy = builder.identifierQuotingStrategy;
    quoteMixedCaseIdentifiers = false;
    reservedWords = builder.reservedWords;
    }
2. Use this inclusion rule : 

limitOptionsBuilder.includeTables(Pattern.compile(.tableNamePattern("\"EnagementsTasks\""), Pattern.CASE_INSENSITIVE));

### Relevant log output

```shell
May 01, 2024 5:14:32 PM schemacrawler.crawl.SchemaCrawler crawl
INFO: 
-- generated by: SchemaCrawler 16.21.2
-- generated on: 2024-05-01T11:44:31.775646
-- database: CData JDBC Driver for HubSpot 2023 23.0.8770
-- driver: CData JDBC Driver for HubSpot 2023 23.0.8770.0
-- operating system: Mac OS X 13.2.1
-- JVM system: Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 17.0.10+11-LTS-240

May 01, 2024 5:14:32 PM us.fatehi.utility.scheduler.TimedTask call
INFO: Running <retrieve_schemas> on thread <pool-1-thread-5>
May 01, 2024 5:14:32 PM schemacrawler.crawl.SchemaRetriever retrieveAllSchemas
INFO: Retrieving all schemas
May 01, 2024 5:14:32 PM schemacrawler.crawl.SchemaRetriever retrieveAllCatalogs
INFO: Retrieving all catalogs
May 01, 2024 5:14:32 PM us.fatehi.utility.scheduler.TimedTask call
INFO: Running <filter_and_sort_schemas> on thread <pool-1-thread-6>
May 01, 2024 5:14:32 PM schemacrawler.crawl.SchemaCrawler crawlSchemas
INFO: Retrieved 1 schemas
May 01, 2024 5:14:32 PM us.fatehi.utility.scheduler.TimedTask call
INFO: Running <retrieve_column_data_types> on thread <pool-1-thread-7>
May 01, 2024 5:14:32 PM schemacrawler.crawl.DataTypeRetriever retrieveSystemColumnDataTypes
INFO: Retrieving system column data types
May 01, 2024 5:14:32 PM schemacrawler.crawl.DataTypeRetriever retrieveSystemColumnDataTypesFromMetadata
INFO: Processed 16 system column data types
May 01, 2024 5:14:32 PM us.fatehi.utility.scheduler.TimedTask call
INFO: Running <retrieve_user_defined_column_data_types> on thread <pool-1-thread-8>
May 01, 2024 5:14:32 PM us.fatehi.utility.scheduler.TaskDefinition lambda$new$0
INFO: Not running task <retrieveUserDefinedColumnDataTypes>
May 01, 2024 5:14:32 PM us.fatehi.utility.scheduler.TimedTask call
INFO: Running <retrieve_tables> on thread <pool-1-thread-9>
May 01, 2024 5:14:32 PM schemacrawler.crawl.SchemaCrawler lambda$crawlTables$9
INFO: Retrieving table names
May 01, 2024 5:14:32 PM schemacrawler.crawl.TableRetriever retrieveTables
INFO: Retrieving tables
May 01, 2024 5:14:32 PM schemacrawler.crawl.TableRetriever retrieveTablesFromMetadata
INFO: Retrieving tables for schema <>
May 01, 2024 5:14:35 PM schemacrawler.crawl.TableRetriever retrieveTablesFromMetadata
INFO: Processed 86 tables
May 01, 2024 5:14:35 PM schemacrawler.crawl.SchemaCrawler crawlTables
INFO: Retrieved 0 tables
May 01, 2024 5:14:35 PM schemacrawler.crawl.SchemaCrawler crawlRoutines
INFO: Not retrieving routines, since this was not requested
May 01, 2024 5:14:35 PM schemacrawler.crawl.SchemaCrawler crawlSynonyms
INFO: Not retrieving synonyms, since this was not requested
May 01, 2024 5:14:35 PM schemacrawler.crawl.SchemaCrawler crawlSequences
INFO: Not retrieving sequences, since this was not requested
May 01, 2024 5:14:35 PM schemacrawler.crawl.RetrievalTaskRunner stopAndLogTime
INFO: Total time taken for <f48a914d-8d9f-45ad-a385-f06dc9922a75> - 00:00:02.689 hours
-  0.1% - 00:00:00.002 - <retrieve_additional_database_info>
-  0.0% - 00:00:00.000 - <retrieve_server_info>
-  0.0% - 00:00:00.000 - <retrieve_database_users>
-  0.1% - 00:00:00.002 - <retrieve_additional_jdbc_driver_info>
-  4.6% - 00:00:00.124 - <retrieve_schemas>
-  0.1% - 00:00:00.002 - <filter_and_sort_schemas>
-  8.8% - 00:00:00.236 - <retrieve_column_data_types>
-  0.0% - 00:00:00.000 - <retrieve_user_defined_column_data_types>
- 86.2% - 00:00:02.319 - <retrieve_tables>

SchemaCrawler Version

16.21.2

Java Version

17

Operating System and Version

MacOs Ventura 13.2.1

Relational Database System and Version

Hubspot

JDBC Driver and Version

Cdata

sualeh commented 1 month ago

@talha31093 I have taken your code, and I cannot reproduce the error. Please take a look at the project I have provided with your code - https://github.com/sualeh/schemacrawler-issue1460 Follow the instructions in the README.