apache / drill

Apache Drill is a distributed MPP query layer for self describing data
https://drill.apache.org/
Apache License 2.0

Unable to access files with colon in their name #2791

Open phraktle opened 1 year ago

phraktle commented 1 year ago

Describe the bug

When accessing files on the local filesystem with : in their name, a java.net.URISyntaxException is thrown.

To Reproduce

apache drill> SELECT * FROM dfs.`/foo/meta_2022-12-19_16:00:00.csv`;
Error: VALIDATION ERROR: java.net.URISyntaxException: Relative path in absolute URI: .meta_2022-12-19_16:00:00.csv.crc

Expected behavior

Should work on local filesystems where : is valid in filenames.

Drill version

1.21.0

kclonts commented 1 year ago

1.21.1 and 1.20.3 both had the issue for me as well on a Linux host; I tried both Java 8 and Java 11.

Here's a stack trace from the logs:

[Error Id: a0132c1c-fef9-435a-90b0-0acc12d076c6 ]
    at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
    at org.apache.drill.exec.planner.sql.conversion.SqlConverter.validate(SqlConverter.java:197)
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:647)
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:195)
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:169)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:283)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:163)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:128)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:93)
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:593)
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:274)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: .sr:match:41481731.json.crc
    at org.apache.hadoop.fs.Path.initialize(Path.java:263)
    at org.apache.hadoop.fs.Path.<init>(Path.java:221)
    at org.apache.hadoop.fs.Path.<init>(Path.java:129)
    at org.apache.hadoop.fs.ChecksumFileSystem.getChecksumFile(ChecksumFileSystem.java:96)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:151)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:349)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:906)
    at org.apache.drill.exec.store.dfs.DrillFileSystem.open(DrillFileSystem.java:152)
    at org.apache.drill.exec.store.dfs.BasicFormatMatcher$MagicStringMatcher.matches(BasicFormatMatcher.java:145)
    at org.apache.drill.exec.store.dfs.BasicFormatMatcher.isFileReadable(BasicFormatMatcher.java:110)
    at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isDirReadable(ParquetFormatPlugin.java:409)
    at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isReadable(ParquetFormatPlugin.java:357)
    at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema$FileSelectionInspector.matchFormat(WorkspaceSchemaFactory.java:867)
    at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:605)
    at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:295)
    at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96)
    at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90)
    at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:452)
    at org.apache.drill.exec.store.dfs.FileSystemSchemaFactory$FileSystemSchema.getTable(FileSystemSchemaFactory.java:127)
    at org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:83)
    at org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:289)
    at org.apache.calcite.sql.validate.EmptyScope.resolve_(EmptyScope.java:143)
    at org.apache.calcite.sql.validate.EmptyScope.resolveTable(EmptyScope.java:99)
    at org.apache.calcite.sql.validate.DelegatingScope.resolveTable(DelegatingScope.java:203)
    at org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl(IdentifierNamespace.java:105)
    at org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:177)
    at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3129)
    at org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:80)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3111)
    at org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:80)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3383)
    at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
    at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969)
    at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:216)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:944)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:651)
    at org.apache.drill.exec.planner.sql.conversion.SqlConverter.validate(SqlConverter.java:190)
    ... 12 common frames omitted
Caused by: java.net.URISyntaxException: Relative path in absolute URI: .example:example:example.json.crc
    at java.base/java.net.URI.checkPath(URI.java:1998)
    at java.base/java.net.URI.<init>(URI.java:780)
    at org.apache.hadoop.fs.Path.initialize(Path.java:260)
    ... 53 common frames omitted
2023-08-13 21:02:09,711 [1b26bb2d-a236-be2d-223e-0b0862960c6a:foreman] INFO  o.a.d.e.p.s.conversion.SqlConverter - User Error Occurred: java.net.URISyntaxException: Relative path in absolute URI: .example:example:example.json.crc (java.net.URISyntaxException: Relative path in absolute URI: .example:example:example.json.crc)
org.apache.drill.common.exceptions.UserException: VALIDATION ERROR: java.net.URISyntaxException: Relative path in absolute URI: .example:example:example.json.crc
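
The innermost frames of the trace (ChecksumFileSystem.getChecksumFile, then Path.initialize, then URI.checkPath) suggest the failure comes from the relative checksum file name, not from the data file path itself. The following is a minimal sketch that reproduces the same exception reason using only java.net.URI. The class ColonPathDemo and the helper failureReason are hypothetical names, and the colon-splitting rule is an assumption about how Hadoop's Path treats a relative name such as .meta_2022-12-19_16:00:00.csv.crc; it is not copied from the Hadoop source.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; not part of Drill or Hadoop.
final class ColonPathDemo {

    // Returns the URISyntaxException reason for the given path string,
    // or null if a URI can be built from it.
    // Assumption: a ':' that appears before any '/' is treated as the end
    // of a URI scheme, which is what the stack trace appears to show.
    static String failureReason(String pathString) {
        String scheme = null;
        String path = pathString;
        int colon = pathString.indexOf(':');
        int slash = pathString.indexOf('/');
        if (colon != -1 && (slash == -1 || colon < slash)) {
            scheme = pathString.substring(0, colon);
            path = pathString.substring(colon + 1);
        }
        try {
            // With a non-null scheme, java.net.URI requires the path to start with '/'.
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        // Relative checksum-style name containing ':' (taken from the report above).
        System.out.println(failureReason(".meta_2022-12-19_16:00:00.csv.crc"));
        // Absolute path: the first '/' comes before the ':', so no scheme is split off.
        System.out.println(failureReason("/foo/meta_2022-12-19_16:00:00.csv"));
    }
}
```

If that assumption holds, it would explain why the error message names the .crc file and not the file in the query: the absolute data path parses cleanly, while the relative checksum name is split at its first colon and then rejected.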