Open rad-pat opened 1 month ago
So it seems that including a trailing slash on the end of the path makes it behave correctly. I can include the slash, but since it always exports one or many parquet files to the location, should it not be assumed that the location is always a path, or at least that /tables/ is the path and t1 is the file (?? for the one or many files)?
Works correctly:
CREATE table t1 (c1 int null);
INSERT INTO t1 values (1), (2), (3);
COPY INTO 'gcs://bucket/tables/t1/'
CONNECTION = (
CREDENTIAL = '<snip>'
)
FROM default.t1
FILE_FORMAT = (TYPE = PARQUET);
@rad-pat thank you. It is a bug.
@youngsofun, I presume this is fixed now with #16321?
Was this affecting internal storage if GCS is used, or would that have remained unaffected?
It should be fixed now, please give it a try.
Yes, it seems fixed for COPY INTO, thanks. I just wondered if there was any effect on the parquet files stored by the system whilst this bug was happening?
The behavior of the bug is as follows:
If your location string does not end with a /, copying into bucket/<path> will result in bucket/<path>/<path>/<file_name_containing_uuid> instead of bucket/<path>/<file_name_containing_uuid>.
While it's unfortunate to make this mistake, I don't think it's a major issue in practice, especially if you are only using it for unloading. The additional <path>/ can be considered part of the randomly generated path created by Databend.
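For anyone still on an affected version, here is a minimal Python sketch of the behavior described above, together with a client-side workaround. It is only an illustration and not Databend's actual code: the function names `ensure_trailing_slash` and `described_object_key` are hypothetical, and the second function just models the key layout from the description.

```python
def ensure_trailing_slash(location: str) -> str:
    """Workaround: return the location with a trailing slash, adding one if missing."""
    return location if location.endswith("/") else location + "/"


def described_object_key(path: str, file_name: str, buggy: bool) -> str:
    """Model the object key layout described in this thread (illustrative only)."""
    has_slash = path.endswith("/")
    prefix = path.strip("/")
    if buggy and not has_slash:
        # Affected versions: the path segment shows up twice in the key.
        return f"{prefix}/{prefix}/{file_name}"
    # Expected layout: the path once, followed by the generated file name.
    return f"{prefix}/{file_name}"
```

Passing the location through `ensure_trailing_slash` before building the COPY INTO statement should give the expected layout on both affected and fixed versions, since a location that already ends with / is returned unchanged.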
Version
v1.2.618-nightly
What's Wrong?
When issuing a COPY INTO command for GCS, the resulting path in GCS is duplicated.
How to Reproduce?
Look in GCS and see that the path is bucket/tables/t1/tables/t1.