Segfault-Inc / Multicorn

Data Access Library
https://multicorn.org/
PostgreSQL License

FilesystemFdw: Enhance pattern matching and timestamp support #205

Open rvernica opened 6 years ago

rvernica commented 6 years ago

Highlights:

This PR adds four more options to FilesystemFdw. The example below exercises `escape_pattern`, `ignore_case`, `mtime_column`, and `ctime_column`.

With this PR, the following example is supported:

> ls -R1 f
f:
taz

f/taz:
a_b.jpeg
a_b.JPEG
a-b.jpg
a_b.png
a b.PNG

CREATE FOREIGN TABLE foo (
    filename VARCHAR,
    mtime TIMESTAMP,
    ctime TIMESTAMP,
    foo VARCHAR,
    bar VARCHAR,
    taz VARCHAR
) SERVER filesystem_srv OPTIONS (
    root_dir        '/f',
    pattern         '{taz}/{foo}[ _-]{bar}\.(jpe?g|png)',
    escape_pattern  'FALSE',
    ignore_case     'TRUE',
    filename_column 'filename',
    mtime_column    'mtime',
    ctime_column    'ctime');

SELECT * FROM foo;
 filename |        mtime        |        ctime        | foo | bar | taz 
----------+---------------------+---------------------+-----+-----+-----
 a_b.jpeg | 2018-04-19 18:37:06 | 2018-04-20 19:20:12 | a   | b   | taz
 a_b.png  | 2018-04-19 18:37:04 | 2018-04-20 19:20:12 | a   | b   | taz
 a-b.jpg  | 2018-04-19 18:37:30 | 2018-04-20 19:20:12 | a   | b   | taz
 a_b.JPEG | 2018-04-19 18:37:09 | 2018-04-20 19:20:12 | a   | b   | taz
 a b.PNG  | 2018-04-19 18:37:15 | 2018-04-20 19:20:12 | a   | b   | taz
(5 rows)

Notice the regular expression used for the `pattern` option, which allows the `foo` and `bar` tokens to be separated by a space, `_`, or `-`. Also, the file extension can be `.jpg`, `.jpeg`, or `.png`, matched case-insensitively.
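To make the matching behaviour concrete, here is a minimal Python sketch of how a pattern with `{name}` placeholders could be turned into a regular expression. It is not the code from this PR: `compile_pattern` is a hypothetical helper, and the choice of `[^/]+` for each placeholder is an assumption made for illustration. The `escape_pattern` and `ignore_case` arguments mirror the options of the same name in the example above.

```python
import re


def compile_pattern(pattern, escape_pattern=True, ignore_case=False):
    """Hypothetical helper, not Multicorn's implementation.

    Turns '{name}' placeholders into named regex groups. The literal text
    between placeholders is escaped unless escape_pattern is False, in
    which case it is passed through as raw regular-expression syntax.
    """
    # re.split with a capturing group yields: literal, name, literal, name, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Assumption: a placeholder matches one path component.
            pieces.append("(?P<%s>[^/]+)" % part)
        else:
            pieces.append(re.escape(part) if escape_pattern else part)
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile("".join(pieces), flags)


regex = compile_pattern(
    r"{taz}/{foo}[ _-]{bar}\.(jpe?g|png)",
    escape_pattern=False,
    ignore_case=True,
)

for path in ["taz/a_b.jpeg", "taz/a-b.jpg", "taz/a b.PNG", "taz/a_b.gif"]:
    match = regex.fullmatch(path)
    # groupdict() holds only the named groups, i.e. the column values.
    print(path, "->", match.groupdict() if match else None)
```

With `escape_pattern=True`, the same pattern string would be treated as literal text, so `[ _-]` would only match those five characters verbatim and none of the files in the example would match.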

Fix for #203