Open eitsupi opened 7 months ago
Are you proposing to make available all data-frame-like things for accessing them by name? From what environment -- the current environment plus all parent (enclosing) environments?
How would we control this feature? I suspect just enabling it might lead to surprises for existing code.
What about name clashes? Which object gets priority if a table by that name exists already?
> What about name clashes? Which object gets priority if a table by that name exists already?
I forget where I read this, but I believe DuckDB can already resolve a table name that does not exist in the database by looking for it elsewhere and using what it finds.
For example, the behavior already differs depending on whether the table test.csv exists, as shown below.
(Needless to say, if the table test.csv does not exist, DuckDB looks for a CSV file named test.csv and uses it as a virtual table.)
data.frame(bar = 2) |>
write.csv("test.csv", row.names = FALSE)
duckdb:::sql('CREATE SCHEMA "test"; CREATE TABLE "test.csv" AS SELECT 1 AS foo; FROM "test.csv"')
#> foo
#> 1 1
Created on 2024-04-24 with reprex v2.0.2
data.frame(bar = 2) |>
write.csv("test.csv", row.names = FALSE)
duckdb:::sql('FROM "test.csv"')
#> bar
#> 1 2
> How would we control this feature? I suspect just enabling it might lead to surprises for existing code.
Given that this is already in use in the Python API, this is hardly a problem.
> From what environment -- the current environment plus all parent (enclosing) environments?
Perhaps an additional argument is needed to specify the environment.
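To make the environment question concrete, here is a minimal base R sketch of what such a lookup could look like. The helper `find_data_frame()` and its `envir` argument are hypothetical names used only for illustration; they are not part of duckdb, and the actual implementation may resolve names differently.

```r
# Hypothetical helper (not part of duckdb): find a data frame called `name`,
# starting in `envir` and walking up through its enclosing environments.
find_data_frame <- function(name, envir = parent.frame()) {
  while (!identical(envir, emptyenv())) {
    # Look only in this environment; we do the walking ourselves so that
    # non-data-frame objects with the same name can be skipped.
    obj <- get0(name, envir = envir, inherits = FALSE)
    if (is.data.frame(obj)) {
      return(obj)
    }
    envir <- parent.env(envir)
  }
  NULL  # nothing suitable found
}

df <- data.frame(x = 1:3)

f <- function() {
  df <- "not a data frame"  # shadows the outer name, but is skipped
  find_data_frame("df")
}

f()
```

With an explicit `envir` argument the caller decides where the search starts, which is one way to keep the feature predictable for existing code.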
Indeed, this should be straightforward: the R package can define a so-called replacement scan for this.
I looked into this; regrettably, it's not that straightforward for arrow. So for now, let's just do this for data.frames.
First stab is here: https://github.com/duckdb/duckdb-r/pull/164
From duckdb/duckdb#6771
It is convenient in the Python client to specify the target of a query without having to register pandas.DataFrame, etc., so it would be nice to have the same functionality in R.
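For contrast, this is roughly what the explicit registration step looks like in R today. It is a sketch that assumes the DBI and duckdb packages are installed; the data frame and the view name `df` are made up for illustration.

```r
library(DBI)

# In-memory DuckDB database
con <- dbConnect(duckdb::duckdb())

df <- data.frame(foo = 1)

# The data frame has to be registered under a name before SQL can see it.
duckdb::duckdb_register(con, "df", df)

res <- dbGetQuery(con, "SELECT foo FROM df")

dbDisconnect(con, shutdown = TRUE)
```

The request is to make the `duckdb_register()` call unnecessary, so that `FROM df` can pick up the data frame directly.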