Closed luna02100 closed 7 months ago
Could you please create a reprex for this? See https://dbplyr.tidyverse.org/articles/reprex.html for more information how to do this.
And could you check whether this still happens with dbplyr 2.4.0 or even better if you check with the dev version: devtools::install_github("tidyverse/dbplyr")
. Thanks 😄
Alright.
library(dplyr)
library(dbplyr)
flights_lazy <- lazy_frame(nycflights13::flights,con = simulate_odbc())
weather_lazy <- lazy_frame(nycflights13::weather,con = simulate_odbc())
After I uninstall dbplyr
and reinstall it via devtools::install_github("tidyverse/dbplyr")
:
flights_query_lazy <- flights_lazy %>%
left_join(weather_lazy)
flights_query_lazy %>% show_query()
Output:
<SQL>
SELECT
`df_LHS`.*,
`temp`,
`dewp`,
`humid`,
`wind_dir`,
`wind_speed`,
`wind_gust`,
`precip`,
`pressure`,
`visib`
FROM `df` AS `df_LHS`
LEFT JOIN `df` AS `df_RHS`
ON (
`df_LHS`.`year` = `df_RHS`.`year` AND
`df_LHS`.`month` = `df_RHS`.`month` AND
`df_LHS`.`day` = `df_RHS`.`day` AND
`df_LHS`.`origin` = `df_RHS`.`origin` AND
`df_LHS`.`hour` = `df_RHS`.`hour` AND
`df_LHS`.`time_hour` = `df_RHS`.`time_hour`
)
And could you check whether this still happens with dbplyr 2.4.0 or even better if you check with the dev version:
devtools::install_github("tidyverse/dbplyr")
. Thanks 😄
flights_query <- tbl(sample_con, "flights") %>%
left_join(tbl(sample_con, "weather"))
flights_query %>% show_query()
SELECT
`flights`.*,
`row_names`,
`temp`,
`dewp`,
`humid`,
`wind_dir`,
`wind_speed`,
`wind_gust`,
`precip`,
`pressure`,
`visib`
FROM `flights`
LEFT JOIN `weather`
ON (
`flights`.`year` = `weather`.`year` AND
`flights`.`month` = `weather`.`month` AND
`flights`.`day` = `weather`.`day` AND
`flights`.`origin` = `weather`.`origin` AND
`flights`.`hour` = `weather`.`hour` AND
`flights`.`time_hour` = `weather`.`time_hour`
)
As you can see there is no row_names
column.
The second part is not a reprex as an important part is missing: how was the data written to the database. I am pretty sure you unintentionally created a row_names
column when writing the data to the database. You could check that e.g. with DBI::dbListFields(sample_con, "weather")
.
Yeah, I see it now. After I copy the weather
dataset from nycflights13
to my SQL database connection, the row_names
column was spontaneously created. So I thought it might be a bug because when I overwrite the weather
data frame, the row_names
column was now clear.
I am using
flights
andweather
data frames from thenycflights13
and I store them in my SQL database (sample_con
). Where does thisrow_names
came from?