Closed maelle closed 2 years ago
@krlmlr I wonder whether this is a new bug introduced by the code changes?
CI https://github.com/cynkra/dm/runs/6963247567?check_suite_focus=true
It seems the example uses either https://relational.fit.cvut.cz/dataset/Financial or a fallback, and until yesterday it was maybe using the fallback?
We're just seeing many more tables now.
Due to code changes.
We can now use the schema
argument to limit that, and we should.
(Maybe?)
Oh: relational.fit was down over the weekend, there is a fallback by @pachadotdev that only has this one table. Could that be the reason?
Either way, schema =
seems worth a shot.
Yes I think that over the weekend you were using the fallback.
Who should add the use of the schema argument in the vignette?
Can you please rephrase?
If the fix is this vignette tweak (using the schema argument), should you or I tweak the vignette? :-)
Related question: should that argument be different based on whether the example is the main one or the fallback one?
Please go ahead, I'm not touching this today or tomorrow.
Can we make the examples work the same for both main and fallback?
Noting that https://relational.fit.cvut.cz/dataset/Financial does not mention 2231 tables so something is wrong (investigating)
it's as if all table variables became one table
or actually am I getting all tables from all databases of the provider?
Example with the fallback database
library(RMariaDB)
library(dm)
#>
#> Attachement du package : 'dm'
#> L'objet suivant est masqué depuis 'package:stats':
#>
#> filter
fin_db <- dm:::dbedu_con()
fin_dm <- dm_from_con(fin_db)
#> Keys queried successfully, use `learn_keys = TRUE` to mute this message.
names(fin_dm)
#> [1] "accounts"
#> [2] "ALL_PLUGINS"
#> [3] "APPLICABLE_ROLES"
#> [4] "cards"
#> [5] "CHARACTER_SETS"
#> [6] "CHECK_CONSTRAINTS"
#> [7] "CLIENT_STATISTICS"
#> [8] "clients"
#> [9] "COLLATION_CHARACTER_SET_APPLICABILITY"
#> [10] "COLLATIONS"
#> [11] "COLUMN_PRIVILEGES"
#> [12] "COLUMNS"
#> [13] "disps"
#> [14] "districts"
#> [15] "ENABLED_ROLES"
#> [16] "ENGINES"
#> [17] "EVENTS"
#> [18] "FILES"
#> [19] "GEOMETRY_COLUMNS"
#> [20] "GLOBAL_STATUS"
#> [21] "GLOBAL_VARIABLES"
#> [22] "INDEX_STATISTICS"
#> [23] "INNODB_BUFFER_PAGE"
#> [24] "INNODB_BUFFER_PAGE_LRU"
#> [25] "INNODB_BUFFER_POOL_STATS"
#> [26] "INNODB_CMP"
#> [27] "INNODB_CMP_PER_INDEX"
#> [28] "INNODB_CMP_PER_INDEX_RESET"
#> [29] "INNODB_CMP_RESET"
#> [30] "INNODB_CMPMEM"
#> [31] "INNODB_CMPMEM_RESET"
#> [32] "INNODB_FT_BEING_DELETED"
#> [33] "INNODB_FT_CONFIG"
#> [34] "INNODB_FT_DEFAULT_STOPWORD"
#> [35] "INNODB_FT_DELETED"
#> [36] "INNODB_FT_INDEX_CACHE"
#> [37] "INNODB_FT_INDEX_TABLE"
#> [38] "INNODB_LOCK_WAITS"
#> [39] "INNODB_LOCKS"
#> [40] "INNODB_METRICS"
#> [41] "INNODB_MUTEXES"
#> [42] "INNODB_SYS_COLUMNS"
#> [43] "INNODB_SYS_DATAFILES"
#> [44] "INNODB_SYS_FIELDS"
#> [45] "INNODB_SYS_FOREIGN"
#> [46] "INNODB_SYS_FOREIGN_COLS"
#> [47] "INNODB_SYS_INDEXES"
#> [48] "INNODB_SYS_SEMAPHORE_WAITS"
#> [49] "INNODB_SYS_TABLES"
#> [50] "INNODB_SYS_TABLESPACES"
#> [51] "INNODB_SYS_TABLESTATS"
#> [52] "INNODB_SYS_VIRTUAL"
#> [53] "INNODB_TABLESPACES_ENCRYPTION"
#> [54] "INNODB_TABLESPACES_SCRUBBING"
#> [55] "INNODB_TRX"
#> [56] "KEY_CACHES"
#> [57] "KEY_COLUMN_USAGE"
#> [58] "KEYWORDS"
#> [59] "loans"
#> [60] "orders"
#> [61] "PARAMETERS"
#> [62] "PARTITIONS"
#> [63] "PLUGINS"
#> [64] "PROCESSLIST"
#> [65] "PROFILING"
#> [66] "REFERENTIAL_CONSTRAINTS"
#> [67] "ROUTINES"
#> [68] "SCHEMA_PRIVILEGES"
#> [69] "SCHEMATA"
#> [70] "SESSION_STATUS"
#> [71] "SESSION_VARIABLES"
#> [72] "sj_all_revenue_large"
#> [73] "sj_all_revenue_large_f"
#> [74] "sj_all_revenue_medium"
#> [75] "sj_all_revenue_medium_f"
#> [76] "sj_all_revenue_small"
#> [77] "sj_all_revenue_small_f"
#> [78] "sj_all_revenue_xlarge"
#> [79] "sj_all_revenue_xlarge_f"
#> [80] "sj_all_sessions_large"
#> [81] "sj_all_sessions_large_f"
#> [82] "sj_all_sessions_medium"
#> [83] "sj_all_sessions_medium_f"
#> [84] "sj_all_sessions_small"
#> [85] "sj_all_sessions_small_f"
#> [86] "sj_all_sessions_xlarge"
#> [87] "sj_all_sessions_xlarge_f"
#> [88] "sj_user_summary_large"
#> [89] "sj_user_summary_large_f"
#> [90] "sj_user_summary_medium"
#> [91] "sj_user_summary_medium_f"
#> [92] "sj_user_summary_small"
#> [93] "sj_user_summary_small_f"
#> [94] "sj_user_summary_xlarge"
#> [95] "sj_user_summary_xlarge_f"
#> [96] "sj_users_daily_large"
#> [97] "sj_users_daily_large_f"
#> [98] "sj_users_daily_medium"
#> [99] "sj_users_daily_medium_f"
#> [100] "sj_users_daily_small"
#> [101] "sj_users_daily_small_f"
#> [102] "sj_users_daily_xlarge"
#> [103] "sj_users_daily_xlarge_f"
#> [104] "SPATIAL_REF_SYS"
#> [105] "SQL_FUNCTIONS"
#> [106] "STATISTICS"
#> [107] "SYSTEM_VARIABLES"
#> [108] "TABLE_CONSTRAINTS"
#> [109] "TABLE_PRIVILEGES"
#> [110] "TABLE_STATISTICS"
#> [111] "TABLES"
#> [112] "TABLESPACES"
#> [113] "tkeys"
#> [114] "trans"
#> [115] "TRIGGERS"
#> [116] "USER_PRIVILEGES"
#> [117] "USER_STATISTICS"
#> [118] "user_variables"
#> [119] "VIEWS"
Created on 2022-06-20 by the reprex package (v2.0.1)
Noting this does not happen with the dm version that's on main. I'm going to look into the PR more closely.
@krlmlr do you agree there's a bug here? :thinking:
Currently looking for a Postgres example to see if that's affected too.
Dear @maelle
I have an example of the exact same DB (Financial) also on PostgreSQL and SQL Server. Please have a look at https://databases.pacha.dev/#financial-aka-loan-application.
Best, Mauricio Vargas S.
Thanks! I was referring to the schema
argument of dm_from_con()
. At some point we'll expose dm_meta()
. It's worth making it consistent with ADBC (see https://github.com/apache/arrow-adbc/pull/18), guessing the schema name isn't something that we can or want to cover in this documentation until we expose dm_meta()
. That's a bit of a catch-22, unfortunately.
library(RMariaDB)
library(dm)
fin_db <- dm:::dbedu_con()
meta <-
fin_db %>%
dm:::dm_meta()
meta$schemata
#> # Source: SQL [3 x 2]
#> # Database: mysql [student@databases.pacha.dev:NA/financial]
#> catalog_name schema_name
#> <chr> <chr>
#> 1 def information_schema
#> 2 def intendo
#> 3 def financial
fin_dm <- dm_from_con(fin_db, learn_keys = TRUE, schema = "financial")
names(fin_dm)
#> [1] "accounts" "cards" "clients" "disps" "districts" "loans"
#> [7] "orders" "tkeys" "trans"
fin_db_relational <- dm:::relational_con()
fin_dm_relational <- dm_from_con(fin_db_relational, learn_keys = TRUE, schema = "financial")
names(fin_dm)
#> [1] "accounts" "cards" "clients" "disps" "districts" "loans"
#> [7] "orders" "tkeys" "trans"
Created on 2022-06-21 by the reprex package (v2.0.1)
It is true that dm_from_con()
now requires the schema
argument for RMariaDB. There, "schema" and "database" are synonymous, and filtering by database in the connection could restrict the metadata returned. I'll take a look. Related: https://github.com/cynkra/dm/pull/1130.
I fixed #1123 so that if dbname
is passed to dbConnect()
we're only getting tables from that schema/database. Closing, let's discuss in new issues if needed.
This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.
I get different table names locally for https://cynkra.github.io/dm/dev/articles/dm.html#creating-a-dm-object
(from PR #1123)