Closed AlexAxthelm closed 4 months ago
Docker image from this PR (d99a9765a20e04962610ab9e1484c4c6b466d68e) created
docker pull ghcr.io/rmi-pacta/workflow.factset:pr46
@cjyetman @jdhoffa I've linked to a file generated against the latest FS database on Teams.
This is the output I'm getting from waldo::compare:
pr46 <- readRDS("timestamp-20230123T000000Z_pulled-20000101T000001_factset_industry_map_bridge.rds")
waldo::compare(pr46, pacta.data.preparation::factset_industry_map_bridge)
#> `class(old)`: "tbl_df" "tbl" "data.frame"
#> `class(new)`: "spec_tbl_df" "tbl_df" "tbl" "data.frame"
#>
#> `attr(old, 'problems')` is absent
#> `attr(new, 'problems')` is a pointer
#>
#> `attr(old, 'spec')` is absent
#> `attr(new, 'spec')` is an S3 object of class <col_spec>, a list
Created on 2024-02-14 with reprex v2.0.2
It looks like there are some differences in the class and attributes, and I'm not sure whether they are important, but as far as the data contents go, we're on track.
I'm near certain that the difference here is that the `new` object in this comparison was imported using `readr::read_csv()`, which adds the `problems` and `spec` attributes to review import issues and the `spec_tbl_df` class to the data.frames/tibbles that it creates.
i.e. not a concern (from my side)
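For anyone who wants to see this for themselves, here is a minimal sketch of the effect described above. It assumes readr is installed; `strip_readr_extras()` is a hypothetical helper that is not part of this PR or of readr, and the CSV content is made-up stand-in data, not the real bridge table.

```r
library(readr)

# Hypothetical helper (not part of this PR or of readr): remove the extras
# that readr::read_csv() attaches, leaving a plain tibble.
strip_readr_extras <- function(x) {
  attr(x, "spec") <- NULL
  attr(x, "problems") <- NULL
  class(x) <- setdiff(class(x), "spec_tbl_df")
  x
}

# Made-up stand-in data written to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("code,sector", "1,Power", "2,Steel"), csv_path)

# col_types = cols() lets readr guess the column types without printing them.
imported <- read_csv(csv_path, col_types = cols())
cleaned <- strip_readr_extras(imported)

class(imported)  # includes "spec_tbl_df" in addition to the tibble classes
class(cleaned)   # the plain tibble classes only
```

The column data is untouched by the helper; only the import metadata goes away, which is why these differences should not matter for the comparison.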
Adds `get_industry_map_bridge()` to replace `pacta.data.preparation::factset_industry_map_bridge` and the manual steps that were involved in creating it.
Similar to #39, this implements the current state of that file without examining the accuracy of those mappings (a problem for a later date).
Closes #45