RMI-PACTA / workflow.factset

Other
0 stars 0 forks source link

Update manual sector override list #73

Closed AlexAxthelm closed 4 months ago

AlexAxthelm commented 4 months ago

Remove companies from Sector override list that are no longer matched in FS Database.

Also add Disambiguation for company ("Tosco Corp.") That matches multiple company names in database.

cjyetman commented 4 months ago

I would suggest that changes like this need some robust documentation, e.g. a link to an ADO ticket or something that explains what changes were made and why... otherwise these are "magic" that no one can explain or justify.

github-actions[bot] commented 4 months ago

Docker build status

commit_time git_sha image
2024-05-24T07:59:41Z 3a8417525f8ae26182a4800981072f63ade17137 ghcr.io/rmi-pacta/workflow.factset:pr-73
cjyetman commented 4 months ago

Changes found after documenting. man/pacta_sector_override_mapping.Rd Please update documentation. Error: Process completed with exit code 1.

and

working version: 1.0.0.9002 compare version: 1.0.0.9002 Error: Error: working_version > compare_version is not TRUE Execution halted Error: Process completed with exit code 1.

AlexAxthelm commented 4 months ago

@Antoine-Lalechere The results from this branch are available at factset-pacta_timestamp-20231231T000000Z_pulled-20240522T105733Z. Everything is there except for the manifest and the zip archives.

@cjyetman FYI the manifest process appears to be stalling out. I think It's struggling with the summary info for the Financial Data, since this is the last line of the logs before it crashes:

TRACE [2024-05-22 12:04:22] Getting summary information for file: /mnt/factset-extracted/factset-pacta_timestamp-20231231T000000Z_pulled-20240522T105733Z/timestamp-20231231T000000Z_pulled-20240522T105733Z_factset_financial_data.rds

I'll need to think on how to sort that out.

AlexAxthelm commented 4 months ago

Thanks @Antoine-Lalechere for pointing out that I had escaped some utf8 strings incorrectly. That leaves the following entires on the chopping block:

  entity_proper_name                         pacta_sector
  <chr>                                      <chr>
1 BP Capital Markets PLC                     Oil&Gas
2 General Electric Co.                       Power
3 Toyota Motor Finance Netherlands BV        Other
4 Banco BTG Pactual SA (Grand Cayman Branch) Oil&Gas
5 AES Andres BV                              Power
6 Toyota Leasing (Thailand) Co., Ltd.        Other
7 US Airways Pass Through 2010-1             Aviation
8 Schlumberger NV                            Oil&Gas
Antoine-Lalechere commented 4 months ago

General Electric Co. => GE Aerospace (that one looks weird) Toyota Motor Finance Netherlands BV => Toyota Motor Finance (Netherlands) BV Banco BTG Pactual SA (Grand Cayman Branch) => Banco BTG Pactual SA (Cayman Branch) Toyota Leasing (Thailand) Co., Ltd. => Toyota Leasing (Thailand) Co. Ltd. Schlumberger NV => SLB US Airways Pass Through 2010-1 => US Airways Pass Through Trust 2010-1 AES Andres BV => AES España BV

AlexAxthelm commented 4 months ago

@Antoine-Lalechere nothing for BP Capital Markets PLC?

AlexAxthelm commented 4 months ago

General Electric Co. => GE Aerospace

GE was overriden to "Power" sector previously. Is that still correct?

Antoine-Lalechere commented 4 months ago

yes that's it - but I'm really not confident about it now, let's kick that one out

Antoine-Lalechere commented 4 months ago

BP Capital Markets PLC => BP Capital Markets Plc

AlexAxthelm commented 4 months ago

Results as of commit c100751 factset-pacta_timestamp-20231231T000000Z_pulled-20240523T182756Z @Antoine-Lalechere

AlexAxthelm commented 4 months ago

Marking as ready, pending @Antoine-Lalechere's approval.

(Note also that I've added @Antoine-Lalechere as CODEOWNER on the relevant file, so you'll be tagged in any future PRs affecting it.)

Antoine-Lalechere commented 4 months ago

LGTM

Thanks for adding me where relevant! :)

jdhoffa commented 4 months ago

@Antoine-Lalechere @AlexAxthelm as @cjyetman mentioned, is there a related ADO ticket for this? Would be good to link it here so that we have some documentation of the decision here for posterity