The-Academic-Observatory / oaebu-workflows

Telescopes, Workflows and Data Services for the 'Book Analytics Dashboard Project (2022-2025)', building upon the project 'Developing a Pilot Data Trust for Open Access eBook Usage (2020-2022)'
https://documentation.book-analytics.org/
Apache License 2.0

Changed crossref metadata table load #145

keegansmith21 closed 1 year ago

keegansmith21 commented 1 year ago

The fix from #144 unfortunately failed in production: BigQuery raises an error (query too complex) when too many rows are inserted into a temporary table. This PR takes a new approach, and one I believe to be cleaner. Instead of retrieving the ISBNs from the ONIX feed in Python, that step is collapsed into the SQL query. This removes the need for the isbns_from_onix() function entirely (it's not used for anything else). Furthermore, @jdddog mentioned that it'd be better to simply create the metadata table directly from the query (using create_bigquery_table_from_query()). I have implemented this change as well, which simplifies the metadata table creation significantly. The metadata transform functions are no longer necessary, as the data should be transformed prior to the master metadata table creation.
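For readers who have not opened the diff, the approach described above might look roughly like the sketch below. It is an illustration only, not the code in this PR: the column names (`ISBN13` on the ONIX table, a repeated `ISBN` field on the Crossref metadata table), the table IDs, and both function names are assumptions, and `create_table_from_query()` is a hypothetical stand-in for the project's `create_bigquery_table_from_query()` helper, whose signature is not shown here.

```python
def build_crossref_metadata_query(onix_table_id: str, crossref_table_id: str) -> str:
    """Return SQL that keeps only Crossref rows whose ISBNs appear in the ONIX feed.

    The ISBN lookup happens inside the query, so no ISBN list is pulled into
    Python and no temporary table has to be populated row by row.
    Column names (ISBN13, ISBN) are assumptions for illustration.
    """
    return f"""
SELECT crossref.*
FROM `{crossref_table_id}` AS crossref
WHERE EXISTS (
    SELECT 1
    FROM UNNEST(crossref.ISBN) AS isbn
    WHERE isbn IN (SELECT ISBN13 FROM `{onix_table_id}`)
)
""".strip()


def create_table_from_query(sql: str, destination_table_id: str) -> None:
    """Write the query result straight into a destination table.

    Hypothetical stand-in for the project's own helper; requires the
    google-cloud-bigquery package and valid credentials when called.
    """
    from google.cloud import bigquery  # imported lazily so the sketch loads without it

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        destination=destination_table_id,
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )
    # result() blocks until the query job has finished (or raises on failure).
    client.query(sql, job_config=job_config).result()
```

With this shape, the workflow task would build the SQL once and hand it to the table-creation helper, so the filtering and the table write happen in a single BigQuery job.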

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.10 :tada:

Comparison is base (46cd0a8) 94.24% compared to head (37995b6) 94.34%.

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           develop     #145      +/-   ##
===========================================
+ Coverage    94.24%   94.34%   +0.10%
===========================================
  Files           24       24
  Lines         2850     2812      -38
  Branches       371      363       -8
===========================================
- Hits          2686     2653      -33
+ Misses          79       75       -4
+ Partials        85       84       -1
```

| [Impacted Files](https://app.codecov.io/gh/The-Academic-Observatory/oaebu-workflows/pull/145?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The-Academic-Observatory) | Coverage Δ | |
|---|---|---|
| [oaebu\_workflows/workflows/onix\_workflow.py](https://app.codecov.io/gh/The-Academic-Observatory/oaebu-workflows/pull/145?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The-Academic-Observatory#diff-b2FlYnVfd29ya2Zsb3dzL3dvcmtmbG93cy9vbml4X3dvcmtmbG93LnB5) | `93.56% <100.00%> (+0.41%)` | :arrow_up: |
