The-Academic-Observatory / oaebu-workflows

Telescopes, Workflows and Data Services for the 'Book Analytics Dashboard Project (2022-2025)', building upon the project 'Developing a Pilot Data Trust for Open Access eBook Usage (2020-2022)'
https://documentation.book-analytics.org/
Apache License 2.0

Update ONIX parsing and function consolidation #147

Closed: keegansmith21 closed this 1 year ago

keegansmith21 commented 1 year ago

Originally, I attempted to change the ONIX parser execution to use a BashOperator rather than piping the command to a subprocess in Python. This proved unexpectedly difficult for several reasons. It may become easier once we upgrade Airflow and have access to dynamic tasks. For now, I have updated the parser call to run without the shell environment (shell=False).
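
For illustration, here is a minimal sketch of running an external command with `shell=False`, assuming the parser is launched through Python's `subprocess` module. The `run_command` helper and the example arguments are hypothetical and are not taken from the repository. The Python interpreter stands in for the real parser command, which is not shown in this PR description.

```python
import subprocess
import sys
from typing import List, Tuple


def run_command(args: List[str]) -> Tuple[int, str, str]:
    """Run a command without a shell and return (returncode, stdout, stderr).

    Hypothetical helper: the arguments are passed as a list with shell=False,
    so no shell parses them and spaces or metacharacters stay literal.
    """
    proc = subprocess.run(args, shell=False, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Stand-in for the parser invocation: the interpreter echoes its first argument.
    code, out, err = run_command(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", "input dir; ls"]
    )
    print(code, out.strip())
```

Because the argument list goes straight to the process, a value such as `"input dir; ls"` arrives as a single argument rather than being split or interpreted by a shell.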

I have consolidated the ONIX-related functions into a new file, onix.py. Three telescopes share these functions, and I would prefer not to import a function from one telescope into another.

Due to the recent refactor, the SFTP file directories have changed. This requires a change to both the SFTP server directories and the SFTP root directory in the ONIX telescope. Given that the SftpFolders class now generates a unique folder based on the dag_id, I think it is unnecessarily complex to have a separate SFTP root directory for each ONIX telescope. I have therefore set the default to the root of the filesystem ("/"). If we need more functionality this will have to change, but I do not expect that in the foreseeable future.
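
To show why a shared root is sufficient, here is a rough sketch of how per-DAG folders might be derived from a single SFTP root. `DagSftpFolders` is a hypothetical stand-in and not the actual SftpFolders class. The sub-folder names (`upload`, `in_progress`, `finished`) and the `onix_press_a` / `onix_press_b` DAG ids are assumptions made for the example, and the real layout may differ.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class DagSftpFolders:
    """Hypothetical folder helper: one directory tree per dag_id under a shared root."""

    dag_id: str
    sftp_root: str = "/"  # default to the filesystem root, as described above

    def _path(self, name: str) -> str:
        # SFTP paths are POSIX-style regardless of the local operating system.
        return posixpath.join(self.sftp_root, self.dag_id, name)

    @property
    def upload(self) -> str:
        return self._path("upload")

    @property
    def in_progress(self) -> str:
        return self._path("in_progress")

    @property
    def finished(self) -> str:
        return self._path("finished")


if __name__ == "__main__":
    # Two DAGs share the same root but never collide, because dag_id is part of the path.
    print(DagSftpFolders("onix_press_a").upload)
    print(DagSftpFolders("onix_press_b").upload)
```

Since the dag_id already separates each telescope's files, a per-telescope root adds configuration without adding isolation.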

codecov[bot] commented 1 year ago

Codecov Report

No coverage uploaded for pull request base (develop@356b14f).

Patch coverage: 98.79% of modified lines in pull request are covered.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             develop     #147   +/-   ##
==========================================
  Coverage           ?   95.08%
==========================================
  Files              ?       16
  Lines              ?     2402
  Branches           ?      317
==========================================
  Hits               ?     2284
  Misses             ?       73
  Partials           ?       45
```

| [Impacted Files](https://app.codecov.io/gh/The-Academic-Observatory/oaebu-workflows/pull/147?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The-Academic-Observatory) | Coverage Δ | |
|---|---|---|
| [oaebu\_workflows/onix.py](https://app.codecov.io/gh/The-Academic-Observatory/oaebu-workflows/pull/147?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The-Academic-Observatory#diff-b2FlYnVfd29ya2Zsb3dzL29uaXgucHk=) | `98.43% <98.43%> (ø)` | |
| [...bu\_workflows/workflows/oapen\_metadata\_telescope.py](https://app.codecov.io/gh/The-Academic-Observatory/oaebu-workflows/pull/147?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The-Academic-Observatory#diff-b2FlYnVfd29ya2Zsb3dzL3dvcmtmbG93cy9vYXBlbl9tZXRhZGF0YV90ZWxlc2NvcGUucHk=) | `95.76% <100.00%> (ø)` | |
| [oaebu\_workflows/workflows/onix\_telescope.py](https://app.codecov.io/gh/The-Academic-Observatory/oaebu-workflows/pull/147?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The-Academic-Observatory#diff-b2FlYnVfd29ya2Zsb3dzL3dvcmtmbG93cy9vbml4X3RlbGVzY29wZS5weQ==) | `95.00% <100.00%> (ø)` | |
| [oaebu\_workflows/workflows/thoth\_telescope.py](https://app.codecov.io/gh/The-Academic-Observatory/oaebu-workflows/pull/147?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The-Academic-Observatory#diff-b2FlYnVfd29ya2Zsb3dzL3dvcmtmbG93cy90aG90aF90ZWxlc2NvcGUucHk=) | `97.72% <100.00%> (ø)` | |
