[x] README overall. Needs updating, but the structure is present.
[x] Find a way to link that HOWTO from README
[x] HOWTO which contains what to expect, what it gives and how to proceed
[x] Directory structure. Organized.
[ ] Packaging (so that it also runs on the cluster).
Currently the whl file does not bundle any dependencies (it assumes the cluster already has everything required).
In a second iteration, Docker could be used to build the dependencies, package them, and deliver the result.
[x] sphinx documentation. Mostly okay.
[ ] howto_link.rst is really ugly. See if it can be fixed.
Could not figure out a simple way, as HOWTO lives outside the docs directory.
[x] Combine the root/docs/Makefile of docs with root/Makefile?
[x] mypy
[x] GitHub action
[x] pytest > fails pipeline if test fails
[ ] coverage > reports cov
Coverage is collected, but no comment is created on the PR. A GH admin would need to set up the Codecov team bot (https://docs.codecov.io/docs/team-bot) for a private repo. As this repo is not yet public, Codecov will not comment on or update the PR without this change.
[x] flake8 (or alternative) (with type annotation & docstring plugin) > fails if not met
Description
The first version of the PySpark skeleton repository for a data pipeline.
Related to Notion initiative
Checklist
[x] Testing (Which module, local testing)
[x] linting (flake8, black)
[x] README overall. Needs updating, but the structure is present.
[x] HOWTO which contains what to expect, what it gives and how to proceed
[x] Directory structure. Organized.
[ ] Packaging (so that it also runs on the cluster).
Currently the whl file does not bundle any dependencies (it assumes the cluster already has everything required).
In a second iteration, Docker could be used to build the dependencies, package them, and deliver the result.
[x] sphinx documentation. Mostly okay.
[ ] howto_link.rst is really ugly. See if it can be fixed.
Could not figure out a simple way, as HOWTO lives outside the docs directory.
[x] mypy
[x] GitHub action
[ ] coverage > reports cov
Coverage is collected, but no comment is created on the PR. A GH admin would need to set up the Codecov team bot (https://docs.codecov.io/docs/team-bot) for a private repo. As this repo is not yet public, Codecov will not comment on or update the PR without this change.
[x] Makefile
[x] logging
Discussion
Used .rst rather than .md for README and HOWTO. Reason: rst is the de facto standard for Sphinx documentation, so the docs folder will mostly need .rst files anyway, and it is better to stay consistent across the documentation with only .rst files. There is a lot of discussion about which is better, but rst is generally considered the better fit for medium-to-large documentation. rst can do everything, or almost everything, that .md can do, but not the other way around.
Poetry is planned to be used for package management in this project. Apparently it was not used in ml-skeleton-py as it is opinionated. My personal opinion is that it is okay to be opinionated on this from the maintenance perspective.
tox is growing in popularity for standard testing, but since we already have a Makefile the two would overlap, so tox has been removed. GitHub Actions does the testing, and it can be extended to a matrix if you want to test other envs.
black can sometimes mess things up. Do we still keep it?
Docstrings use the Google format. I find it easier to read. I am not strongly opinionated on this, so please feel free to recommend an alternative.
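For reviewers who want to compare styles, here is a minimal sketch of what a Google-format docstring with type annotations looks like. The function `drop_null_rows` is hypothetical and not part of this repo; it uses plain Python lists instead of Spark DataFrames only to keep the example self-contained.

```python
from typing import Dict, List, Optional

# Hypothetical type alias, introduced only for this illustration.
Row = Dict[str, Optional[str]]


def drop_null_rows(rows: List[Row], column: str) -> List[Row]:
    """Drop rows whose value in the given column is missing.

    Hypothetical helper, used only to illustrate the docstring layout.

    Args:
        rows: Records to filter, one dict per row.
        column: Name of the column that must be present and not None.

    Returns:
        A new list containing only the rows where ``column`` is set.

    Raises:
        ValueError: If ``column`` is an empty string.
    """
    if not column:
        raise ValueError("column must be a non-empty string")
    # A missing key and an explicit None are both treated as "missing".
    return [row for row in rows if row.get(column) is not None]
```

The `Args`, `Returns`, and `Raises` sections are the parts a docstring plugin for flake8 would typically check, so the same layout should carry over to the real pipeline functions.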