Graph Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, NetworkX, RAPIDS, RDFlib, pySHACL, PyVis, morph-kgc, pslpython, pyarrow, etc.
@ceteri @louisguitton
I'm sure this PR will look somewhat familiar to you, feel free to skim it, as I used my PR from pytextrank as a template.
Pull request overview
Renamed .github/workflows/docker-demo.yml to .github/workflows/ci.yml.
Add pre-commit to .github/workflows/docker-demo.yml to run on every push, pull request or manually through the Actions tab on GitHub.
Add codespell to pre-commit.
Fixed three simple spelling mistakes reported by codespell.
Resolved all failing pre-commit warnings.
Now, let's talk about what I've done:
Renamed docker-demo.yml to ci.yml, and updating it
The updated .github/workflows/ci.yml file works the same as before, with the one addition: It now also runs pre-commit. The renaming was a personal choice. It matches the style used for pytextrank, and it feels more professional than a name that includes "demo".
It does so whenever someone pushes, creates a pull request, or uses the button in the GitHub Actions pane.
Note that pre-commit in particular is very strict. I'm sure you know this. But it will cause the CI to fail if there's a typo in a comment, a single pylint warning, a security warning, a failing test, etc. This could prevent pull requests from being merged until these warnings are resolved. However, this might be exactly what you are after.
Note: The first execution of this workflow took me approximately 8.5 minutes. This time includes 3 minutes for caching the docker. All subsequent executions (I presume, I've only tested one extra run) take less than 5 minutes. You can see the timings here.
I've added codespell to the pre-commit config. The outputs before applying the changes in this PR include:
Some "typos" in the output sections of Jupyter Notebooks.
These warnings were removed by excluding Jupyter Notebooks from being checked.
Also some typos in the non-output section of Jupyter Notebooks.
These were manually checked and removed, as can be seen in the commit history, before Jupyter Notebooks were removed from being checked by codespell.
wit in what.md. Used in the context of "To wit, ...".
I've added wit as an allowed word to fix this.
reenabled => re-enabled in pylintrc
I've simply fixed this by adding the hyphen.
Resolving pre-commit warnings.
After solving the warnings from codespell, some more changes needed to be made to remove all pre-commit warnings. In particular:
Apply the fixes described in the Codespell section above.
Exclude the bin folder from bandit. This folder contains some flask code to launch a sample website - and has debugging enabled. According to bandit, this is a security flaw. Clearly, this is not meant for production, and does not need to be tested for security flaws.
Exclude all tests run on dat. Clearly, this folder simply holds data, and we don't wish to spell-check this raw data, or verify that these files have the correct shebangs.
The check-executables-have-shebangs hook complained that dat/recipes.csv is executable, without having a correct shebang. Clearly, we don't need this hook to check data files.
What now?
All tests, like before, pass. In addition, the CI passes, meaning that all elements of pre-commit pass. I would recommend looking over my changes, and looking at my last GitHub Actions CI run. I do recognise that some people prefer testing on their own machine, in which case you can roughly follow:
# Clone my repo
git clone https://github.com/tomaarsen/kglab
cd kglab
# Set up a virtualenvironment (Note, this may differ depending on your OS, I don't need to tell you this)
python -m venv venv
.env39/Scripts/Activate.ps1
# Install requirements
pip install -r requirements.txt
pip install -r requirements-dev.txt
# Run pre-commit on all files
pre-commit run -a
# And then delete the new folder when you've verified that everything works.
You'll see that pre-commit runs without any issues, and this is a big part of what the CI will be doing (alongside running pytest).
Furthermore, if you choose to merge this PR, feel free to add me as a contributor on your README.
Lastly, you may need to enable GitHub Actions somewhere within the repository settings (Settings -> Actions -> Actions Permissions)
Hello!
@ceteri @louisguitton I'm sure this PR will look somewhat familiar to you, feel free to skim it, as I used my PR from
pytextrank
as a template.Pull request overview
.github/workflows/docker-demo.yml
to.github/workflows/ci.yml
.pre-commit
to.github/workflows/docker-demo.yml
to run on every push, pull request or manually through the Actions tab on GitHub.codespell
topre-commit
.codespell
.pre-commit
warnings.Now, let's talk about what I've done:
Renamed docker-demo.yml to ci.yml, and updating it
The updated
.github/workflows/ci.yml
file works the same as before, with the one addition: It now also runspre-commit
. The renaming was a personal choice. It matches the style used forpytextrank
, and it feels more professional than a name that includes "demo".It does so whenever someone pushes, creates a pull request, or uses the button in the GitHub Actions pane.
Note that
pre-commit
in particular is very strict. I'm sure you know this. But it will cause the CI to fail if there's a typo in a comment, a single pylint warning, a security warning, a failing test, etc. This could prevent pull requests from being merged until these warnings are resolved. However, this might be exactly what you are after.Note: The first execution of this workflow took me approximately 8.5 minutes. This time includes 3 minutes for caching the docker. All subsequent executions (I presume, I've only tested one extra run) take less than 5 minutes. You can see the timings here.
To see the exact output generated, click here.
Codespell
I've added
codespell
to thepre-commit
config. The outputs before applying the changes in this PR include:wit
inwhat.md
. Used in the context of"To wit, ..."
.wit
as an allowed word to fix this.reenabled => re-enabled
inpylintrc
Resolving
pre-commit
warnings.After solving the warnings from
codespell
, some more changes needed to be made to remove allpre-commit
warnings. In particular:bin
folder frombandit
. This folder contains some flask code to launch a sample website - and has debugging enabled. According tobandit
, this is a security flaw. Clearly, this is not meant for production, and does not need to be tested for security flaws.dat
. Clearly, this folder simply holds data, and we don't wish to spell-check this raw data, or verify that these files have the correct shebangs.check-executables-have-shebangs
hook complained thatdat/recipes.csv
is executable, without having a correct shebang. Clearly, we don't need this hook to check data files.What now?
All tests, like before, pass. In addition, the CI passes, meaning that all elements of
pre-commit
pass. I would recommend looking over my changes, and looking at my last GitHub Actions CI run. I do recognise that some people prefer testing on their own machine, in which case you can roughly follow:You'll see that
pre-commit
runs without any issues, and this is a big part of what the CI will be doing (alongside runningpytest
). Furthermore, if you choose to merge this PR, feel free to add me as a contributor on your README.Lastly, you may need to enable GitHub Actions somewhere within the repository settings (Settings -> Actions -> Actions Permissions)
Let me know if you need anything else from me,