InternetHealthReport / internet-yellow-pages

A knowledge graph for the Internet
https://iyp.iijlab.net
GNU General Public License v3.0
43 stars 18 forks source link

Replace sys.exit call with exception handling in crawlers #107

Closed KiranSatyaRaj closed 10 months ago

KiranSatyaRaj commented 10 months ago

Crawlers raise exception in case of fatal error instead of exiting the program

Description

In case of fatal error i.e httperror, jsondecodeerror, missingkeyerror, connectionerror crawlers program were exited but instead now exception are raised and they are passed to the create_db.py where the exceptions are handled. Created custom error types for consistency purposes

Motivation and Context

fixes: issue #70

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

Checklist:

KiranSatyaRaj commented 10 months ago

Hi @m-appel - Please review my pull request.

m-appel commented 10 months ago

Thanks, the changes look good to me, I'm running a test right now, although I don't expect anything to fail.

Please only fix the codestyle issues by (ideally) using the pre-commit hook as described here.

KiranSatyaRaj commented 10 months ago

Hi @m-appel - I'm sorry, I didn't follow the instructions. I will keep this in mind and I'll make sure to follow the instructions for future pull requests. I installed pre-commit and executed pre-commit run --all-files, it nodified some files and again executed pre-commit run --files crawlers __init__.py in iyp venv , there are some whitespace and blank lines errors. shall I fix these and push them?

m-appel commented 10 months ago

Yes please, it should have fixed a bunch of things by itself and as far as I can tell there is one unused import in the atlas_probes crawler that is not fixed automatically for some reason. And don't worry, the github action will double check everything anyways and I will squash your commits when merging, so you can add as many commits as you want if you forgot something :)

KiranSatyaRaj commented 10 months ago

Hi @m-appel - I fixed unused import error, but these blank line error and whitespace errors just won't go even after doing what it suggests me to, my commits are failing, can you please help me with this?

m-appel commented 10 months ago

In theory, the pre-commit hook should automatically update the files. However, you have to stage the changed files again, maybe that's the problem?

The flow is usually as follows:

git commit
# pre-commit hooks are running.
# If there was no failure the commit succeeds.
# If there was a fail due to changed files, you have to stage them again
git add <changed files>
git commit
# Now hooks should pass.

Note that there are some instances (e.g., with flake8) where the hook can not automatically fix the problem, so you have to intervene yourself.

m-appel commented 10 months ago

Okay this is weird. I hereby give you special permission to deactivate the hooks with pre-commit uninstall and push you current state so we can see what's going on :)

KiranSatyaRaj commented 10 months ago

In theory, the pre-commit hook should automatically update the files. However, you have to stage the changed files again, maybe that's the problem?

The flow is usually as follows:

git commit
# pre-commit hooks are running.
# If there was no failure the commit succeeds.
# If there was a fail due to changed files, you have to stage them again
git add <changed files>
git commit
# Now hooks should pass.

Note that there are some instances (e.g., with flake8) where the hook can not automatically fix the problem, so you have to intervene yourself. Thank you so much, it worked. All tests are passed except for autopep8 error:

File "/.cache/pre-commit/repoxqr4119s/py_env-python3.12/lib/python3.12/site-packages/autopep8.py", line 761, in fix_e225
pycodestyle.missing_whitespace_around_operator(fixed, ts))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'pycodestyle' has no attribute 'missing_whitespace_around_operator'. Did you mean: 'whitespace_around_operator'?

I changed it to whitespace_around_operator but autopep8 still fails

m-appel commented 10 months ago

Ah, I think this is because the pre-commit autopep8 hook uses an old version. I updated this on the main branch... The function signature of pycodestyle changed some time ago and this was fixed in autopep8 2.0.3, but the pre-commit hook in your fork still used 2.0.2.

If you fetch from upstream and merge the main branch into your PR branch (this should only update .pre-commit-config.yaml) , and run pre-commit clean once, the next run should (maybe) succeed?

KiranSatyaRaj commented 10 months ago

Ah, I think this is because the pre-commit autopep8 hook uses an old version. I updated this on the main branch... The function signature of pycodestyle changed some time ago and this was fixed in autopep8 2.0.3, but the pre-commit hook in your fork still used 2.0.2.

If you fetch from upstream and merge the main branch into your PR branch (this should only update .pre-commit-config.yaml) , and run pre-commit clean once, the next run should (maybe) succeed?

Everything went well, all checks have passed. I pushed the commit. Thank you @m-appel

KiranSatyaRaj commented 10 months ago

Hi @m-appel - do pre-commit checks still fail? can you please take look into this pull request?

m-appel commented 10 months ago

Yeah, the github worker seems to do something different than the local version... In my version the isort worker failed as well, but the github action wants a different format. I think I'll just merge this and fix it on the main branch, I think somewhere there is a version mismatch in some file I don't understand.

m-appel commented 10 months ago

Thanks, this actually revealed a problem with the pre-commit integration, which I just fixed in https://github.com/InternetHealthReport/internet-yellow-pages/commit/cf73ea20ece43e0ff046e4a744fe522d85d6218e. The only part from your commit I don't understand how it passed your local pre-commit checks is the change in the iyp/wiki/wikihandy.py file. I saw that your IDE added some formatting as well (whitespace before and after operators), which I don't mind to keep, maybe this caused that formatting change? Anyways, I think this is ready to merge, I will double check with Romain before, but will merge it later, thanks for your hard work :)

KiranSatyaRaj commented 10 months ago

I would love to take up some more, thank you so much for helping me all along @m-appel