materialsproject / matbench

Matbench: Benchmarks for materials science property prediction
https://matbench.materialsproject.org
MIT License
127 stars 46 forks source link

Setup pre-commit hooks #144

Closed janosh closed 2 years ago

janosh commented 2 years ago

Split out of #126 for clarity.

Includes the usual hooks also used by pymatgen and other MP repos (except for pylint and mypy).

Reasons:

janosh commented 2 years ago

@ardunn I see this conflicts with an existing lint script though not where exactly since scripts/linting_check.sh doesn't report offending line numbers. The hooks added in 6d112fc are a superset of those run in scripts/linting_check.sh so I guess we could just add pre-commit.ci to this repo and remove scripts/linting_check.sh?

ardunn commented 2 years ago

@janosh for sanity sake it might be easiest to just nuke this branch and then copy over the pre-commit hooks on your side manually. Then we can make a minimal new PR and see the actual changes as intended (rather than a bunch of doc changes which are hard to sift through!!)

Also, for reference I don't really care about how rebuild docs looks as long as automatic linting doesn't really inhibit the readability of it. Some of the lines (e.g., automated table creation) will look super weird and be hard to read if they're split up as per typical linting. For this reason, unless the line lengths are really long, I'd recommend not linting rebuild_docs.py unless there is some other major advantage I'm not seeing.

janosh commented 2 years ago

@ardunn Sure thing, I resolved the merge conflicts and excluded rebuild_docs.py in .pre-commit-config.yaml. I also dropped the pre-commit run --all-files commit that applied all the auto changes. Hopefully, that makes it easier to review.

ardunn commented 2 years ago

Sure, is it ready to review now? Or if not just tag me when it is

janosh commented 2 years ago

Yes, I think it's good to go.

janosh commented 2 years ago

Actually, hold on. All the HTML files under docs_src/static/ are not supposed to be in this PR.

janosh commented 2 years ago

There we go, now it's ready for review.

janosh commented 2 years ago
The failed test stems from the matminer file hash issue: ```py ====================================================================== ERROR: test_get_train_and_val_data (matbench.tests.test_task.TestMatbenchTask) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/runner/work/matbench/matbench/matbench/tests/test_task.py", line 42, in test_get_train_and_val_data mbt.load() File "/home/runner/work/matbench/matbench/matbench/task.py", line 237, in load self.df = load(self.dataset_name) File "/home/runner/work/matbench/matbench/matbench/data_ops.py", line 66, in load df = load_dataset(dataset_name) File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/matminer/datasets/dataset_retrieval.py", line 66, in load_dataset _validate_dataset( File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/matminer/datasets/utils.py", line 89, in _validate_dataset raise UserWarning( UserWarning: Error, hash of downloaded file does not match that included in metadata, the data may be corrupt or altered ```
ardunn commented 2 years ago

@janosh Ok! I have pushed some commits to try to fix the hash issue on matminer. I think that is the most natural place to fix it. The downside is that this hash problem will continue until the next release of matminer when I will update the matbench requirements...

janosh commented 2 years ago

Does this PR need to wait until then?

ardunn commented 2 years ago

No, I can just merge in now