materialsproject / matbench

Matbench: Benchmarks for materials science property prediction
https://matbench.materialsproject.org
MIT License
124 stars, 47 forks

Changing the default json writer from json.dump to dumpfn. #251

Closed hrushikesh-s closed 1 year ago

hrushikesh-s commented 1 year ago

Reloading the results.json.gz files!

ardunn commented 1 year ago

Great PR! The implementation is simpler than I imagined, which is awesome.

CI seems to be throwing an error, though:

----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/matbench/matbench/scripts/test_submission.py", line 61, in test_submissions
    mb = MatbenchBenchmark.from_file(full_path)
  File "/home/runner/work/matbench/matbench/matbench/util.py", line 75, in from_file
    d = json.load(f)
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
----------------------------------------------------------------------

Are you encoding the files on a Windows machine, by chance? I know that in the past this has sometimes been the cause of these weird errors, and since we don't officially support Windows (and I don't think matminer does either), it's OK not to include Windows support.

Otherwise, maybe try updating the monty version and re-encoding and see if that works?

hrushikesh-s commented 1 year ago

> Are you encoding the files on a Windows machine, by chance? I know that in the past this has sometimes been the cause of these weird errors, and since we don't officially support Windows (and I don't think matminer does either), it's OK not to include Windows support.

No. I'm encoding the files on a Mac.

> Otherwise, maybe try updating the monty version and re-encoding and see if that works?

Sure. I'll check if this fixes the issue.

ardunn commented 1 year ago

OK, after a quick Google search, it looks like the problem is now in the load function. You may need to update the load function as well, @hrushikesh-s.
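
For context on why the load side matters: `0x1f 0x8b` is the gzip magic number, so a `0x8b` at position 1 suggests that `from_file` was handing a gzip-compressed file to `json.load` as plain UTF-8 text. The thread does not show the actual fix. The sketch below reproduces the error with the standard library only and shows one possible compression-aware reader. `load_json_maybe_gzipped` is a hypothetical helper for illustration, not matbench's code. If the writer is monty's `dumpfn`, its counterpart `monty.serialization.loadfn` would be another natural choice.

```python
import gzip
import json
import os
import tempfile


def load_json_maybe_gzipped(path):
    """Load JSON from `path`, decompressing first if the file is gzipped.

    Hypothetical helper for illustration; not matbench's actual from_file.
    """
    # Check the two-byte gzip magic number instead of trusting the extension.
    with open(path, "rb") as f:
        is_gzipped = f.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzipped else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)


# Write a small gzip-compressed JSON file, as a compression-aware writer would.
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "results.json.gz")
with gzip.open(gz_path, "wt", encoding="utf-8") as f:
    json.dump({"tasks": {}}, f)

# Reading the compressed bytes as plain UTF-8 text fails like the CI run did.
try:
    with open(gz_path, "r", encoding="utf-8") as f:
        json.load(f)
    plain_read_failed = False
except UnicodeDecodeError:
    plain_read_failed = True

# The compression-aware reader returns the original data.
loaded = load_json_maybe_gzipped(gz_path)
```

Sniffing the magic bytes lets the same reader accept both old uncompressed `results.json` files and new `results.json.gz` files, which may help if both formats exist in the repository at once.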

hrushikesh-s commented 1 year ago

Fixed!