josdejong / jsonrepair

Repair invalid JSON documents
https://josdejong.github.io/jsonrepair/

Python Integration #84

Closed rouge-humanity closed 1 year ago

rouge-humanity commented 1 year ago

I have been using JSONRepair from the Linux command line. I would like to use it from Python, but despite various attempts I have not been able to get it working. Could you provide some guidance?

Thank you in advance

josdejong commented 1 year ago

You can execute any command line application from Python using subprocess, for example:

import subprocess

broken_file = 'broken.json'

# if jsonrepair is installed globally via `npm install -g jsonrepair`
# (no shell=True: on Linux/macOS, shell=True with a list would drop the file argument)
repaired_json = subprocess.check_output(['jsonrepair', broken_file]).decode('UTF-8')

# or, if jsonrepair is installed locally:
# repaired_json = subprocess.check_output(['node', './node_modules/jsonrepair/bin/cli.js', broken_file]).decode('UTF-8')

print(repaired_json)

This example requires jsonrepair to be installed globally via npm install -g jsonrepair, but you could also refer to a locally installed jsonrepair at something like ./node_modules/jsonrepair/bin/.
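If the broken JSON is already in a Python string rather than in a file, the same subprocess approach can pipe it through the command line tool. The sketch below is only an illustration under a few assumptions: the helper name repair_json_text and its command parameter are hypothetical, and it assumes the jsonrepair executable reads the document from standard input and writes the repaired document to standard output when no file name is given.

```python
import json
import subprocess
from typing import Any, Sequence


def repair_json_text(text: str, command: Sequence[str] = ("jsonrepair",)) -> Any:
    """Pipe `text` through a repair command and parse what it prints.

    Hypothetical helper. `command` is an assumption: by default it expects a
    globally installed `jsonrepair` on the PATH that reads stdin and writes
    the repaired document to stdout.
    """
    completed = subprocess.run(
        list(command),
        input=text,            # sent to the child process's stdin
        capture_output=True,   # collect stdout and stderr
        text=True,             # exchange str instead of bytes
        encoding="utf-8",
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    # json.loads raises json.JSONDecodeError if the output is still invalid
    return json.loads(completed.stdout)
```

With a global installation, a call such as repair_json_text("{name: 'John'}") should hand back a regular Python object. For a local installation, something like command=["node", "./node_modules/jsonrepair/bin/cli.js"] could be passed instead. Parsing the result with json.loads doubles as a check that the repair actually produced valid JSON.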

adonig commented 1 year ago

I tried to use jsii but couldn't get it fully working. You can have a look at my fork. After some small changes I got it to generate a Python library which I can successfully install in a virtual environment but when I try to import jsonrepair I get the error below. Maybe we can get it working somehow, because then it would be possible to generate jsonrepair for .NET, Golang, Java and Python.

(.venv) asd@mbp jsonrepair-3.0.1 % python
Python 3.11.0 (main, Dec 13 2022, 10:28:32) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import jsonrepair
jsii.errors.JavaScriptError: 
  @jsii/kernel.Fault: Error for package tarball /var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmprbv53lmajsonrepair@3.0.1.jsii.tgz: Expected to find .jsii file in /var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/jsii-kernel-aRRq5x/node_modules/jsonrepair, but no such file found
      at Kernel._load (/private/var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmpgg9hkrpl/lib/program.js:7723:27)
      at /private/var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmpgg9hkrpl/lib/program.js:7683:52
      at Kernel._debugTime (/private/var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmpgg9hkrpl/lib/program.js:8406:28)
      at Kernel.load (/private/var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmpgg9hkrpl/lib/program.js:7683:29)
      at KernelHost.processRequest (/private/var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmpgg9hkrpl/lib/program.js:11037:36)
      at KernelHost.run (/private/var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmpgg9hkrpl/lib/program.js:10997:22)
      at /private/var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmpgg9hkrpl/lib/program.js:16870:10
      at Object.<anonymous> (/private/var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmpgg9hkrpl/lib/program.js:16871:3)
      at Module._compile (node:internal/modules/cjs/loader:1218:14)
      at Module._extensions..js (node:internal/modules/cjs/loader:1272:10)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/asd/Downloads/jsonrepair-3.0.1/.venv/lib/python3.11/site-packages/jsonrepair-3.0.1-py3.11.egg/jsonrepair/__init__.py", line 201, in <module>
  File "/Users/asd/Downloads/jsonrepair-3.0.1/.venv/lib/python3.11/site-packages/jsonrepair-3.0.1-py3.11.egg/jsonrepair/_jsii/__init__.py", line 13, in <module>
  File "/Users/asd/Downloads/jsonrepair-3.0.1/.venv/lib/python3.11/site-packages/jsii-1.72.0-py3.11.egg/jsii/_runtime.py", line 54, in load
    _kernel.load(assembly.name, assembly.version, os.fspath(assembly_path))
  File "/Users/asd/Downloads/jsonrepair-3.0.1/.venv/lib/python3.11/site-packages/jsii-1.72.0-py3.11.egg/jsii/_kernel/__init__.py", line 301, in load
    self.provider.load(LoadRequest(name=name, version=version, tarball=tarball))
  File "/Users/asd/Downloads/jsonrepair-3.0.1/.venv/lib/python3.11/site-packages/jsii-1.72.0-py3.11.egg/jsii/_kernel/providers/process.py", line 352, in load
    return self._process.send(request, LoadResponse)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/asd/Downloads/jsonrepair-3.0.1/.venv/lib/python3.11/site-packages/jsii-1.72.0-py3.11.egg/jsii/_kernel/providers/process.py", line 339, in send
    raise JSIIError(resp.error) from JavaScriptError(resp.stack)
jsii.errors.JSIIError: Error for package tarball /var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/tmprbv53lmajsonrepair@3.0.1.jsii.tgz: Expected to find .jsii file in /var/folders/tr/vvds4q2n7rz7zcy882vzlry80000gn/T/jsii-kernel-aRRq5x/node_modules/jsonrepair, but no such file found

adonig commented 1 year ago

The missing .jsii file already has an issue here.

josdejong commented 1 year ago

I have no clue about jsii. Does it also support functions or only classes? I only see classes mentioned, and jsonrepair is a function.

adonig commented 1 year ago

@josdejong You are right. It looks like it explicitly supports only classes, so it would probably be necessary to turn the jsonrepair prototype into a class. That is a bit of work, but it would allow people using other languages to repair JSON without having to either port this library or run it as a subprocess, which isn't possible in some environments.

rouge-humanity commented 1 year ago

If it is possible, I would like to take a crack at helping with that. I think it would be great to be able to integrate this directly into code written in other languages, especially from a data science perspective.

josdejong commented 1 year ago

Would be nice to work out an experiment to see how to get this working in other environments.

I have to say, I have no idea how mainstream jsii is (compared to executing a command line script or other jsii alternatives). Are there other common ways to do the plumbing between JavaScript code and other languages?

adonig commented 1 year ago

I know jsii because Amazon created it to make the AWS CDK, which works really well.

What I like about jsii is that it comes with a reference that includes a detailed specification of what people have to do to add support for more languages. Because of Amazon's backing, I expect jsii to be well maintained for at least the next five years, and it should support more languages than .NET, Golang, Python and Java in the future. For example, there are already open issues for Ruby and Rust support. Also, the Java target pretty much allows any language running on the JVM to use a library implemented with jsii.

There is also a list of more transpilers here but it seems they forgot jsii.

adonig commented 1 year ago

I ported jsonrepair to Python because I needed a quick solution, see this gist. The test suite passes 100%, but you'd better not rely on the correctness of the code. I also won't update the code in the future, because we should instead focus on generating the code for other languages.

josdejong commented 5 months ago

See also: