phiresky / ripgrep-all

rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.

Feature request: --json output (ideally the same as rg) #78

Open simonw opened 3 years ago

simonw commented 3 years ago

I built a web interface for rg which works by shelling out to an rg --json process. Here's a demo: https://ripgrep.datasette.io/-/ripgrep?pattern=%22sqlite-utils%5B%22%3E%5D&glob=setup.py

It would be neat to be able to use this with rga as well. Any plans to add an rga --json flag similar to the one in rg?

phiresky commented 3 years ago

Great idea. In general --json should already work with rga, same as with ripgrep. The only issue is that both the inner filenames (e.g. hello.txt within a .zip file) and the line number prefix (e.g. Page XYZ in a PDF) will appear within the search result text instead of as metadata in the JSON object.
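
To illustrate what a consumer of the current output would have to do, here is a minimal Python sketch that reads one event from rg --json output and splits a page prefix off the matched text. The event layout (type, data.path.text, data.lines.text) follows ripgrep's JSON output; the exact prefix shape "Page N: " is an assumption, since the thread only mentions "Page XYZ", and split_page_prefix is a hypothetical helper name.

```python
import json
import re

# Assumed prefix shape. The thread only says "Page XYZ", so the exact
# format ("Page <number>: ") is a guess for illustration.
PAGE_PREFIX = re.compile(r"^Page (\d+): ")


def split_page_prefix(event_line):
    """Parse one line of rg --json output; return None for non-match events."""
    event = json.loads(event_line)
    if event.get("type") != "match":
        return None
    data = event["data"]
    # ripgrep emits "text" for valid UTF-8 and "bytes" (base64) otherwise;
    # this sketch only handles the "text" case.
    text = data["lines"].get("text", "")
    found = PAGE_PREFIX.match(text)
    page = int(found.group(1)) if found else None
    if found:
        text = text[found.end():]
    return {"path": data["path"]["text"], "page": page, "text": text}


# A hand-written sample event, not captured from a real rga run.
sample = json.dumps({
    "type": "match",
    "data": {
        "path": {"text": "report.pdf"},
        "lines": {"text": "Page 3: hello world\n"},
        "line_number": 12,
        "absolute_offset": 0,
        "submatches": [],
    },
})
result = split_page_prefix(sample)
```

Note that any submatch offsets in the event would still refer to the unstripped text, so a real consumer would need to adjust them as well.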

Off the top of my head, I'm not sure this can be done cleanly, since ripgrep doesn't have any method for passing a side channel of metadata through the preprocessing step.

One method would be to insert the metadata into the results stream directly in rga-preproc, then postprocess the output of rg --json in rga and extract that metadata again. That would be pretty hacky though, so I'm not sure if I'd want to do it.
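
A rough Python sketch of that round trip, purely to show the shape of the idea and not how rga-preproc works today: the preprocessor side tags each line with metadata, and the postprocessing side strips it back out of the text that rg reports. The marker character, the metadata keys, and the helper names tag_line and untag_line are all hypothetical.

```python
import json

# Hypothetical in-band marker (ASCII record separator). Nothing in rga or
# ripgrep defines this; it is chosen only because it rarely occurs in text.
MARKER = "\x1e"


def tag_line(line, meta):
    """Preprocessor side: prepend compact JSON metadata to a text line."""
    payload = json.dumps(meta, separators=(",", ":"))
    return f"{MARKER}{payload}{MARKER}{line}"


def untag_line(text):
    """Postprocessor side: split metadata from the matched text.

    Returns (None, text) unchanged when no well-formed tag is present.
    """
    if not text.startswith(MARKER):
        return None, text
    end = text.find(MARKER, 1)
    if end == -1:
        return None, text
    return json.loads(text[1:end]), text[end + 1:]


tagged = tag_line("hello world\n", {"inner_path": "hello.txt", "page": 3})
meta, rest = untag_line(tagged)
```

This also shows why the approach is hacky: the search pattern runs over the tagged line, so it can match inside the metadata itself, and match offsets are shifted by the length of the tag.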

phiresky commented 7 months ago

Potentially the best we could do is to output JSON-in-JSON (like {"foo":"{\"foo\":\"bar\"}"}), because ripgrep itself assumes textual output from the preprocessors. Not very satisfying.
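
For a consumer, JSON-in-JSON would mean decoding twice: once for the event that rg emits, and once for the matched text, which would itself be a JSON document. A small Python sketch, assuming the inner document sits in data.lines.text and using made-up inner keys (page, text):

```python
import json

# Inner document: what a preprocessor might emit per line (keys are made up).
inner = json.dumps({"page": 3, "text": "hello world"})

# Outer document: shaped like an rg --json match event, with the inner JSON
# carried as an ordinary string, so its quotes end up backslash-escaped.
outer = json.dumps({"type": "match", "data": {"lines": {"text": inner}}})


def decode_nested(outer_line):
    """Decode the rg event, then decode the JSON string it carries."""
    event = json.loads(outer_line)
    return json.loads(event["data"]["lines"]["text"])


decoded = decode_nested(outer)
```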