phiresky / ripgrep-all

rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.

Searching `.snagx` file extension #233

Closed Hacksore closed 1 week ago

Hacksore commented 2 weeks ago

I'd like to be able to search over .snagx (zip) files to find certain metadata in them.

My attempts so far, without luck:

 $ rga "Version"
test.json.zip
test.json:   "Version": "1"

 $ rga "Version" 2024-06-19_19-58-35.snagx
binary file matches (found "\0" byte around offset 5)

Anatomy of a .snagx file:

 $ file 2024-06-19_19-58-35.snagx
2024-06-19_19-58-35.snagx: Zip archive data, at least v2.0 to extract, compression method=store

What the archive looks like:

├── index.json
├── metadata.json
├── thumbnail.png
├── {5839C95A-137D-4262-87C1-32DEB7C43D30}.backup.json
├── {5839C95A-137D-4262-87C1-32DEB7C43D30}.json
└── {5839C95A-137D-4262-87C1-32DEB7C43D30}.png

It seems the .zip file extension is looked for explicitly. Is there a workaround that doesn't require me to rename the file?

 $ rga "Version"
2024-06-19_19-58-35.zip
index.json:   "Version" : "1.0"
{5839C95A-137D-4262-87C1-32DEB7C43D30}.json:   "SoftwareVersion" : "2024.2.5",
{5839C95A-137D-4262-87C1-32DEB7C43D30}.json:   "Version" : "1.0"
{5839C95A-137D-4262-87C1-32DEB7C43D30}.backup.json:   "SoftwareVersion" : "2024.2.5",
{5839C95A-137D-4262-87C1-32DEB7C43D30}.backup.json:   "Version" : "1.0"
metadata.json:   "AppVersion" : "0.1.0 (1)",
metadata.json:   "OperatingSystemVersion" : "macOS 14.5.0",
metadata.json:   "Version" : "1.0",

lafrenierejm commented 1 week ago

It seems the .zip file extension is looked for explicitly. Is there a workaround that doesn't require me to rename the file?

Renaming the file is currently the only workaround.

The list of extensions that are treated as Zip archives is currently hardcoded. There have been some recent PRs (e.g. #208, #213) adding additional extensions. The author has indicated that they would accept a PR making this list user-configurable.
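
Until the list is configurable, the renaming workaround can be scripted so that the original file stays untouched. A minimal sketch, assuming a POSIX shell; `zip_named_copy` is a hypothetical helper name for illustration, not part of rga:

```shell
# Hypothetical helper (not part of rga): copy a file into a directory
# under the same base name but with a .zip extension, and print the
# path of the copy. The original file is left as-is.
zip_named_copy() {
  src=$1
  dest_dir=$2
  name=$(basename -- "$src")
  # Strip the last extension (e.g. ".snagx") and append ".zip".
  dest="$dest_dir/${name%.*}.zip"
  cp -- "$src" "$dest" && printf '%s\n' "$dest"
}
```

Calling `zip_named_copy 2024-06-19_19-58-35.snagx "$tmpdir"` with a scratch directory such as one from `mktemp -d`, then running `rga "Version" "$tmpdir"`, should produce the same matches as the manually renamed file, since rga only sees the .zip name. The scratch directory can be removed afterwards.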

Hacksore commented 1 week ago

Thanks for the pointers, @lafrenierejm. I've raised https://github.com/phiresky/ripgrep-all/issues/233, so let's see if this is acceptable.

Hopefully this short-term solution is accepted, and in the future someone can implement a user-customizable configuration.