scrapy / parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
BSD 3-Clause "New" or "Revised" License

Add JSONPath support #204

Open Gallaecio opened 3 years ago

Gallaecio commented 3 years ago

From @Granitosaurus at https://github.com/scrapy/parsel/issues/25#issuecomment-727887878:

Not to derail this, but I'd argue that implementing JSONPath[1] would actually be a better fit for parsel, as it is XPath-like. For example, JMESPath doesn't support recursive queries (like //node in XPath) while JSONPath does (as $..node); its overall structure is also much more similar to that of XPath.

Ideally it would be great to have both! More and more of the web uses JSON, and it would be great to have one good parser for both HTML and JSON.

[1] https://github.com/h2non/jsonpath-ng (JSONPath implementation in Python)
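
To make the recursive-query point concrete, here is a minimal standard-library sketch of what a recursive-descent query such as $..node selects. The helper name `find_recursive` and the sample document are made up for illustration; this is not how jsonpath-ng or parsel implements it, and the order of results in a real JSONPath engine may differ.

```python
import json
from typing import Any, Iterator


def find_recursive(data: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` at any depth (roughly `$..key`)."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield v
            # Keep descending: further matches may be nested inside any value.
            yield from find_recursive(v, key)
    elif isinstance(data, list):
        for item in data:
            yield from find_recursive(item, key)


# Hypothetical sample document with "node" keys at three different depths.
document = json.loads("""
{
  "node": "top",
  "items": [
    {"node": "first"},
    {"child": {"node": "second"}}
  ]
}
""")

matches = list(find_recursive(document, "node"))
print(matches)
```

The query finds "node" at every nesting level without the caller spelling out the path to each one, which is the behaviour //node gives in XPath.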

Granitosaurus commented 3 years ago

I could add JSONPath support when/if #181 gets merged.

deepakdinesh1123 commented 2 years ago

Should adding support for JSONPath be put on hold until #181 gets merged or can I work on it right now?

Gallaecio commented 2 years ago

I think it should wait, because #181 paves the way by making the internal changes needed for it.

It could be implemented by cherry-picking those internal changes. And if someone wants to push for JSONPath to be implemented as soon as possible, I am OK with that approach.

EchoShoot commented 2 years ago

I think JMESPath should be supported first, because it has been actively maintained over the years and has plenty of resources and documentation, so many developers can find a way to get started. Then we can wait for a better and more robust JSON parser to appear. The two don't conflict, just as CSS doesn't conflict with XPath: parsel supports both at the same time.