elixir-ecto / ecto

A toolkit for data mapping and language integrated query.
https://hexdocs.pm/ecto
Apache License 2.0
6.2k stars 1.44k forks source link

Inconsistent fragment variables when navigating a json path #4055

Closed pmargreff closed 2 years ago

pmargreff commented 2 years ago

Elixir version

1.14.1

Database and Version

PostgreSQL 14

Ecto Versions

ecto 3.9.2

Database Adapter and Versions (postgrex, myxql, etc)

postgrex 0.16.5

Current behavior

The fragment generates different queries depending on how it is written the first two ways of query create a query as expected:

using a simple string as the fragment:

  Book
  |> where([b], fragment("(meta -> 'formats' -> 'digital')::boolean = false"))
  |> Repo.all()

# SELECT b0."id", b0."title", b0."meta" FROM "books" AS b0 WHERE ((meta -> 'formats' -> 'digital')::boolean = false) []

passing the field as a variable:

  Book
  |> where([b], fragment("(? -> 'formats' -> 'digital')::boolean = false", b.meta))
  |> Repo.all()

# SELECT b0."id", b0."title", b0."meta" FROM "books" AS b0 WHERE ((meta -> 'formats' -> 'digital')::boolean = false) []

However, if I try to pass the field and the path as a variable it generates a different string (and that returns wrong results)

pass field and path as variables:

  Book
  |> where([b], fragment("(? -> ?)::boolean = false", b.meta, "'formats' -> 'digital'"))
  |> Repo.all()

# SELECT b0."id", b0."title", b0."meta" FROM "books" AS b0 WHERE ((b0."meta" -> '''formats'' -> ''digital''')::boolean = false) []

it looks like it is adding extra single quotes in the navigation path.

Expected behavior

using multiple variables in a fragment consistently creates the same query

  Book
  |> where([b], fragment("(? -> ?)::boolean = false", b.meta, "'formats' -> 'digital'"))
  |> Repo.all()

# SELECT b0."id", b0."title", b0."meta" FROM "books" AS b0 WHERE ((meta -> 'formats' -> 'digital')::boolean = false) []

I created a simplified repo with some examples of the issue. After cloned, that can be run with the following commands: mix deps.get, mix ecto.create, mix ecto.migrate, mix test.

TBH, I'm not 100% about these 3 Ecto constructs being equivalents. If that behavior is by design, can we improve the docs helping to explicit these outputs?

josevalim commented 2 years ago

This is expected. If you pass a string to the fragment, it will be emitted as a string in the interpolation query. If we didn't so, if would be open to SQL injection attacks. You could try to interpolate each field where([b], fragment("(? -> ? -> ?)::boolean = false", b.meta, "formats", "digital") and it might work (I can't recall). Alternatively, try to pass b.meta["formats"]["digital"], which will work if meta is declared as map/embed.

josevalim commented 2 years ago

Btw, a pull request to improve the docs would be welcome, if the above makes sense. :)

pmargreff commented 2 years ago

If you pass a string to the fragment, it will be emitted as a string in the interpolation query. If we didn't so, if would be open to SQL injection attacks.

While I do understand that, my intention here is exactly searching the JSON dynamically. I discovered this behavior while refactoring the code on my way to make it dynamic.

That said, is there a way out where I can use a dynamic path for searching within the same query? Either using an access behavior style or a variable would work in my case.

We are using Postgres with a considerable amount of unstructured data, having fragments working with dynamic queries is a crucial part of keeping it composable...

greg-rychlewski commented 2 years ago

Does https://hexdocs.pm/ecto/Ecto.Query.API.html#json_extract_path/2 work for you instead of a fragment?

pmargreff commented 2 years ago

I totally lost that as an option because reading the docs it only show uses cases for select. But yeah, it works with static values:

  Book
  |> where([b], json_extract_path(b.meta, ["formats", "digital"]) = false)
  |> Repo.all()

I'll probably work my way through macros to make it work with a function instead of values like that:

    Book
    |> where([b], json_extract_path(b.meta, dynamic_values()) = false)
    |> Repo.all() 

show me the following error: ** (Ecto.Query.CompileError) expected JSON path to be compile-time list, got: '^dynamic_values()' (ecto 3.9.2) expanding macro: Ecto.Query.where/3

Thank you for the suggestion. I'll follow up with some docs about other ways to use json_extract_path other than a select ...

greg-rychlewski commented 2 years ago

I'll investigate if I can alter it to allow runtime lists.

greg-rychlewski commented 2 years ago

@pmargreff You can try it out in master now :). Thanks for the suggestion.

pmargreff commented 2 years ago

It worked like a charm, thank you Greg!