simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0
9.41k stars 671 forks source link

"Query parameters" form shows wrong input fields if query contains "03:31" style times #1421

Closed j4mie closed 3 years ago

j4mie commented 3 years ago

Datasette version 0.58.1.

I'm guessing this is a bug in the code that looks for :param-style query parameters..

image
simonw commented 3 years ago

Urgh, yeah I've seen this one before. Fixing it pretty much requires writing a full SQLite SQL syntax parser in Python, which is frustratingly complicated for solving this issue!

You can work around this for a canned query by using the optional params: argument documented here: https://docs.datasette.io/en/stable/sql_queries.html#canned-query-parameters

simonw commented 3 years ago

Marking this blocked because I don't have a way around the needing-a-SQLite-SQL-parser problem at the moment.

simonw commented 3 years ago

I may have a way to work around this, using explain. Consider this query:

select * from facetable
where state = :state
and on_earth = :on_earth
and neighborhood not like '00:04'

Datasette currently gets confused and shows three form fields: https://latest.datasette.io/fixtures?sql=select+*+from+facetable%0D%0Awhere+state+%3D+%3Astate%0D%0Aand+on_earth+%3D+%3Aon_earth%0D%0Aand+neighborhood+not+like+%2700%3A04%27&state=&on_earth=&04=

fixtures__select___from_facetable_where_state____state_and_on_earth____on_earth_and_neighborhood_not_like__00_04__and_pyinfra_pip_py_at_current_·_Fizzadar_pyinfra

But... if I run explain against that I get this (truncated):

addr opcode p1 p2 p3 p4 p5 comment
20 ResultRow 6 10 0   0  
21 Next 0 3 0   1  
22 Halt 0 0 0   0  
23 Transaction 0 0 35 0 1  
24 Variable 1 2 0 :state 0  
25 Variable 2 3 0 :on_earth 0  
26 String8 0 4 0 00:04 0  
27 Goto 0 1 0   0  

Could it be as simple as pulling out those Variable rows to figure out the names of the variables in the query?

simonw commented 3 years ago

I hoped this would work:

with foo as (
  explain select * from facetable
  where state = :state
  and on_earth = :on_earth
  and neighborhood not like '00:04'
)
select p4 from foo where opcode = 'Variable'

But sadly it returns an error:

near "explain": syntax error

simonw commented 3 years ago

Relevant code: https://github.com/simonw/datasette/blob/ad90a72afa21b737b162e2bbdddc301a97d575cd/datasette/views/database.py#L225-L231

simonw commented 3 years ago

This may not work:

ERROR: sql = 'explain select 1 + :one + :two', params = None: You did not supply a value for binding 1.

The explain queries themselves want me to pass them parameters.

I could try using the regex to pull out candidates and passing None for each of those, including incorrect ones like :31.

simonw commented 3 years ago

I think this works!

_re_named_parameter = re.compile(":([a-zA-Z0-9_]+)")

async def derive_named_parameters(db, sql):
    explain = 'explain {}'.format(sql.strip().rstrip(";"))
    possible_params = _re_named_parameter.findall(sql)
    try:
        results = await db.execute(explain, {p: None for p in possible_params})
        return [row["p4"].lstrip(":") for row in results if row["opcode"] == "Variable"]
    except sqlite3.DatabaseError:
        return []
simonw commented 3 years ago

One catch with this approach: if the SQL query is invalid, the parameters will not be extracted and shown as form fields.

Maybe that's completely fine? Why display a form if it's going to break when the user actually runs the query?

But it does bother me. I worry that someone who is still iterating on and editing their query before actually starting to use it might find the behaviour confusing.

So maybe if the query raises an exception it could fall back on the regular expression results?

simonw commented 3 years ago

Fixed! Fantastic, this one has been bothering me for years.

https://latest.datasette.io/fixtures?sql=select+*+from+facetable%0D%0Awhere+state+%3D+%3Astate%0D%0Aand+on_earth+%3D+%3Aon_earth%0D%0Aand+neighborhood+not+like+%2700%3A04%27

fixtures__select___from_facetable_where_state____state_and_on_earth____on_earth_and_neighborhood_not_like__00_04__and_pyinfra_pip_py_at_current_·_Fizzadar_pyinfra
simonw commented 3 years ago

SQLite carries a warning about using EXPLAIN like this: https://www.sqlite.org/lang_explain.html

The output from EXPLAIN and EXPLAIN QUERY PLAN is intended for interactive analysis and troubleshooting only. The details of the output format are subject to change from one release of SQLite to the next. Applications should not use EXPLAIN or EXPLAIN QUERY PLAN since their exact behavior is variable and only partially documented.

I think that's OK here, because of the regular expression fallback. If the format changes in the future in a way that breaks the query the error should be caught and the regex-captured parameters should be returned instead.

Hmmm... actually that's not entirely true:

https://github.com/simonw/datasette/blob/b1fed48a95516ae84c0f020582303ab50ab817e2/datasette/utils/__init__.py#L1084-L1091

If the format changes such that the same columns are returned but the [row["p4"].lstrip(":") for row in results if row["opcode"] == "Variable"] list comprehension returns an empty array it will break Datasette!

I'm going to take that risk for the moment, but I'll actively watch out for problems in the future. If this does turn out to be bad I can always go back to the pure regular expression mechanism.

simonw commented 3 years ago

Amusing edge-case: if you run this against a explain ... query it falls back to using regular expressions, because explain explain select ... is invalid SQL. https://latest.datasette.io/fixtures?sql=explain+select+*+from+facetable%0D%0Awhere+state+%3D+%3Astate%0D%0Aand+on_earth+%3D+%3Aon_earth%0D%0Aand+neighborhood+not+like+%2700%3A04%27&state=&on_earth=

HaveF commented 3 months ago

If the format changes such that the same columns are returned but the [row["p4"].lstrip(":") for row in results if row["opcode"] == "Variable"] list comprehension returns an empty array it will break Datasette!

I suspect this has problem in sqlite 3.46.0(not works in datasette 0.64.6, and also not works on 1.0a13), after I reinstall sqlite 3.45.3, it works as normal.

the error happens in 3.46.0.

500 Internal Server Error
Traceback (most recent call last):
  File "/Users/mac/.local/pipx/venvs/datasette/lib/python3.12/site-packages/datasette/app.py", line 1668, in route_path
    response = await view(request, send)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mac/.local/pipx/venvs/datasette/lib/python3.12/site-packages/datasette/app.py", line 1835, in async_view_for_class
    return await async_call_with_supported_arguments(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mac/.local/pipx/venvs/datasette/lib/python3.12/site-packages/datasette/utils/__init__.py", line 1021, in async_call_with_supported_arguments
    return await fn(*call_with)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/mac/.local/pipx/venvs/datasette/lib/python3.12/site-packages/datasette/views/base.py", line 89, in __call__
    return await handler(request, datasette)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mac/.local/pipx/venvs/datasette/lib/python3.12/site-packages/datasette/views/database.py", line 61, in get
    return await QueryView()(request, datasette)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mac/.local/pipx/venvs/datasette/lib/python3.12/site-packages/datasette/views/base.py", line 89, in __call__
    return await handler(request, datasette)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mac/.local/pipx/venvs/datasette/lib/python3.12/site-packages/datasette/views/database.py", line 487, in get
    named_parameters = await derive_named_parameters(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mac/.local/pipx/venvs/datasette/lib/python3.12/site-packages/datasette/utils/__init__.py", line 1147, in derive_named_parameters
    return [row["p4"].lstrip(":") for row in results if row["opcode"] == "Variable"]
            ^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'lstrip'