dpapathanasiou / simple-graph

This is a simple graph database in SQLite, inspired by "SQLite as a document database"
MIT License
1.4k stars 86 forks source link

No Traverse with TEXT ID #10

Closed chhenning closed 3 years ago

chhenning commented 3 years ago

When I use IDs like meta6 the traverse is not working.

print(db.traverse(db_file_name, 'meta6', neighbors_fn=db.find_neighbors))

This only prints ['"meta6"'].

In the atomic(), I enabled tracing via:

connection.set_trace_callback(print)

and the query is incorrect:

PRAGMA foreign_keys = TRUE;
WITH RECURSIVE traverse(id) AS (
  SELECT '"meta6"'
  UNION
  SELECT source FROM edges JOIN traverse ON target = id
  UNION
  SELECT target FROM edges JOIN traverse ON source = id
) SELECT id FROM traverse;

Changing it to

PRAGMA foreign_keys = TRUE;
WITH RECURSIVE traverse(id) AS (
  SELECT 'meta6'
  UNION
  SELECT source FROM edges JOIN traverse ON target = id
  UNION
  SELECT target FROM edges JOIN traverse ON source = id
) SELECT id FROM traverse;

makes it work.

dpapathanasiou commented 3 years ago

This was interesting: the traversal CTE expects a seed value to get started, and that value should ideally be of the same type as the rest of the results (i.e., string since that is how the id is defined).

So I was using json.dumps to convert the input to string, but as you noticed, it behaves quite differently depending on how the string is quoted:

>>> json.dumps(1)
'1'
>>> json.dumps('1')
'"1"'
>>> json.dumps("1")
'"1"'

The first one is what we want, and produces results, whereas the latter two do not, unless the id values actually use double-quotes in their definitions.

One way to resolve it would have been to use str instead:

>>> str(1)
'1'
>>> str("1")
'1'
>>> str('1')
'1'

But I thought a better way to respect the original type would be to use named placeholders (aka 'named style') instead.

Now, both "1" and '1' work the same way, and exactly as expected.

Plain 1 (as an integer) also works, though you may see dupes in the output, since the CTE's UNION ALL keeps them both, since they are strictly speaking different (b/c of their types).

The corresponding test shows some of the possibilities and pitfalls.