Closed qishipengqsp closed 1 year ago
General comments
- start_time/startTime and end_time/endTime are used inconsistently both in the patterns and in the text
Done
- the name for the ids for persons p1 (and p2) is inconsitent: "p1Id", "pid1" are used.
patterns: when having multiple nodes on one end of an edge, it is worth assigning different names to these nodes good examples:
- complex-read / 6: has src1, ..., src3
- complex-read / 11: has l1c1, l1c2, ..., l1c3, and so on
bad examples:
- simple-read / 2: has dst, dst, dst
- simple-read / 3: has src, src, ...
- complex-read / 7: has src/dst repeated
- complex-read / 8: has dst1, dst1, ..., dst1; dst2, ...
- complex-read / 9: has up/down repeated
- complex-read / 10: has m repeated
Done
Comments for specific queries/operations
complex-read / 2:
- "accounts" should be singular (account)
Done
complex-read / 5:
- if every row in the result encodes a path, the name of the result attribute should be "path" and its description should be "a transfer trace"
Done
complex-read / 7:
- numSrc, numDst: are these numbers counted with 'count distinct'?
Done
complex-read / 8:
- the term "inflow" is not defined and it is only used for describing the results
- the pattern should illustrate/describe that the traversal can end at any level (dst1, dst2, dst3)
Done
complex-read / 10:
- what should be the result if there are no matches for edge1 and/or edge2?
Done
complex-read / 12:
- nitpicking but if everything else query used the terms edge1, edge2, ...; why use g1, g2, ..., gN here?
- a person can have multiple loans, maybe this should be reflected in the pattern as well?
Done
simple-read / 2: in the pattern:
- the undirected "edge2" in the middle does not make sense to me
- the second "edge1" should be called "edge2"
- in the pattern, "sum(edge1.amount)" in the last row should be "sum(edge2.amount)"
Done
simple-read / 3:
- what should be returned if there are no edges between src and dst and the result would be 0/0?
Done
simple-read / 6:
- result description doesn't make sense. accounts on their own do not have srcs, their incoming transfer edges do.
Done
simple-read / 6:
- the RESULT in the pattern should say "collect(x.id)" instead of "collect(x)"
Done
write / 3, 8, 9:
- the edges (transfer, deposit, repay) edges are missing the plus sign indicating that they are newly created
Done
simple-read / 6, 7, 8:
- the name of result fields that return the collection of multiple IDs should have a plural name, e.g. "accId" should be called "accIds"
Done
write / 8:
- the node "Loan" should be named "loan: Loan" as its bound to a single node by the loanId parameter
- "amount <- amount" in the pattern is inconsistent, other patterns use "amount <- amt"
Done
Typo
The word "gurantee" is misspelled in the patterns of read-write / 3 and write / 13 (maybe also in other places).
Done
TODO: re-generate and crop the diagrams.
General comments
patterns: when having multiple nodes on one end of an edge, it is worth assigning different names to these nodes
good examples:
bad examples:
Comments for specific queries/operations
complex-read / 2:
complex-read / 5:
complex-read / 7:
complex-read / 8:
complex-read / 10:
complex-read / 12:
simple-read / 2: in the pattern:
simple-read / 3:
simple-read / 6:
simple-read / 6:
write / 3, 8, 9:
simple-read / 6, 7, 8:
write / 8:
Typo
The word "gurantee" is misspelled in the patterns of read-write / 3 and write / 13 (maybe also in other places).
currentTime
Several queries use the "currentTime" parameter. Where does the value of this parameter come from: the data set, the driver (e.g. Java's
System.currentTimeMillis()
), or the DBMS (e.g. Cypher'sdate()
function)? If it's a parameter, the patterns should indicate it as such with${currentTime}
.Path queries with conditions over multiple edges
Queries that require "all timestamps in the transfer trace are in ascending order" or the "upstream" edge are difficult to explain in plain Cypher (or GQL or SQL/PGQ) because they require some memory.
One can think of this as an extra choke point which requires support for the category of queries "Regular expression with memory" as described in this paper: https://dl.acm.org/doi/abs/10.1145/2274576.2274585. Even if it's not defined as a choke point, it's worth including this reference in the specificiation.
More comments