getodk / central-backend

Node.js based backend for ODK Central
https://docs.getodk.org/central-intro/
Apache License 2.0
50 stars 76 forks source link

Intermittent AssertionError in testing #596

Open matthew-white opened 2 years ago

matthew-white commented 2 years ago

The following test fails infrequently. I've seen this failure locally, and it's also been seen on CircleCI:

  1) api: /forms/:id/submissions
       .csv.zip GET
         should return worker-processed consolidated client audit log attachments:

      AssertionError: expected 'instance ID,event,node,start,end,latitude,longitude,accuracy,old-value,new-value\none,e,/data/e,2000-01-01T00:11,,,,,hh,ii\none,d,/data/d,2000-01-01T00:10,,10,11,12,gg,\none,c,/data/c,2000-01-01T00:03,2000-01-01T00:04,7,8,9,ee,ff\none,b,/data/b,2000-01-01T00:02,2000-01-01T00:03,4,5,6,cc,dd\none,a,/data/a,2000-01-01T00:01,2000-01-01T00:02,1,2,3,aa,bb\ntwo,h,/data/h,2000-01-01T00:06,2000-01-01T00:07,-5,-6,,ee,ff\ntwo,g,/data/g,2000-01-01T00:05,2000-01-01T00:06,-3,-4,,cc,dd\ntwo,f,/data/f,2000-01-01T00:04,2000-01-01T00:05,-1,-2,,aa,bb\n' to be 'instance ID,event,node,start,end,latitude,longitude,accuracy,old-value,new-value\none,a,/data/a,2000-01-01T00:01,2000-01-01T00:02,1,2,3,aa,bb\none,b,/data/b,2000-01-01T00:02,2000-01-01T00:03,4,5,6,cc,dd\none,c,/data/c,2000-01-01T00:03,2000-01-01T00:04,7,8,9,ee,ff\none,d,/data/d,2000-01-01T00:10,,10,11,12,gg,\none,e,/data/e,2000-01-01T00:11,,,,,hh,ii\ntwo,f,/data/f,2000-01-01T00:04,2000-01-01T00:05,-1,-2,,aa,bb\ntwo,g,/data/g,2000-01-01T00:05,2000-01-01T00:06,-3,-4,,cc,dd\ntwo,h,/data/h,2000-01-01T00:06,2000-01-01T00:07,-5,-6,,ee,ff\n'
      + expected - actual

       instance ID,event,node,start,end,latitude,longitude,accuracy,old-value,new-value
      +one,a,/data/a,2000-01-01T00:01,2000-01-01T00:02,1,2,3,aa,bb
      +one,b,/data/b,2000-01-01T00:02,2000-01-01T00:03,4,5,6,cc,dd
      +one,c,/data/c,2000-01-01T00:03,2000-01-01T00:04,7,8,9,ee,ff
      +one,d,/data/d,2000-01-01T00:10,,10,11,12,gg,
       one,e,/data/e,2000-01-01T00:11,,,,,hh,ii
      -one,d,/data/d,2000-01-01T00:10,,10,11,12,gg,
      -one,c,/data/c,2000-01-01T00:03,2000-01-01T00:04,7,8,9,ee,ff
      -one,b,/data/b,2000-01-01T00:02,2000-01-01T00:03,4,5,6,cc,dd
      -one,a,/data/a,2000-01-01T00:01,2000-01-01T00:02,1,2,3,aa,bb
      +two,f,/data/f,2000-01-01T00:04,2000-01-01T00:05,-1,-2,,aa,bb
      +two,g,/data/g,2000-01-01T00:05,2000-01-01T00:06,-3,-4,,cc,dd
       two,h,/data/h,2000-01-01T00:06,2000-01-01T00:07,-5,-6,,ee,ff
      -two,g,/data/g,2000-01-01T00:05,2000-01-01T00:06,-3,-4,,cc,dd
      -two,f,/data/f,2000-01-01T00:04,2000-01-01T00:05,-1,-2,,aa,bb

      at Assertion.fail (node_modules/should/cjs/should.js:275:17)
      at Assertion.value (node_modules/should/cjs/should.js:356:19)
      at test/integration/api/submissions.js:1799:51
      at runMicrotasks (<anonymous>)
      at processTicksAndRejections (internal/process/task_queues.js:95:5)
      at async Object.transaction (node_modules/slonik/dist/src/connectionMethods/transaction.js:22:24)
      at async Object.createConnection (node_modules/slonik/dist/src/factories/createConnection.js:96:18)

I ran make test on CircleCI 200 times, and this error came up just once (in this build). The test failure is infrequent, but since it's been seen on a variety of machines, I thought I'd go ahead and file an issue. I think it's interesting that it only seems to be this test that has this ordering issue.

alxndrsn commented 2 years ago

@matthew-white does the order of this exported data matter, or can we just relax the test assertion?

matthew-white commented 2 years ago

We discussed this in our meeting earlier today. It looks like client audit log entries are ordered according to their associated submission def, but their order isn't specified within the submission def:

https://github.com/getodk/central-backend/blob/78a75c0dce31b4b897fbcae81d64b6b3ee7628cd/lib/model/query/client-audits.js#L25-L41

It seems potentially useful to users for client audit log entries to consistently be ordered by start, so I think we can just add that column to the ORDER BY clause of the query above. That should also prevent this intermittent test failure. There may be ties on start among client audit log entries, but it doesn't seem like that should be an issue.

matthew-white commented 2 years ago

I took a look at this, and it looks like start is actually a text column. We could sort by it, but it would sort as a string, not a timestamp.

And actually, I'm not sure that start really is a timestamp. It's a timestamp in Backend tests, but the XForms spec indicates that it's a number. When I open a client audit log from the QA server (for example, here), it does look like start is a number.

If that's the case, then if we sort by start, it'll sort as a string, not a number. But for any single submission, start will probably be a consistent number of digits, so the result should be the same either way. I'll plan to sort by start unless there's a preference for a different approach.

One other thing that I wanted to note is that it looks like there's an index on start. We don't currently use that index as far as I can see, and I doubt that it'd be used even after this change. Maybe that's something to consider removing? At the same time, I wonder if we should add an index on client_audits."blobId". I imagine that some servers will have a large number of client audit log entries, so it'd be helpful to have that index on them. I won't make that change as part of this work, but I thought I'd throw out that thought as @alxndrsn continues thinking about performance.

alxndrsn commented 1 year ago

Seen today at https://app.circleci.com/pipelines/github/getodk/central-backend/1532/workflows/1e78232e-4366-4dbc-9236-2db04ccb1bc5/jobs/2371

alxndrsn commented 1 year ago
  1) api: /forms/:id/submissions
       .csv.zip GET 
         should return worker-processed consolidated client audit log attachments:

      AssertionError: expected 'instance ID,event,node,start,end,latitude,longitude,accuracy,old-value,new-value\none,e,/data/e,2000-01-01T00:11,,,,,hh,ii\none,d,/data/d,2000-01-01T00:10,,10,11,12,gg,\none,c,/data/c,2000-01-01T00:03,2000-01-01T00:04,7,8,9,ee,ff\none,b,/data/b,2000-01-01T00:02,2000-01-01T00:03,4,5,6,cc,dd\none,a,/data/a,2000-01-01T00:01,2000-01-01T00:02,1,2,3,aa,bb\ntwo,h,/data/h,2000-01-01T00:06,2000-01-01T00:07,-5,-6,,ee,ff\ntwo,g,/data/g,2000-01-01T00:05,2000-01-01T00:06,-3,-4,,cc,dd\ntwo,f,/data/f,2000-01-01T00:04,2000-01-01T00:05,-1,-2,,aa,bb\n' to be 'instance ID,event,node,start,end,latitude,longitude,accuracy,old-value,new-value\none,a,/data/a,2000-01-01T00:01,2000-01-01T00:02,1,2,3,aa,bb\none,b,/data/b,2000-01-01T00:02,2000-01-01T00:03,4,5,6,cc,dd\none,c,/data/c,2000-01-01T00:03,2000-01-01T00:04,7,8,9,ee,ff\none,d,/data/d,2000-01-01T00:10,,10,11,12,gg,\none,e,/data/e,2000-01-01T00:11,,,,,hh,ii\ntwo,f,/data/f,2000-01-01T00:04,2000-01-01T00:05,-1,-2,,aa,bb\ntwo,g,/data/g,2000-01-01T00:05,2000-01-01T00:06,-3,-4,,cc,dd\ntwo,h,/data/h,2000-01-01T00:06,2000-01-01T00:07,-5,-6,,ee,ff\n'
      + expected - actual

       instance ID,event,node,start,end,latitude,longitude,accuracy,old-value,new-value
      +one,a,/data/a,2000-01-01T00:01,2000-01-01T00:02,1,2,3,aa,bb
      +one,b,/data/b,2000-01-01T00:02,2000-01-01T00:03,4,5,6,cc,dd
      +one,c,/data/c,2000-01-01T00:03,2000-01-01T00:04,7,8,9,ee,ff
      +one,d,/data/d,2000-01-01T00:10,,10,11,12,gg,
       one,e,/data/e,2000-01-01T00:11,,,,,hh,ii
      -one,d,/data/d,2000-01-01T00:10,,10,11,12,gg,
      -one,c,/data/c,2000-01-01T00:03,2000-01-01T00:04,7,8,9,ee,ff
      -one,b,/data/b,2000-01-01T00:02,2000-01-01T00:03,4,5,6,cc,dd
      -one,a,/data/a,2000-01-01T00:01,2000-01-01T00:02,1,2,3,aa,bb
      +two,f,/data/f,2000-01-01T00:04,2000-01-01T00:05,-1,-2,,aa,bb
      +two,g,/data/g,2000-01-01T00:05,2000-01-01T00:06,-3,-4,,cc,dd
       two,h,/data/h,2000-01-01T00:06,2000-01-01T00:07,-5,-6,,ee,ff
      -two,g,/data/g,2000-01-01T00:05,2000-01-01T00:06,-3,-4,,cc,dd
      -two,f,/data/f,2000-01-01T00:04,2000-01-01T00:05,-1,-2,,aa,bb

      at Assertion.fail (node_modules/should/cjs/should.js:275:17)
      at Assertion.value (node_modules/should/cjs/should.js:356:19)
      at /odk/backend/test/integration/api/submissions.js:1830:51
      at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
      at async Object.transaction (node_modules/slonik/dist/src/connectionMethods/transaction.js:22:24)
      at async Object.createConnection (node_modules/slonik/dist/src/factories/createConnection.js:97:18)